Linux CIFS filesystem development
From: Andrew Paniakin <apanyaki@amazon.com>
To: <pc@cjr.nz>, <stfrench@microsoft.com>, <sashal@kernel.org>,
	<pc@manguebit.com>
Cc: <regressions@lists.linux.dev>, <stable@vger.kernel.org>,
	<linux-cifs@vger.kernel.org>, <abuehaze@amazon.com>,
	<simbarb@amazon.com>, <benh@amazon.com>
Subject: Re: [REGRESSION][BISECTED] Commit 60e3318e3e900 in stable/linux-6.1.y breaks cifs client failover to another server in DFS namespace
Date: Mon, 24 Jun 2024 10:59:20 -0700
Message-ID: <Znmz-Pzi4UrZxlR0@3c06303d853a.ant.amazon.com>
In-Reply-To: <ZnMkNzmitQdP9OIC@3c06303d853a.ant.amazon.com>

On 19/06/2024, Andrew Paniakin wrote:
> Commit 60e3318e3e900 ("cifs: use fs_context for automounts") was
> released in v6.1.54 and broke failover when one of the servers
> in the DFS namespace becomes unavailable. We reproduced the problem on EC2
> instances of different types. Reverting the aforementioned commit on top of
> the latest stable version, v6.1.94, resolves the problem.
> 
> The earliest working version is v6.2-rc1. There were two big merges of CIFS fixes:
> [1] and [2]. We would like to ask for help investigating this problem and
> determining whether some of those patches need to be backported. Also, is it safe to just revert
> the problematic commit until proper fixes/backports are available?
> 
> We will help with testing and confirm whether a fix works. Here are the
> steps we used to reproduce the problem, in case they help to identify it:
> 1. Create an Active Directory domain, e.g. 'corp.fsxtest.local', in AWS Directory
> Service with:
> - three AWS FSx file systems filesystem1..filesystem3
> - three Windows servers with DFS installed as per
>   https://learn.microsoft.com/en-us/windows-server/storage/dfs-namespaces/dfs-overview:
>     - dfs-srv1: EC2AMAZ-2EGTM59
>     - dfs-srv2: EC2AMAZ-1N36PRD
>     - dfs-srv3: EC2AMAZ-0PAUH2U 
> 
> 2. Create a DFS namespace, e.g. 'dfs-namespace', in Windows Server 2008 mode
> and three folder targets in it:
> - referral-a mapped to filesystem1.corp.local
> - referral-b mapped to filesystem2.corp.local
> - referral-c mapped to filesystem3.corp.local
> - local folders dfs-srv1..dfs-srv3 in C:\DFSRoots\dfs-namespace of every
>   Windows server. This helps to quickly identify the underlying server when
>   DFS is mounted.
> 
> 3. Enable cifs debug logs:
> ```
> echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
> echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
> echo 7 > /proc/fs/cifs/cifsFYI
> ```
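
To switch the same debug output on and off between test runs, the three writes above can be wrapped in a small helper. This is only a sketch: the function name `cifs_debug` and its optional path arguments are made up for illustration, and it assumes that writing `-p` to the dynamic debug control file and `0` to cifsFYI is enough to turn the output back off.

```shell
# Hypothetical helper: toggle the CIFS debug output used in step 3.
# Both control-file paths can be overridden, which allows a dry run
# against scratch files on a machine without the cifs module loaded.
cifs_debug() {
    mode=$1
    dyndbg=${2:-/sys/kernel/debug/dynamic_debug/control}
    fyi=${3:-/proc/fs/cifs/cifsFYI}
    case $mode in
        on)  flag=+p; level=7 ;;
        off) flag=-p; level=0 ;;
        *)   echo "usage: cifs_debug on|off [dyndbg_control] [cifsFYI]" >&2
             return 2 ;;
    esac
    # Same three writes as in the report, with the flag and level swapped in.
    echo "module cifs $flag" > "$dyndbg" &&
    echo "file fs/cifs/* $flag" > "$dyndbg" &&
    echo "$level" > "$fyi"
}
```

Run as root, `cifs_debug on` before mounting and `cifs_debug off` once the logs have been saved.
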
> 
> 4. Mount the DFS namespace on an Amazon Linux 2023 instance running any vanilla
> kernel v6.1.54+:
> ```
> dmesg -c &>/dev/null
> cd /mnt
> mount -t cifs -o cred=/mnt/creds,echo_interval=5 \
>     //corp.fsxtest.local/dfs-namespace \
>     ./dfs-namespace
> ```
> 
> 5. List the DFS root. This is also required to avoid the recursive mounts that happen
> during a regular 'ls' run:
> ```
> sh -c 'ls dfs-namespace'
> dfs-srv2  referral-a  referral-b
> ```
> 
> The DFS server is EC2AMAZ-1N36PRD; it is also listed in the mount output:
> ```
> [root@ip-172-31-2-82 mnt]# mount | grep dfs
> //corp.fsxtest.local/dfs-namespace on /mnt/dfs-namespace type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.11.26,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> //EC2AMAZ-1N36PRD.corp.fsxtest.local/dfs-namespace/referral-a on /mnt/dfs-namespace/referral-a type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.12.80,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> ```
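
Since the interesting part of these long lines is the source and the `addr=` option, a short filter makes before/after comparisons easier. The following is a sketch only: `cifs_mount_addrs` is a made-up name, and it assumes the `mount` output format shown above with no spaces in the share or mount point paths.

```shell
# Hypothetical filter: read `mount` output on stdin and print
# "<mountpoint> <source> <addr>" for every cifs mount.
# Assumes lines of the form: SRC on MNT type cifs (opt,opt,...)
cifs_mount_addrs() {
    awk '$4 == "type" && $5 == "cifs" {
        addr = "unknown"
        # Split the option list on "(", ")" and "," and pick out addr=...
        n = split($6, opts, /[(),]/)
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^addr=/)
                addr = substr(opts[i], 6)
        print $3, $1, addr
    }'
}
```

Usage would be `mount | cifs_mount_addrs`, once before and once after shutting down a DFS server.
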
> 
> List files in the first folder:
> ```
> sh -c 'ls dfs-namespace/referral-a'
> filea.txt.txt
> ```
> 
> 6. Shut down DFS server-2 (dfs-srv2).
> List the DFS root again; the server changed from dfs-srv2 to dfs-srv1 (EC2AMAZ-2EGTM59):
> ```
> sh -c 'ls dfs-namespace'
> dfs-srv1  referral-a  referral-b
> ```
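
For scripting this check, the marker folder can be picked out of the listing. A minimal sketch, assuming the marker folders are named dfs-srv<N> as in step 2; the function name `dfs_root_server` is hypothetical.

```shell
# Hypothetical helper: read a directory listing (one entry per line) on
# stdin and print the dfs-srv<N> marker folder, which identifies the DFS
# server currently backing the namespace root.
dfs_root_server() {
    grep -E '^dfs-srv[0-9]+$'
}
```

When its output goes to a pipe, ls prints one entry per line, so `sh -c 'ls dfs-namespace' | dfs_root_server` is enough.
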
> 
> 7. Try to list files in another folder; this causes ls to fail with an error:
> ```
> sh -c 'ls dfs-namespace/referral-b'
> ls: cannot access 'dfs-namespace/referral-b': No route to host
> ```
> 
> Sometimes the error is 'Operation now in progress' instead.
> 
> mount still shows the same output as before:
> ```
> //corp.fsxtest.local/dfs-namespace on /mnt/dfs-namespace type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.11.26,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> //EC2AMAZ-1N36PRD.corp.fsxtest.local/dfs-namespace/referral-a on /mnt/dfs-namespace/referral-a type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.12.80,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> ```
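
The check in step 7 can be repeated over all referrals with a small loop. This is a sketch under assumptions: `check_referrals` is a made-up name, it only reports whether ls succeeds on each folder, and it does not capture the specific error text.

```shell
# Hypothetical helper: try to list each referral under the DFS root and
# report which ones are reachable. Returns non-zero if any listing fails.
check_referrals() {
    root=$1
    shift
    rc=0
    for ref in "$@"; do
        # Listing the folder is what triggers the automount of the referral.
        if ls "$root/$ref" >/dev/null 2>&1; then
            echo "ok $ref"
        else
            echo "FAIL $ref"
            rc=1
        fi
    done
    return "$rc"
}
```

With the mount from step 4 this would be run as `check_referrals /mnt/dfs-namespace referral-a referral-b referral-c` after shutting down a DFS server.
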
> 
> I also attached kernel debug logs from this test.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=851f657a86421
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0a924817d2ed9
> 
> Reported-by: Andrei Paniakin <apanyaki@amazon.com>
> Bisected-by: Simba Bonga <simbarb@amazon.com>
> ---
> 
> #regzbot introduced: v6.1.54..v6.2-rc1


Friendly reminder: has anyone had a chance to look into this report?


