* PROBLEM: DFS Caching feature causing problems traversing multi-tier DFS setups
From: Matthew Ruffell @ 2019-11-12 21:52 UTC
To: linux-cifs; +Cc: sfrench, palcantara

Hi Steve, Paulo, and maintainers of CIFS,

We have come across a problem where kernels 5.0-rc1 and onwards cannot mount
a multi tier DFS setup, while kernels 4.20 and below can mount the share fine.

The DFS tiering structure looks like this:

Domain virtual DFS (i.e. \\company.com\folders\share)
  |-- Domain controller DFS (i.e. \\regional-dc.company.com\folders\share)
      |-- Regional DFS Server (i.e. \\regional-dfs.company.com\folders\share)
          |-- Actual file server (i.e. \\regional-svr.company.com\share)

On the 5.x series kernels, after getting the DFS referrals list through to the
Regional DFS Server, which responds with the correct server/share, instead of
going to the Actual file server, the kernel backtracks from the Regional DFS
Server back to the Domain controller and requests the share there. Of course,
this share does not exist on the Domain controller, as it only exists on the
Actual file server, and the connection dies.

We have collected a packet capture, and the flow looks like this:

4.18.0-21-generic Ubuntu kernel - Good

Host                                          request/response
--------------------------------------------  ----------------------------------------------------
Domain controller / Domain DFS Root           company.com\folders
Domain controller / Domain DFS Root           Referral List
Regional Domain Controller / Domain DFS Root  start convo
Regional Domain Controller / Domain DFS Root  <Regional Domain Controller>\Folders\Country\<Share> referral
Regional Domain Controller / Domain DFS Root  <Regional Domain Controller>\Folders\Country\<Share> referral
Regional DFS server                           start convo
Regional DFS server                           <Regional DFS Server>\Root\Country\<Share>
Regional DFS server                           STATUS_PATH_NOT_COVERED
Regional DFS server                           request referrals
Regional DFS server                           Referral List
Actual File Server                            convo started
Actual File Server                            <Actual File Server>\<Share>
Actual File Server                            Good response

5.0.0-26-generic Ubuntu kernel - Bad

Host                                          request/response
--------------------------------------------  -------------------------------------------
Domain controller / Domain DFS Root           company.com\folders
Regional Domain Controller / Domain DFS Root  start convo
Regional Domain Controller / Domain DFS Root  <Regional Domain Controller>\Folders\Country\<Share>
Regional Domain Controller / Domain DFS Root  STATUS_PATH_NOT_COVERED
Regional DFS server                           start convo
Regional DFS server                           <Regional DFS Server>\Root\Country\<Share>
Regional DFS server                           STATUS_PATH_NOT_COVERED
Regional Domain Controller / Domain DFS Root  <Regional DFS Server>\Root\Country\<Share>
Regional Domain Controller / Domain DFS Root  STATUS_PATH_NOT_COVERED

If you are interested in any parts of the packet capture, let us know and I
will provide you with portions that you need.

We also enabled CIFS dynamic debugging with:

echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
echo 7 > /proc/fs/cifs/cifsFYI

From there the debugging output was more or less the same between the two
kernel versions, until the problematic area:

Linux 4.18:
Full log: https://paste.ubuntu.com/p/D9XwBbvTXc/

Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Regional DFS Server>\Root\Country\<Share>>
fs/cifs/misc.c: num_referrals: 1 dfs flags: 0x2 ...
fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: <Actual File Server> to <IPV4 Address>
fs/cifs/connect.c: Username: XXX
// mounts the share successfully

Linux 5.0:
Full log: https://paste.ubuntu.com/p/9sXPj7WMQv/

Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: search path: \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: cache miss
fs/cifs/dfs_cache.c: do_dfs_cache_find: DFS referral request for \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Regional DFS Server>\Root\Country\<Share>>
fs/cifs/smb2pdu.c: SMB2 IOCTL
Status code returned 0xc0000225 STATUS_NOT_FOUND
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000225 to POSIX err -2
// mounting the share fails shortly after

Since there seems to be problems with how referrals are handled, I examined
the commit history, and came across the new DFS caching feature
introduced in 5.0-rc1. I built a test kernel at:

commit e7b602f43719fc6173ae86d2de8f6f07c6858ddd
Author: Paulo Alcantara <palcantara@suse.de>
Date: Wed Nov 14 15:38:51 2018 -0200
Subject: cifs: Save TTL value when parsing DFS referrals

Which is the commit directly before the first commit of the DFS caching
feature,

commit 54be1f6c1c37498bba557049df646cc239fa37e3
Author: Paulo Alcantara <palcantara@suse.de>
Date: Wed Nov 14 16:01:21 2018 -0200
Subject: cifs: Add DFS cache routines

I also built a test kernel at the end of the patch series which implements
the DFS caching feature, at:

commit 14e92c5dc7a1a1d4a82fb7142b5642837fef962a
Author: Steve French <stfrench@microsoft.com>
Date: Mon Dec 24 01:05:22 2018 -0600
Subject: cifs: Minor Kconfig clarification

We then tried to mount the CIFS share, and found that:

e7b602f43719fc6173ae86d2de8f6f07c6858ddd -> mounts successfully
14e92c5dc7a1a1d4a82fb7142b5642837fef962a -> fails to mount.

So there is a problem somewhere in the DFS caching feature, covered in the
following commits:

https://paste.ubuntu.com/p/XcNwp3dVBV/

We have also tried 5.3.5 mainline kernel, and the issue is still present.

We are available to help out with gathering more debugging data and trying
test patches. Please let us know what data to collect, or patches to test,
and we will collect whatever you need to get this fixed.

Thanks,
Matthew Ruffell
Sustaining Engineer @ Canonical
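
A note on the two error numbers in the debug logs of the report above: they decode to ordinary Linux errno values. The following is a minimal illustrative sketch in C, not kernel code. The helper name `sketch_map_status` and the `-EIO` fallback are assumptions made for this example; only the two mappings (0xc0000257 to -66, 0xc0000225 to -2) are taken from the logs, and the errno names assume Linux numbering.

```c
#include <errno.h>
#include <stdint.h>

/* NTSTATUS values exactly as they appear in the debug logs. */
#define STATUS_NOT_FOUND        0xC0000225u
#define STATUS_PATH_NOT_COVERED 0xC0000257u

/*
 * Hypothetical helper (not a kernel function): returns the negative errno
 * that the logs show for the two statuses of interest. Any other status
 * falls back to -EIO, which is an assumption of this sketch only.
 */
static int sketch_map_status(uint32_t status)
{
	switch (status) {
	case STATUS_PATH_NOT_COVERED:
		return -EREMOTE;	/* logged as "POSIX err -66" */
	case STATUS_NOT_FOUND:
		return -ENOENT;		/* logged as "POSIX err -2" */
	default:
		return -EIO;
	}
}
```

Read this way, the 4.18 and 5.0 logs agree up to the -EREMOTE result, and differ only in what happens to the referral request that follows it: on 5.0 that request comes back as STATUS_NOT_FOUND, and the mount ends with ENOENT.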

* Re: PROBLEM: DFS Caching feature causing problems traversing multi-tier DFS setups
From: Paulo Alcantara @ 2019-11-13 1:11 UTC
To: Matthew Ruffell, linux-cifs; +Cc: sfrench

Hi Matthew,

Thanks for the report.

Matthew Ruffell <matthew.ruffell@canonical.com> writes:

> We have come across a problem where kernels 5.0-rc1 and onwards cannot mount
> a multi tier DFS setup, while kernels 4.20 and below can mount the share fine.
>
> The DFS tiering structure looks like this:
>
> Domain virtual DFS (i.e. \\company.com\folders\share)
>   |-- Domain controller DFS (i.e. \\regional-dc.company.com\folders\share)
>       |-- Regional DFS Server (i.e. \\regional-dfs.company.com\folders\share)
>           |-- Actual file server (i.e. \\regional-svr.company.com\share)
>
> On the 5.x series kernels, after getting the DFS referrals list through to the
> Regional DFS Server, which responds with the correct server/share, instead of
> going to the Actual file server, the kernel backtracks from the Regional DFS
> Server back to the Domain controller and requests the share there. Of course,
> this share does not exist on the Domain controller, as it only exists on the
> Actual file server, and the connection dies.

I've got some DFS cache patches[1] and haven't sent them yet due to lack
of time and testing. Those contain a lot of important fixes but none of
them seem to fix the issue you're having -- thus I won't ask you to
apply them on top.

Instead, could you please try below changes?

Thanks,
Paulo

[1] https://git.cjr.nz/linux.git/log/?h=cifs-dfscache

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 3991d6c8f255..9158d5d14ac9 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4777,6 +4777,15 @@ static int is_path_remote(struct cifs_sb_info *cifs_sb, struct smb_vol *vol,
 }
 
 #ifdef CONFIG_CIFS_DFS_UPCALL
+static inline void set_root_tcon(struct cifs_tcon *tcon,
+				 struct cifs_tcon **root)
+{
+	spin_lock(&cifs_tcp_ses_lock);
+	tcon->tc_count++;
+	spin_unlock(&cifs_tcp_ses_lock);
+	*root = tcon;
+}
+
 int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb_vol *vol)
 {
 	int rc = 0;
@@ -4878,18 +4887,10 @@ int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb_vol *vol)
 	/* Cache out resolved root server */
 	(void)dfs_cache_find(xid, ses, cifs_sb->local_nls, cifs_remap(cifs_sb),
 			     root_path + 1, NULL, NULL);
-	/*
-	 * Save root tcon for additional DFS requests to update or create a new
-	 * DFS cache entry, or even perform DFS failover.
-	 */
-	spin_lock(&cifs_tcp_ses_lock);
-	tcon->tc_count++;
-	tcon->dfs_path = root_path;
+	kfree(root_path);
 	root_path = NULL;
-	tcon->remap = cifs_remap(cifs_sb);
-	spin_unlock(&cifs_tcp_ses_lock);
 
-	root_tcon = tcon;
+	set_root_tcon(tcon, &root_tcon);
 
 	for (count = 1; ;) {
 		if (!rc && tcon) {
@@ -4926,6 +4927,15 @@ int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb_vol *vol)
 			mount_put_conns(cifs_sb, xid, server, ses, tcon);
 			rc = mount_get_conns(vol, cifs_sb, &xid, &server, &ses,
 					     &tcon);
+			/*
+			 * Ensure that DFS referrals go through new root server.
+			 */
+			if (!rc && tcon &&
+			    (tcon->share_flags & (SHI1005_FLAGS_DFS |
+						  SHI1005_FLAGS_DFS_ROOT))) {
+				cifs_put_tcon(root_tcon);
+				set_root_tcon(tcon, &root_tcon);
+			}
 		}
 		if (rc) {
 			if (rc == -EACCES || rc == -EOPNOTSUPP)
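
To make the effect of the patch above easier to see, here is a small userspace toy model in C. It is a simplified reading of the packet traces and of the patch comment ("Ensure that DFS referrals go through new root server"), not the kernel's logic. Everything in it is hypothetical: the `toy_server` table, the rule that a server only answers referral requests for its own namespace, and the `toy_chase` function with its `switch_root` flag.

```c
#include <assert.h>

/* Toy namespace: each DFS tier refers to the next; the last hosts the share. */
struct toy_server {
	const char *name;
	int is_dfs;	/* 1 if connecting here means "path not covered" */
	int target;	/* index its referral points at, -1 if none */
};

static const struct toy_server servers[] = {
	{ "regional-dc",  1,  1 },
	{ "regional-dfs", 1,  2 },
	{ "regional-svr", 0, -1 },
};

#define NUM_SERVERS ((int)(sizeof(servers) / sizeof(servers[0])))

/* Assumption: a server only resolves referrals for links it owns. */
static int toy_get_referral(int asked, int owner)
{
	assert(asked >= 0 && asked < NUM_SERVERS);
	assert(owner >= 0 && owner < NUM_SERVERS);
	return asked == owner ? servers[owner].target : -1;
}

/*
 * Chase referrals starting at server 0. Referral requests are always sent
 * to `root`. With switch_root == 0 the root never changes; with
 * switch_root != 0 it moves to each newly connected DFS server.
 * Returns the index of the server hosting the share, or -1 on failure.
 */
static int toy_chase(int switch_root)
{
	int root = 0, cur = 0, hops;

	for (hops = 0; hops < NUM_SERVERS; hops++) {
		int next;

		if (!servers[cur].is_dfs)
			return cur;	/* reached the real share */
		next = toy_get_referral(root, cur);
		if (next < 0)
			return -1;	/* root does not know this link */
		cur = next;
		if (switch_root && servers[cur].is_dfs)
			root = cur;	/* send later referrals here */
	}
	return -1;
}
```

In this model `toy_chase(0)` fails at the second tier, because the request for the regional DFS server's link is sent back to the first server, while `toy_chase(1)` reaches the file server at index 2. That matches the shape of the bad and good traces in the report, under the assumptions stated above.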

* Re: PROBLEM: DFS Caching feature causing problems traversing multi-tier DFS setups
From: Matthew Ruffell @ 2019-11-19 16:55 UTC
To: Paulo Alcantara, linux-cifs; +Cc: sfrench

Hi Paulo,

Apologies for the delay. I built the patch you placed at the end of your
message into a test kernel for our customer, based on 5.4-rc7. I also sent
the customer a vanilla 5.4-rc7 to test against, to see if the patch is what
fixes the problem.

The customer just got back to me, and confirmed that vanilla 5.4-rc7 fails to
mount their multi tier DFS share, and the patched 5.4-rc7 with the patch you
provided successfully mounts their multi tier DFS share.

Not working - Vanilla 5.4-rc7

$ uname -rv
5.4.0-rc7 #4 SMP Wed Nov 13 17:23:52 NZDT 2019
$ sudo mount -v -t cifs //company.com/folders/country/<share> -o defaults,user=<user> /mnt/share
Password for <user>@//company.com/folders/country/<share>: ************
mount.cifs kernel mount options: ip=<IPv4 Address>,unc=\\company.com\folders,user=<user>,prefixpath=country/<share>,pass=**********
mount error(2): No such file or directory
Refer to the mount.cifs(8) manual page (e.g.
man mount.cifs)

Working - Patched 5.4-rc7 with https://paste.ubuntu.com/p/DnCTNMjQJC/

$ uname -rv
5.4.0-rc7+upstreampatch1+ #3 SMP Wed Nov 13 16:42:34 NZDT 2019
$ sudo mount -v -t cifs //company.com/folders/country/<share> -o defaults,user=<user> /mnt/share
Password for <user>@//company.com/folders/country/<share>: ************
mount.cifs kernel mount options: ip=<IPv4 Address>,unc=\\company.com\folders,user=<user>,prefixpath=country/<share>,pass=**********
$ cd /mnt/share/
:/mnt/share$ ll
total 4
drwxr-xr-x 2 root root    0 Oct 18  2017 ./
drwxr-xr-x 3 root root 4096 Nov 19 15:51 ../
drwxr-xr-x 2 root root    0 Oct 23 15:27 <Directory>/
drwxr-xr-x 2 root root    0 Aug 13  2018 <Directory 2>/

Please, go ahead and prepare the patch for mainline inclusion, and as Steve
mentioned before, consider cc stable.

Thank you for developing the fix quickly.

Matthew Ruffell
Sustaining Engineer @ Canonical
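
Both mount transcripts above show the device string //company.com/folders/country/<share> being passed to the kernel as a `unc=` option (server and first path component, with backslashes) plus a `prefixpath=` option (the remainder). The sketch below reproduces that split in C for illustration only. `split_unc` is a hypothetical helper written for this example, not the mount.cifs implementation, and the literal `share` stands in for the redacted `<share>` placeholder.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split "//host/share/rest/of/path" into
 * unc = "\\host\share" and prefix = "rest/of/path".
 * Returns 0 on success, -1 on malformed input or a too-small buffer.
 */
static int split_unc(const char *src, char *unc, size_t unc_len,
		     char *prefix, size_t prefix_len)
{
	const char *host, *share, *rest;
	size_t host_len, share_len;
	int n;

	assert(src != NULL && unc != NULL && prefix != NULL);

	if (strncmp(src, "//", 2) != 0)
		return -1;
	host = src + 2;
	share = strchr(host, '/');
	if (share == NULL || share == host)
		return -1;		/* no share component or empty host */
	host_len = (size_t)(share - host);
	share++;			/* skip the separator */
	rest = strchr(share, '/');
	share_len = rest ? (size_t)(rest - share) : strlen(share);
	if (share_len == 0)
		return -1;

	/* "\\\\" and "\\" are C escapes for \\ and \ respectively. */
	n = snprintf(unc, unc_len, "\\\\%.*s\\%.*s",
		     (int)host_len, host, (int)share_len, share);
	if (n < 0 || (size_t)n >= unc_len)
		return -1;
	n = snprintf(prefix, prefix_len, "%s", rest ? rest + 1 : "");
	if (n < 0 || (size_t)n >= prefix_len)
		return -1;
	return 0;
}
```

Called with "//company.com/folders/country/share", this yields `\\company.com\folders` and `country/share`, the same shape as the `unc=` and `prefixpath=` values in the transcripts. The prefix path is the part that has to be resolved through the DFS referrals discussed in this thread.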