From: Matthew Ruffell <matthew.ruffell@canonical.com>
To: linux-cifs@vger.kernel.org
Cc: sfrench@samba.org, palcantara@suse.de
Subject: PROBLEM: DFS Caching feature causing problems traversing multi-tier DFS setups
Date: Wed, 13 Nov 2019 10:52:44 +1300
Message-ID: <05aa2995-e85e-0ff4-d003-5bb08bd17a22@canonical.com>
Hi Steve, Paulo, and maintainers of CIFS,
We have come across a problem where kernels 5.0-rc1 and onwards cannot mount
a multi-tier DFS setup, while kernels 4.20 and below mount the share fine.
The DFS tiering structure looks like this:
Domain virtual DFS (i.e. \\company.com\folders\share)
|-- Domain controller DFS (i.e. \\regional-dc.company.com\folders\share)
    |-- Regional DFS Server (i.e. \\regional-dfs.company.com\folders\share)
        |-- Actual file server (i.e. \\regional-svr.company.com\share)
On the 5.x series kernels, the client follows the DFS referrals as far as the
Regional DFS Server, which responds with the correct server/share. Instead of
then going to the Actual file server, the kernel backtracks from the Regional
DFS Server to the Domain controller and requests the share there. Of course,
this share does not exist on the Domain controller, as it only exists on the
Actual file server, and the connection dies.
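To make the expected behaviour concrete, here is a toy model of the referral
chain in POSIX shell. It is only a sketch: the function names next_hop and
chase are made up for illustration, the paths are the placeholder examples
from the tree above, and none of this reflects how the kernel implements
referral handling. The point is that each referral should move the client
forward to the next tier and never back to an earlier one.

```shell
#!/bin/sh
# Toy model of the tiers described above (hypothetical helper names).

# next_hop PATH: print the referral target for PATH, or print nothing if
# PATH is served directly and needs no further referral.
next_hop() {
    case "$1" in
        '\\company.com\folders\share')
            printf '%s\n' '\\regional-dc.company.com\folders\share' ;;
        '\\regional-dc.company.com\folders\share')
            printf '%s\n' '\\regional-dfs.company.com\folders\share' ;;
        '\\regional-dfs.company.com\folders\share')
            printf '%s\n' '\\regional-svr.company.com\share' ;;
        *) ;;
    esac
}

# chase PATH: follow referrals forward until no further referral is given,
# then print the final target.
chase() {
    path=$1
    while :; do
        target=$(next_hop "$path")
        [ -n "$target" ] || break
        path=$target
    done
    printf '%s\n' "$path"
}
```

Calling chase '\\company.com\folders\share' walks all three referrals and
ends at the Actual file server path, which is what the 4.x kernels do.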
We have collected a packet capture, and the flow looks like this:
4.18.0-21-generic Ubuntu kernel - Good
Host                                          request/response
--------------------------------------------  ----------------------------------------------------
Domain controller / Domain DFS Root           company.com\folders
Domain controller / Domain DFS Root           Referral List
Regional Domain Controller / Domain DFS Root  start convo
Regional Domain Controller / Domain DFS Root  <Regional Domain Controller>\Folders\Country\<Share> referral
Regional Domain Controller / Domain DFS Root  <Regional Domain Controller>\Folders\Country\<Share> referral
Regional DFS server                           start convo
Regional DFS server                           <Regional DFS Server>\Root\Country\<Share>
Regional DFS server                           STATUS_PATH_NOT_COVERED
Regional DFS server                           request referrals
Regional DFS server                           Referral List
Actual File Server                            convo started
Actual File Server                            <Actual File Server>\<Share>
Actual File Server                            Good response
5.0.0-26-generic Ubuntu kernel - Bad
Host                                          request/response
--------------------------------------------  -------------------------------------------
Domain controller / Domain DFS Root           company.com\folders
Regional Domain Controller / Domain DFS Root  start convo
Regional Domain Controller / Domain DFS Root  <Regional Domain Controller>\Folders\Country\<Share>
Regional Domain Controller / Domain DFS Root  STATUS_PATH_NOT_COVERED
Regional DFS server                           start convo
Regional DFS server                           <Regional DFS Server>\Root\Country\<Share>
Regional DFS server                           STATUS_PATH_NOT_COVERED
Regional Domain Controller / Domain DFS Root  <Regional DFS Server>\Root\Country\<Share>
Regional Domain Controller / Domain DFS Root  STATUS_PATH_NOT_COVERED
If you are interested in any part of the packet capture, let us know and I will
provide the portions you need.
We also enabled CIFS dynamic debugging with:
echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
echo 7 > /proc/fs/cifs/cifsFYI
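For anyone reproducing this, the same settings can be wrapped in a pair of
helpers so that debugging is easy to switch off again after the mount
attempt. This is a sketch, not something we ran as-is: the function names
and the DYNDBG / CIFSFYI variables are made up here, and the defaults assume
debugfs is mounted at /sys/kernel/debug and that the commands run as root.

```shell
#!/bin/sh
# Control files; the defaults are the paths used above. Both variables are
# illustrative and can be overridden, e.g. to point at scratch files.
DYNDBG=${DYNDBG:-/sys/kernel/debug/dynamic_debug/control}
CIFSFYI=${CIFSFYI:-/proc/fs/cifs/cifsFYI}

# Enable CIFS dynamic debug output and verbose cifsFYI logging.
cifs_debug_on() {
    printf '%s\n' 'module cifs +p' > "$DYNDBG"
    printf '%s\n' 'file fs/cifs/* +p' > "$DYNDBG"
    printf '%s\n' 7 > "$CIFSFYI"
}

# Undo the above: clear the 'p' flag and set cifsFYI back to 0.
cifs_debug_off() {
    printf '%s\n' 'module cifs -p' > "$DYNDBG"
    printf '%s\n' 'file fs/cifs/* -p' > "$DYNDBG"
    printf '%s\n' 0 > "$CIFSFYI"
}
```

A typical run would be cifs_debug_on, the mount attempt, a dmesg capture,
and then cifs_debug_off.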
From there the debugging output was more or less the same between the two kernel
versions, until the problematic area:
Linux 4.18:
Full log: https://paste.ubuntu.com/p/D9XwBbvTXc/
Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Regional DFS Server>\Root\Country\<Share>>
fs/cifs/misc.c: num_referrals: 1 dfs flags: 0x2 ...
fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: <Actual File Server> to <IPV4 Address>
fs/cifs/connect.c: Username: XXX
// mounts the share successfully
Linux 5.0:
Full log: https://paste.ubuntu.com/p/9sXPj7WMQv/
Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: search path: \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: cache miss
fs/cifs/dfs_cache.c: do_dfs_cache_find: DFS referral request for \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Regional DFS Server>\Root\Country\<Share>>
fs/cifs/smb2pdu.c: SMB2 IOCTL
Status code returned 0xc0000225 STATUS_NOT_FOUND
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000225 to POSIX err -2
// mounting the share fails shortly after
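To summarise the difference between the two logs: both kernels first see
STATUS_PATH_NOT_COVERED, but only 5.0 goes on to receive STATUS_NOT_FOUND.
The small lookup below restates the two mappings printed by smb2maperror.c
above. The function name is made up for illustration, and the errno names in
the comments assume the usual Linux numbering (66 is EREMOTE, 2 is ENOENT).

```shell
#!/bin/sh
# status_to_errno CODE: print the POSIX error that the logs above report
# for an SMB2 status code. Only the two codes seen in this report are
# listed; anything else prints "unknown".
status_to_errno() {
    case "$1" in
        # STATUS_PATH_NOT_COVERED: the path lives elsewhere (-EREMOTE),
        # so a DFS referral is expected to follow.
        0xc0000257) printf '%s\n' '-66' ;;
        # STATUS_NOT_FOUND: no such object (-ENOENT); this is where the
        # 5.0 mount attempt goes wrong.
        0xc0000225) printf '%s\n' '-2' ;;
        *) printf '%s\n' 'unknown' ;;
    esac
}
```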
Since there seem to be problems with how referrals are handled, I examined
the commit history, and came across the new DFS caching feature introduced in
5.0-rc1.
I built a test kernel at:
commit e7b602f43719fc6173ae86d2de8f6f07c6858ddd
Author: Paulo Alcantara <palcantara@suse.de>
Date: Wed Nov 14 15:38:51 2018 -0200
Subject: cifs: Save TTL value when parsing DFS referrals
Which is the commit directly before the first commit of the DFS caching feature,
commit 54be1f6c1c37498bba557049df646cc239fa37e3
Author: Paulo Alcantara <palcantara@suse.de>
Date: Wed Nov 14 16:01:21 2018 -0200
Subject: cifs: Add DFS cache routines
I also built a test kernel at the end of the patch series which implements the
DFS caching feature, at:
commit 14e92c5dc7a1a1d4a82fb7142b5642837fef962a
Author: Steve French <stfrench@microsoft.com>
Date: Mon Dec 24 01:05:22 2018 -0600
Subject: cifs: Minor Kconfig clarification
We then tried to mount the CIFS share, and found that:
e7b602f43719fc6173ae86d2de8f6f07c6858ddd -> mounts successfully
14e92c5dc7a1a1d4a82fb7142b5642837fef962a -> fails to mount.
So there is a problem somewhere in the DFS caching feature, covered in the
following commits:
https://paste.ubuntu.com/p/XcNwp3dVBV/
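One way to narrow this down to a single commit would be a git bisect between
the two test points. The sketch below assumes a local clone of the mainline
tree and that the regression is confined to fs/cifs; the variable and
function names are made up for illustration, and we have not run this
bisect ourselves.

```shell
#!/bin/sh
# The two commits tested above.
GOOD=e7b602f43719fc6173ae86d2de8f6f07c6858ddd   # mounts successfully
BAD=14e92c5dc7a1a1d4a82fb7142b5642837fef962a    # fails to mount

# start_bisect REPO: begin a bisect in the kernel tree at REPO, limited to
# commits that touch fs/cifs.
start_bisect() {
    git -C "$1" bisect start "$BAD" "$GOOD" -- fs/cifs
}
```

After start_bisect /path/to/linux, each step means building and booting the
checked-out commit, attempting the mount, and then running "git bisect good"
or "git bisect bad" until git names the first bad commit.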
We have also tried the 5.3.5 mainline kernel, and the issue is still present.
We are available to help out with gathering more debugging data and trying
test patches. Please let us know what data to collect, or patches to test, and
we will collect whatever you need to get this fixed.
Thanks,
Matthew Ruffell
Sustaining Engineer @ Canonical