From: NeilBrown <neilb@suse.com>
To: Lutz Vieweg <lvml@5t9.de>, linux-nfs@vger.kernel.org
Subject: Re: PROBLEM: nfs I/O errors with sqlite applications
Date: Fri, 09 Jun 2017 08:07:23 +1000
Message-ID: <871squb0bo.fsf@notabene.neil.brown.name>
In-Reply-To: <59399945.2020507@5t9.de>
On Thu, Jun 08 2017, Lutz Vieweg wrote:
> On 06/07/2017 05:08 AM, NeilBrown wrote:
>>>> fcntl(3, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = -1 EIO (Input/output error)
>>>> write(2, "Error: disk I/O error\n", 22Error: disk I/O error
>>>
>>> But unlike the original reporter, we use the NFS v3 protocol:
>>>> myserver:/data on /data type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountvers=3,mountport=20048,mountproto=udp,local_lock=none)
>>
>> Using "soft" is not a good idea. It could be the cause, but it isn't very
>> likely if NFS is otherwise working OK.
>
> NFS v3 has been working very well for us for many years.
> When we upgraded those two servers ~3 years ago, we did try NFS v4 first, but
> that had caused frequent occurrences of "un-killable processes in D state",
> so we had to revert to v3 to allow for stable operation.
I queried the use of "soft" - as opposed to "hard".
You defend the use of v3 as opposed to v4.
I think there is some miscommunication happening here.
If v3 works better for you than v4, then certainly use it.
You could try reporting details of the problems with v4, but I cannot
promise a helpful response, so it is totally up to you.
But "soft" is generally a bad idea. It can lead to data corruption in
various ways, as it reports errors to user-space which user-space is
often not expecting.
These days, the processes in D state are (usually) killable.
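
To illustrate the kind of error user-space rarely expects, here is a
minimal Python sketch that issues the same sort of lock request as the
fcntl() call in the strace output quoted above. The function name
try_read_lock and its path argument are hypothetical; the offset and
length are copied from the strace line. It is only a sketch of how an
application might notice the failure, not how sqlite handles it.

```python
import errno
import fcntl
import os


def try_read_lock(path, start=1073741824, length=1):
    """Request a non-blocking shared byte-range lock on `path`.

    Returns None on success, or the errno name (for example "EIO")
    if the lock request fails.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # With LOCK_NB, lockf() issues fcntl(fd, F_SETLK, ...) with a
        # read lock (F_RDLCK) for LOCK_SH, matching the strace above.
        fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB, length, start, os.SEEK_SET)
        # Release the lock again so the probe leaves no state behind.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
```

On a local filesystem this should return None. On the affected client,
a call such as try_read_lock("/data/x.sqlite") would be expected to
return "EIO" for as long as the lock request cannot be serviced.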
>
>> It might help to run
>> rpcdebug -m nfs -s all; rpcdebug -m nlm -s all ;rpcdebug -m rpc -s all
>> #repeat your test
>> rpcdebug -m nfs -c all; rpcdebug -m nlm -c all ;rpcdebug -m rpc -c all
>>
>> then collect the kernel logs (possibly just run "dmesg") and post all
>> the messages which happened at that time.
>
> Ok, attaching a log generated like this while running:
>
> sqlite3 x.sqlite "PRAGMA case_sensitive_like=1;PRAGMA synchronous=OFF;PRAGMA
> recursive_triggers=ON;PRAGMA foreign_keys=OFF;PRAGMA locking_mode = NORMAL;PRAGMA journal_mode =
> TRUNCATE;"
Thanks. Probably the key line is
[2339904.695240] RPC: 46702 remote rpcbind: RPC program/version unavailable
The client is trying to talk to lockd on the server, and lockd doesn't
seem to be there.
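
If the debug log is large, lines like that one can be picked out
mechanically. The following is a rough Python sketch, assuming the log
format shown above; the name find_rpcbind_failures is hypothetical, and
the number after "RPC:" is taken to be the RPC task id.

```python
import re

# Matches lines such as:
# [2339904.695240] RPC: 46702 remote rpcbind: RPC program/version unavailable
RPCBIND_FAILURE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+RPC:\s+(?P<task>\d+)\s+remote rpcbind: "
    r"RPC program/version unavailable"
)


def find_rpcbind_failures(log_text):
    """Return (timestamp, task id) pairs, one per rpcbind lookup failure."""
    hits = []
    for line in log_text.splitlines():
        match = RPCBIND_FAILURE.match(line.strip())
        if match:
            hits.append((float(match.group("ts")), int(match.group("task"))))
    return hits
```

Feeding it the output of "dmesg" collected during the test should list
every lookup that the server's rpcbind rejected.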
>
>> It might also help to find the port number that lockd is running on
>> rpcinfo -p $SERVER | grep 'tcp.*nlockmgr'
>
> None of the ports reported this way contains the string "nlockmgr":
This agrees with the line from the log. If nlockmgr isn't listed, then
locking cannot work. This is the cause of your problem.
>> rpcinfo -p myserver
>> program vers proto port service
>> 100000 4 tcp 111 portmapper
>> 100000 3 tcp 111 portmapper
>> 100000 2 tcp 111 portmapper
>> 100000 4 udp 111 portmapper
>> 100000 3 udp 111 portmapper
>> 100000 2 udp 111 portmapper
Even "nfs" isn't listed - but clearly the nfs server is running.
My guess is that rpcbind was restarted without the "-w" flag, so it lost
all the state that it previously had.
If you stop and restart NFS service on the server, it might start
working again. Otherwise just reboot the nfs server.
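
To check whether the registrations have come back after the restart,
the "rpcinfo -p" output can be tested for the services that matter. A
minimal Python sketch follows, assuming the five-column format shown
above; the function names and the default list of required services
are hypothetical choices for illustration.

```python
def registered_services(rpcinfo_output):
    """Parse `rpcinfo -p` text into a set of (service, proto) pairs."""
    services = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows have five columns: program vers proto port service.
        # The header row is skipped because its first field is not a number.
        if len(fields) == 5 and fields[0].isdigit():
            services.add((fields[4], fields[2]))
    return services


def missing_services(rpcinfo_output, required=("nfs", "nlockmgr"), proto="tcp"):
    """Return the required services that are not registered for `proto`."""
    have = registered_services(rpcinfo_output)
    return [name for name in required if (name, proto) not in have]
```

With the output quoted above, missing_services() reports both "nfs" and
"nlockmgr" as missing. The text can be obtained with something like
subprocess.run(["rpcinfo", "-p", server], capture_output=True,
text=True).stdout, where server is the NFS server's host name.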
NeilBrown
>
> Regards,
>
> Lutz Vieweg