From: NeilBrown <neilb@suse.com>
To: Lutz Vieweg <lvml@5t9.de>, linux-nfs@vger.kernel.org
Subject: Re: PROBLEM: nfs I/O errors with sqlite applications
Date: Fri, 09 Jun 2017 08:07:23 +1000
Message-ID: <871squb0bo.fsf@notabene.neil.brown.name>
In-Reply-To: <59399945.2020507@5t9.de>
On Thu, Jun 08 2017, Lutz Vieweg wrote:
> On 06/07/2017 05:08 AM, NeilBrown wrote:
>>>> fcntl(3, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = -1 EIO (Input/output error)
>>>> write(2, "Error: disk I/O error\n", 22Error: disk I/O error
>>>
>>> But unlike the original reporter, we use the NFS v3 protocol:
>>>> myserver:/data on /data type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountvers=3,mountport=20048,mountproto=udp,local_lock=none)
>>
>> Using "soft" is not a good idea. It could be the cause, but it isn't very
>> likely if NFS is otherwise working OK.
>
> NFS v3 has been working very well for us for many years.
> When we upgraded those two servers ~3 years ago, we did try NFS v4 first, but
> that had caused frequent occurrences of "un-killable processes in D state",
> so we had to revert to v3 to allow for stable operation.
I queried the use of "soft" - as opposed to "hard".
You defend the use of v3 as opposed to v4.
I think there is some miscommunication happening here.

If v3 works better for you than v4, then certainly use it.
You could try reporting details of the problems with v4, but I cannot
promise a helpful response, so it is totally up to you.

But "soft" is generally a bad idea. It can lead to data corruption in
various ways, as it reports errors to user-space which user-space is often
not expecting.

These days, the processes in D state are (usually) killable.
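
If it helps to see which mounts on a client are affected, here is a
minimal sketch that scans text in the /proc/mounts format and lists NFS
mount points that carry the "soft" option. The function name
soft_nfs_mounts is made up for illustration, and it assumes the usual
"device mountpoint fstype options dump pass" field order.

```python
def soft_nfs_mounts(mounts_text):
    """Return mount points of NFS mounts whose options include "soft".

    mounts_text is expected to look like the contents of /proc/mounts:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        # Only client-side NFS mounts; "nfsd" (the server fs) is excluded.
        if fstype in ("nfs", "nfs4") and "soft" in options.split(","):
            flagged.append(mountpoint)
    return flagged


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mountpoint in soft_nfs_mounts(f.read()):
            print("soft NFS mount:", mountpoint)
```

Any mount point it prints would be a candidate for remounting with
"hard".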
>
>> It might help to run
>> rpcdebug -m nfs -s all; rpcdebug -m nlm -s all ;rpcdebug -m rpc -s all
>> #repeat your test
>> rpcdebug -m nfs -c all; rpcdebug -m nlm -c all ;rpcdebug -m rpc -c all
>>
>> then collect the kernel logs (possibly just run "dmesg") and post all
>> the messages which happened at that time.
>
> Ok, attaching a log generated like this while running:
>
> sqlite3 x.sqlite "PRAGMA case_sensitive_like=1;PRAGMA synchronous=OFF;PRAGMA
> recursive_triggers=ON;PRAGMA foreign_keys=OFF;PRAGMA locking_mode = NORMAL;PRAGMA journal_mode =
> TRUNCATE;"
Thanks. Probably the key line is

  [2339904.695240] RPC: 46702 remote rpcbind: RPC program/version unavailable

The client is trying to talk to lockd on the server, and lockd doesn't
seem to be there.
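
For longer logs, a rough sketch like the following could pull out just
the rpcbind failures. The regular expression is an assumption based
only on the one line quoted above, so other kernels may format these
messages differently, and rpcbind_failures is a hypothetical helper
name.

```python
import re

# Pattern modelled on the single log line quoted in this thread.
RPCBIND_FAILURE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+RPC:\s+(?P<task>\d+)\s+"
    r"remote rpcbind:\s+(?P<reason>.+)$"
)


def rpcbind_failures(log_text):
    """Return (timestamp, task id, reason) for each 'remote rpcbind' line."""
    hits = []
    for line in log_text.splitlines():
        match = RPCBIND_FAILURE.match(line.strip())
        if match:
            hits.append(
                (float(match["ts"]), int(match["task"]), match["reason"])
            )
    return hits
```

Feeding it the output of "dmesg" captured while rpcdebug was enabled
should give one tuple per failed rpcbind lookup.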
>
>> It might also help to find the port number that lockd is running on
>> rpcinfo -p $SERVER | grep 'tcp.*nlockmgr'
>
> None of the ports reported this way contains the string "nlockmgr":
This agrees with the line from the log. If nlockmgr isn't listed, then
locking cannot work. This is the cause of your problem.
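
To confirm this from the client without sqlite in the loop, a small
test along these lines could take the same lock that the strace shows
(a non-blocking read lock on one byte at offset 1073741824). This is
only a sketch: try_read_lock is a hypothetical helper, and the path in
the usage note is a placeholder for a scratch file on the NFS mount.

```python
import errno
import fcntl
import os

LOCK_OFFSET = 1073741824  # the "start" value from the strace output


def try_read_lock(path, start=LOCK_OFFSET, length=1):
    """Try a non-blocking shared lock on [start, start + length).

    Returns (True, None) on success, or (False, errno name) on failure.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_SH | LOCK_NB maps to fcntl(F_SETLK) with F_RDLCK.
            fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB,
                        length, start, os.SEEK_SET)
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
        # Release the lock again so the test leaves nothing behind.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
        return True, None
    finally:
        os.close(fd)
```

Running try_read_lock("/data/locktest") on the affected client would
presumably fail with EIO as in the strace, and succeed once lockd is
reachable again.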
>> rpcinfo -p myserver
>>    program vers proto   port  service
>>     100000    4   tcp    111  portmapper
>>     100000    3   tcp    111  portmapper
>>     100000    2   tcp    111  portmapper
>>     100000    4   udp    111  portmapper
>>     100000    3   udp    111  portmapper
>>     100000    2   udp    111  portmapper
Even "nfs" isn't listed - but clearly the nfs server is running.
My guess is that rpcbind was restarted without the "-w" flag, so it lost
all the state that it previously had.
If you stop and restart the NFS service on the server, it might start
working again. Otherwise just reboot the nfs server.
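
After the restart, the registrations can be checked mechanically. The
sketch below parses "rpcinfo -p" output and reports which services are
missing. The REQUIRED tuple is my assumption of what an NFSv3 server
normally registers, and the function names are made up for
illustration.

```python
import subprocess

# Assumed set of services an NFSv3 server is expected to register.
REQUIRED = ("nfs", "mountd", "nlockmgr")


def registered_services(rpcinfo_text):
    """Return a set of (service, proto) pairs from 'rpcinfo -p' output."""
    services = set()
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Data rows are: program vers proto port service.
        # The header row is skipped because "program" is not a number.
        if len(fields) == 5 and fields[0].isdigit():
            services.add((fields[4], fields[2]))
    return services


def missing_services(rpcinfo_text, required=REQUIRED, proto="tcp"):
    """Return the required services with no registration for proto."""
    present = {name for name, p in registered_services(rpcinfo_text)
               if p == proto}
    return [name for name in required if name not in present]


def query(server):
    """Run 'rpcinfo -p SERVER' and return its standard output."""
    result = subprocess.run(["rpcinfo", "-p", server],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

With the listing you posted, missing_services() returns all three
names. Once the NFS service has re-registered with rpcbind,
missing_services(query("myserver")) should return an empty list.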
NeilBrown
>
> Regards,
>
> Lutz Vieweg