All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>, dev@dpdk.org
Cc: bruce.richardson@intel.com, thomas@monjalon.net
Subject: Re: [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles
Date: Mon, 30 Apr 2018 12:31:02 +0100	[thread overview]
Message-ID: <40b42df8-cc22-e529-e85c-c41edc36a0ff@intel.com> (raw)
In-Reply-To: <925daa23-9265-2547-720b-4048ba7aa623@solarflare.com>

On 28-Apr-18 10:38 AM, Andrew Rybchenko wrote:
> On 04/25/2018 01:36 PM, Anatoly Burakov wrote:
>> The original implementation used flock() locks, but was later
>> switched to using fcntl() locks for page locking, because
>> fcntl() locks allow locking parts of a file, which is useful
>> for single-file segments mode, where locking the entire file
>> isn't as useful because we still need to grow and shrink it.
>>
>> However, according to fcntl()'s Ubuntu manpage [1], semantics of
>> fcntl() locks have a giant oversight:
>>
>>    This interface follows the completely stupid semantics of System
>>    V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all
>>    locks associated with a file for a given process are removed
>>    when any file descriptor for that file is closed by that process.
>>    This semantic means that applications must be aware of any files
>>    that a subroutine library may access.
>>
>> Basically, closing *any* fd with an fcntl() lock (which we do because
>> we don't want to leak fd's) will drop the lock completely.
>>
>> So, in this commit, we will be reverting back to using flock() locks
>> everywhere. However, that still leaves the problem of locking parts
>> of a memseg list file in single file segments mode, and we will be
>> solving it with creating separate lock files per each page, and
>> tracking those with flock().
>>
>> We will also be removing all of this tailq business and replacing it
>> with a simple array - saving a few bytes is not worth the extra
>> hassle of dealing with pointers and potential memory allocation
>> failures. Also, remove the tailq lock since it is not needed - these
>> fd lists are per-process, and within a given process, it is always
>> only one thread handling access to hugetlbfs.
>>
>> So, first one to allocate a segment will create a lockfile, and put
>> a shared lock on it. When we're shrinking the page file, we will be
>> trying to take out a write lock on that lockfile, which would fail if
>> any other process is holding onto the lockfile as well. This way, we
>> can know if we can shrink the segment file. Also, if no other locks
>> are found in the lock list for a given memseg list, the memseg list
>> fd is automatically closed.
>>
>> One other thing to note is, according to flock() Ubuntu manpage [2],
>> upgrading the lock from shared to exclusive is implemented by dropping
>> and reacquiring the lock, which is not atomic and thus would have
>> created race conditions. So, on attempting to perform operations in
>> hugetlbfs, we will take out a writelock on hugetlbfs directory, so
>> that only one process could perform hugetlbfs operations concurrently.
>>
>> [1] 
>> http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html
>> [2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html
>>
>> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
>> Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
>> Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime")
>> Fixes: 2a04139f66b4 ("eal: add single file segments option")
>> Cc: anatoly.burakov@intel.com
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> We have a problem with the changeset if EAL option -m or --socket-mem is 
> used.
> EAL initialization hangs just after EAL: Probing VFIO support...
> strace points to flock(7, LOCK_EX
> List of file descriptors:
> # ls /proc/25452/fd -l
> total 0
> lrwx------ 1 root root 64 Apr 28 10:34 0 -> /dev/pts/0
> lrwx------ 1 root root 64 Apr 28 10:34 1 -> /dev/pts/0
> lrwx------ 1 root root 64 Apr 28 10:32 2 -> /dev/pts/0
> lrwx------ 1 root root 64 Apr 28 10:34 3 -> /run/.rte_config
> lrwx------ 1 root root 64 Apr 28 10:34 4 -> socket:[154166]
> lrwx------ 1 root root 64 Apr 28 10:34 5 -> socket:[154158]
> lr-x------ 1 root root 64 Apr 28 10:34 6 -> /dev/hugepages
> lr-x------ 1 root root 64 Apr 28 10:34 7 -> /dev/hugepages
> 
> I guess the problem is that there are two /dev/hugepages and
> it hangs on the second.
> 
> Ideas how to solve it?
> 
> Andrew.
> 

Hi Andrew,

Please try the following patch:

http://dpdk.org/dev/patchwork/patch/39166/

This should fix the issue.

-- 
Thanks,
Anatoly

  parent reply	other threads:[~2018-04-30 11:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-19 12:26 [PATCH 0/2] Fix file locking in EAL memory Anatoly Burakov
2018-04-19 12:26 ` [PATCH 1/2] mem: add memalloc init stage Anatoly Burakov
2018-04-24 14:06   ` Bruce Richardson
2018-04-19 12:26 ` [PATCH 2/2] mem: revert to using flock() and add per-segment lockfiles Anatoly Burakov
2018-04-24 13:57   ` Bruce Richardson
2018-04-24 14:07   ` Bruce Richardson
2018-04-24 15:54 ` [PATCH v2 0/2] Fix file locking in EAL memory Anatoly Burakov
2018-04-24 15:54   ` [PATCH v2 1/2] mem: add memalloc init stage Anatoly Burakov
2018-04-24 15:54   ` [PATCH v2 2/2] mem: revert to using flock() and add per-segment lockfiles Anatoly Burakov
2018-04-24 16:32   ` [PATCH v2 0/2] Fix file locking in EAL memory Stephen Hemminger
2018-04-24 17:25     ` Burakov, Anatoly
2018-04-24 20:05       ` Thomas Monjalon
2018-04-24 20:34         ` Stephen Hemminger
2018-04-25 10:36   ` [PATCH v3 " Anatoly Burakov
2018-04-25 10:36     ` [PATCH v3 1/2] mem: add memalloc init stage Anatoly Burakov
2018-04-25 10:36     ` [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles Anatoly Burakov
2018-04-28  9:38       ` Andrew Rybchenko
2018-04-30  8:29         ` Burakov, Anatoly
2018-04-30 11:31         ` Burakov, Anatoly [this message]
2018-04-30 11:51           ` Maxime Coquelin
2018-04-30 13:08           ` Andrew Rybchenko
2018-04-27 21:50     ` [PATCH v3 0/2] Fix file locking in EAL memory Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=40b42df8-cc22-e529-e85c-c41edc36a0ff@intel.com \
    --to=anatoly.burakov@intel.com \
    --cc=arybchenko@solarflare.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.