qemu-devel.nongnu.org archive mirror
From: Paolo Bonzini <pbonzini@redhat.com>
To: Joao Martins <joao.m.martins@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Jingqi Liu <jingqi.liu@intel.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [PATCH] exec: fetch the alignment of Linux devdax pmem character device nodes
Date: Tue, 7 Apr 2020 20:29:40 +0200
Message-ID: <6d9ef17e-315c-e01e-db56-bde97f0ab1a8@redhat.com>
In-Reply-To: <3873cb30-608c-6a27-c19f-f6446898796f@oracle.com>

On 07/04/20 20:28, Joao Martins wrote:
> On 4/7/20 5:55 PM, Dan Williams wrote:
>> On Tue, Apr 7, 2020 at 4:01 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>>> On 4/1/20 4:13 AM, Jingqi Liu wrote:
>>>> If the backend file is a devdax pmem character device, the alignment
>>>> specified by the option 'align=NUM' in the '-object memory-backend-file'
>>>> needs to match the alignment requirement of the devdax pmem character device.
>>>>
>>>> This patch fetches the devdax pmem file 'align', so that we can compare
>>>> it with the NUM of 'align=NUM'.
>>>> The NUM needs to be larger than or equal to the devdax pmem file 'align'.
>>>>
>>>> It also fixes the problem that mmap() returns failure in qemu_ram_mmap()
>>>> when the NUM of 'align=NUM' is less than the devdax pmem file 'align'.
>>>>
>>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>>> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
>>>> ---
>>>>  exec.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 45 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/exec.c b/exec.c
>>>> index de9d949902..8221abffec 100644
>>>> --- a/exec.c
>>>> +++ b/exec.c
>>>> @@ -1736,6 +1736,42 @@ static int64_t get_file_size(int fd)
>>>>      return size;
>>>>  }
>>>>
>>>> +static int64_t get_file_align(int fd)
>>>> +{
>>>> +    int64_t align = -1;
>>>> +#if defined(__linux__)
>>>> +    struct stat st;
>>>> +
>>>> +    if (fstat(fd, &st) < 0) {
>>>> +        return -errno;
>>>> +    }
>>>> +
>>>> +    /* Special handling for devdax character devices */
>>>> +    if (S_ISCHR(st.st_mode)) {
>>>> +        g_autofree char *subsystem_path = NULL;
>>>> +        g_autofree char *subsystem = NULL;
>>>> +
>>>> +        subsystem_path = g_strdup_printf("/sys/dev/char/%d:%d/subsystem",
>>>> +                                         major(st.st_rdev), minor(st.st_rdev));
>>>> +        subsystem = g_file_read_link(subsystem_path, NULL);
>>>> +
>>>> +        if (subsystem && g_str_has_suffix(subsystem, "/dax")) {
>>>> +            g_autofree char *align_path = NULL;
>>>> +            g_autofree char *align_str = NULL;
>>>> +
>>>> +            align_path = g_strdup_printf("/sys/dev/char/%d:%d/device/align",
>>>> +                                    major(st.st_rdev), minor(st.st_rdev));
>>>> +
>>>
>>> Perhaps, you meant instead:
>>>
>>>         /sys/dev/char/%d:%d/align
>>>
>>
>> Hmm, are you sure that's working? 
> 
> It is, except that I made the slight mistake of testing with a bunch of WIP
> patches on top, one of which actually adds 'align' to the child dax device.
> 
> Argh, my apologies - and thanks for noticing.
> 
>> I expect the alignment to be found
>> in the region device:
>>
>> /sys/class/dax:
>> /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus1/region1/dax1.1/dax1.0
>> $(readlink -f /sys/dev/char/253\:263)/../align
>> $(readlink -f /sys/dev/char/253\:263)/device/align
>>
>>
>> /sys/bus/dax:
>> /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus1/region1/dax1.0/dax1.0
>> $(readlink -f /sys/dev/char/253\:265)/../align
>> $(readlink -f /sys/dev/char/253\:265)/device/align <-- No such file
>>
>> The use of the /sys/dev/char/%d:%d/device is only supported by the
>> deprecated /sys/class/dax. 
> 
> I don't have the deprecated dax class enabled, as you could tell, so the second
> case is what I was testing. Except it wasn't a namespace/nvdimm but rather an
> hmem device-dax.
> 
> '../align' covers only one case, though. What about hmem, where '../align' returns
> ENOENT? Perhaps use '../dax_region/align' instead, which is common to both.
> That still wouldn't address the sub-division devices (that I mention above).

Clearly a 5.1 patch then. :)

Paolo
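
For readers following the sysfs discussion quoted above: the candidate
locations for 'align' differ between device types, so one possible approach
is to probe several relative paths in order and take the first that parses.
The sketch below is only an illustration under that assumption. The helper
name read_first_align() and its signature are hypothetical and are not part
of the patch, it uses plain libc rather than the GLib calls in the diff, and
the thread does not settle which paths a final version should try.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): try each candidate attribute
 * path relative to `dir` and return the first value that parses as a
 * positive integer, or -1 if no candidate yields one.
 */
static int64_t read_first_align(const char *dir,
                                const char *const *candidates, size_t n)
{
    assert(dir != NULL && candidates != NULL);

    for (size_t i = 0; i < n; i++) {
        char path[4096];
        char buf[64];
        char *end;
        unsigned long long val;
        FILE *f;
        int len;

        len = snprintf(path, sizeof(path), "%s/%s", dir, candidates[i]);
        if (len < 0 || (size_t)len >= sizeof(path)) {
            continue;               /* path too long, skip this candidate */
        }

        f = fopen(path, "r");
        if (f == NULL) {
            continue;               /* e.g. ENOENT: attribute not present */
        }
        if (fgets(buf, sizeof(buf), f) == NULL) {
            fclose(f);
            continue;               /* empty or unreadable attribute */
        }
        fclose(f);

        /* Base 0 accepts both decimal and 0x-prefixed values. */
        errno = 0;
        val = strtoull(buf, &end, 0);
        if (errno == 0 && end != buf && val > 0 && val <= INT64_MAX) {
            return (int64_t)val;
        }
    }
    return -1;
}
```

A caller might pass "/sys/dev/char/<major>:<minor>" as dir together with the
relative paths mentioned in this thread ("../align", "../dax_region/align",
"device/align"); the order, and whether any of them covers the sub-division
devices, is exactly what is still open above.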



