Re: [PATCH v3] libnvdimm/nsio: differentiate between probe mapping and runtime mapping

public inbox for linux-nvdimm@lists.01.org
 help / color / mirror / Atom feed

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH v3] libnvdimm/nsio: differentiate between probe mapping and runtime mapping
Date: Thu, 31 Oct 2019 09:40:48 +0530	[thread overview]
Message-ID: <f24ff341-95bf-4840-535a-d558b4f084f9@linux.ibm.com> (raw)
In-Reply-To: <CAPcyv4g9bU+E6JuOT34gUjQGP4LBYg7w6Y+5ei5pxD3frFXcaw@mail.gmail.com>

On 10/31/19 9:25 AM, Dan Williams wrote:
> On Wed, Oct 30, 2019 at 4:33 PM Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> On Thu, Oct 24, 2019 at 2:07 AM Aneesh Kumar K.V
>> <aneesh.kumar@linux.ibm.com> wrote:
>>>
>>> On 10/24/19 7:36 AM, Dan Williams wrote:
>>>> On Thu, Oct 17, 2019 at 12:33 AM Aneesh Kumar K.V
>>>> <aneesh.kumar@linux.ibm.com> wrote:
>>>>>
>>>>> nvdimm core currently maps the full namespace to an ioremap range
>>>>> while probing the namespace mode. This can result in probe failures
>>>>> on architectures that have limited ioremap space.
>>>>>
>>>>> For example, with a large btt namespace that consumes most of I/O remap
>>>>> range, depending on the sequence of namespace initialization, the user can find
>>>>> a pfn namespace initialization failure due to unavailable I/O remap space
>>>>> which nvdimm core uses for temporary mapping.
>>>>>
>>>>> nvdimm core can avoid this failure by only mapping the reserver block area to
>>>>
>>>> s/reserver/reserved/
>>>
>>> Will fix
>>>
>>>
>>>>
>>>>> check for pfn superblock type and map the full namespace resource only before
>>>>> using the namespace.
>>>>
>>>> Are you going to follow up with the BTT patch that uses this new facility?
>>>>
>>>
>>> I am not yet sure about that. ioremap/vmap area is the way we get a
>>> kernel mapping without struct page backing. What we are suggesting hee
>>> is the ability to have a direct mapped mapping without struct page. I
>>> need to look at closely about what that imply.
>>>
>>>>>
>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>>> ---
>>>>> Changes from v2:
>>>>> * update changelog
>>>>>
>>>>> Changes from V1:
>>>>> * update changelog
>>>>> * update patch based on review feedback.
>>>>>
>>>>>    drivers/dax/pmem/core.c   |  2 +-
>>>>>    drivers/nvdimm/claim.c    |  7 +++----
>>>>>    drivers/nvdimm/nd.h       |  4 ++--
>>>>>    drivers/nvdimm/pfn.h      |  6 ++++++
>>>>>    drivers/nvdimm/pfn_devs.c |  5 -----
>>>>>    drivers/nvdimm/pmem.c     | 15 ++++++++++++---
>>>>>    6 files changed, 24 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dax/pmem/core.c b/drivers/dax/pmem/core.c
>>>>> index 6eb6dfdf19bf..f174dbfbe1c4 100644
>>>>> --- a/drivers/dax/pmem/core.c
>>>>> +++ b/drivers/dax/pmem/core.c
>>>>> @@ -28,7 +28,7 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys)
>>>>>           nsio = to_nd_namespace_io(&ndns->dev);
>>>>>
>>>>>           /* parse the 'pfn' info block via ->rw_bytes */
>>>>> -       rc = devm_nsio_enable(dev, nsio);
>>>>> +       rc = devm_nsio_enable(dev, nsio, info_block_reserve());
>>>>>           if (rc)
>>>>>                   return ERR_PTR(rc);
>>>>>           rc = nvdimm_setup_pfn(nd_pfn, &pgmap);
>>>>> diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c
>>>>> index 2985ca949912..d89d2c039e25 100644
>>>>> --- a/drivers/nvdimm/claim.c
>>>>> +++ b/drivers/nvdimm/claim.c
>>>>> @@ -300,12 +300,12 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns,
>>>>>           return rc;
>>>>>    }
>>>>>
>>>>> -int devm_nsio_enable(struct device *dev, struct nd_namespace_io *nsio)
>>>>> +int devm_nsio_enable(struct device *dev, struct nd_namespace_io *nsio, unsigned long size)
>>>>>    {
>>>>>           struct resource *res = &nsio->res;
>>>>>           struct nd_namespace_common *ndns = &nsio->common;
>>>>>
>>>>> -       nsio->size = resource_size(res);
>>>>> +       nsio->size = size;
>>>>
>>>> This needs a:
>>>>
>>>> if (nsio->size)
>>>>      devm_memunmap(dev, nsio->addr);
>>>
>>>
>>> Why ? we should not be calling devm_nsio_enable twice now.
>>>
>>>>
>>>>
>>>>>           if (!devm_request_mem_region(dev, res->start, resource_size(res),
>>>>>                                   dev_name(&ndns->dev))) {
>>>>
>>>> Also don't repeat the devm_request_mem_region() if one was already done.
>>>>
>>>> Another thing to check is if nsio->size gets cleared when the
>>>> namespace is disabled, if not that well need to be added in the
>>>> shutdown path.
>>>>
>>>
>>>
>>> Not sure I understand that. So with this patch when we probe we always
>>> use info_block_reserve() size irrespective of the device. This probe
>>> will result in us adding a btt/pfn/dax device and we return -ENXIO after
>>> this probe. This return value will cause us to destroy the I/O remap
>>> mmapping we did with info_block_reserve() size. Also the nsio data
>>> structure is also freed.
>>>
>>> nd_pmem_probe is then again called with btt device type and in that case
>>> we do  devm_memremap again.
>>>
>>> if (btt)
>>>       remap the full namespace size.
>>> else
>>>      remap the info_block_reserve_size.
>>>
>>>
>>> This infor block reserve mapping is unmapped in pmem_attach_disk()
>>>
>>>
>>> Anything i am missing in the above flow?
>>
>> Oh no, you're right, this does make the implementation only call
>> devm_nsio_enable() once. However, now that I look I want to suggest
>> some small reorganizations of where devm_nsio_enable() is called. I'll
>> take a stab at folding some changes on top of your patch.
> 
> This change is causing the "pfn-meta-errors.sh" test to fail. It is
> due to the fact the info_block_reserve() is not a sufficient
> reservation for errors in the page metadata area.
> 

Can you share the call stack? we clear bad blocks on write isn't? and we 
do only read while probing?

We are still working on enabling `ndctl test` to run on POWER. Hence was 
not able capture these issues. Will try to get that resolved soon.

-aneesh
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

next prev parent reply	other threads:[~2019-10-31  4:10 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-17  7:33 [PATCH v3] libnvdimm/nsio: differentiate between probe mapping and runtime mapping Aneesh Kumar K.V
2019-10-24  2:06 ` Dan Williams
2019-10-24  9:07   ` Aneesh Kumar K.V
2019-10-30 23:33     ` Dan Williams
2019-10-31  3:55       ` Dan Williams
2019-10-31  4:10         ` Aneesh Kumar K.V [this message]
2019-10-31  4:19           ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f24ff341-95bf-4840-535a-d558b4f084f9@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-nvdimm@lists.01.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox