From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH v3] libnvdimm/nsio: differentiate between probe mapping and runtime mapping
Date: Thu, 31 Oct 2019 09:40:48 +0530 [thread overview]
Message-ID: <f24ff341-95bf-4840-535a-d558b4f084f9@linux.ibm.com> (raw)
In-Reply-To: <CAPcyv4g9bU+E6JuOT34gUjQGP4LBYg7w6Y+5ei5pxD3frFXcaw@mail.gmail.com>
On 10/31/19 9:25 AM, Dan Williams wrote:
> On Wed, Oct 30, 2019 at 4:33 PM Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> On Thu, Oct 24, 2019 at 2:07 AM Aneesh Kumar K.V
>> <aneesh.kumar@linux.ibm.com> wrote:
>>>
>>> On 10/24/19 7:36 AM, Dan Williams wrote:
>>>> On Thu, Oct 17, 2019 at 12:33 AM Aneesh Kumar K.V
>>>> <aneesh.kumar@linux.ibm.com> wrote:
>>>>>
>>>>> nvdimm core currently maps the full namespace to an ioremap range
>>>>> while probing the namespace mode. This can result in probe failures
>>>>> on architectures that have limited ioremap space.
>>>>>
>>>>> For example, with a large btt namespace that consumes most of I/O remap
>>>>> range, depending on the sequence of namespace initialization, the user can find
>>>>> a pfn namespace initialization failure due to unavailable I/O remap space
>>>>> which nvdimm core uses for temporary mapping.
>>>>>
>>>>> nvdimm core can avoid this failure by only mapping the reserver block area to
>>>>
>>>> s/reserver/reserved/
>>>
>>> Will fix
>>>
>>>
>>>>
>>>>> check for pfn superblock type and map the full namespace resource only before
>>>>> using the namespace.
>>>>
>>>> Are you going to follow up with the BTT patch that uses this new facility?
>>>>
>>>
>>> I am not yet sure about that. ioremap/vmap area is the way we get a
>>> kernel mapping without struct page backing. What we are suggesting hee
>>> is the ability to have a direct mapped mapping without struct page. I
>>> need to look at closely about what that imply.
>>>
>>>>>
>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>>> ---
>>>>> Changes from v2:
>>>>> * update changelog
>>>>>
>>>>> Changes from V1:
>>>>> * update changelog
>>>>> * update patch based on review feedback.
>>>>>
>>>>> drivers/dax/pmem/core.c | 2 +-
>>>>> drivers/nvdimm/claim.c | 7 +++----
>>>>> drivers/nvdimm/nd.h | 4 ++--
>>>>> drivers/nvdimm/pfn.h | 6 ++++++
>>>>> drivers/nvdimm/pfn_devs.c | 5 -----
>>>>> drivers/nvdimm/pmem.c | 15 ++++++++++++---
>>>>> 6 files changed, 24 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dax/pmem/core.c b/drivers/dax/pmem/core.c
>>>>> index 6eb6dfdf19bf..f174dbfbe1c4 100644
>>>>> --- a/drivers/dax/pmem/core.c
>>>>> +++ b/drivers/dax/pmem/core.c
>>>>> @@ -28,7 +28,7 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys)
>>>>> nsio = to_nd_namespace_io(&ndns->dev);
>>>>>
>>>>> /* parse the 'pfn' info block via ->rw_bytes */
>>>>> - rc = devm_nsio_enable(dev, nsio);
>>>>> + rc = devm_nsio_enable(dev, nsio, info_block_reserve());
>>>>> if (rc)
>>>>> return ERR_PTR(rc);
>>>>> rc = nvdimm_setup_pfn(nd_pfn, &pgmap);
>>>>> diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c
>>>>> index 2985ca949912..d89d2c039e25 100644
>>>>> --- a/drivers/nvdimm/claim.c
>>>>> +++ b/drivers/nvdimm/claim.c
>>>>> @@ -300,12 +300,12 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns,
>>>>> return rc;
>>>>> }
>>>>>
>>>>> -int devm_nsio_enable(struct device *dev, struct nd_namespace_io *nsio)
>>>>> +int devm_nsio_enable(struct device *dev, struct nd_namespace_io *nsio, unsigned long size)
>>>>> {
>>>>> struct resource *res = &nsio->res;
>>>>> struct nd_namespace_common *ndns = &nsio->common;
>>>>>
>>>>> - nsio->size = resource_size(res);
>>>>> + nsio->size = size;
>>>>
>>>> This needs a:
>>>>
>>>> if (nsio->size)
>>>> devm_memunmap(dev, nsio->addr);
>>>
>>>
>>> Why ? we should not be calling devm_nsio_enable twice now.
>>>
>>>>
>>>>
>>>>> if (!devm_request_mem_region(dev, res->start, resource_size(res),
>>>>> dev_name(&ndns->dev))) {
>>>>
>>>> Also don't repeat the devm_request_mem_region() if one was already done.
>>>>
>>>> Another thing to check is if nsio->size gets cleared when the
>>>> namespace is disabled, if not that well need to be added in the
>>>> shutdown path.
>>>>
>>>
>>>
>>> Not sure I understand that. So with this patch when we probe we always
>>> use info_block_reserve() size irrespective of the device. This probe
>>> will result in us adding a btt/pfn/dax device and we return -ENXIO after
>>> this probe. This return value will cause us to destroy the I/O remap
>>> mmapping we did with info_block_reserve() size. Also the nsio data
>>> structure is also freed.
>>>
>>> nd_pmem_probe is then again called with btt device type and in that case
>>> we do devm_memremap again.
>>>
>>> if (btt)
>>> remap the full namespace size.
>>> else
>>> remap the info_block_reserve_size.
>>>
>>>
>>> This infor block reserve mapping is unmapped in pmem_attach_disk()
>>>
>>>
>>> Anything i am missing in the above flow?
>>
>> Oh no, you're right, this does make the implementation only call
>> devm_nsio_enable() once. However, now that I look I want to suggest
>> some small reorganizations of where devm_nsio_enable() is called. I'll
>> take a stab at folding some changes on top of your patch.
>
> This change is causing the "pfn-meta-errors.sh" test to fail. It is
> due to the fact the info_block_reserve() is not a sufficient
> reservation for errors in the page metadata area.
>
Can you share the call stack? we clear bad blocks on write isn't? and we
do only read while probing?
We are still working on enabling `ndctl test` to run on POWER. Hence was
not able capture these issues. Will try to get that resolved soon.
-aneesh
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
next prev parent reply other threads:[~2019-10-31 4:10 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-17 7:33 [PATCH v3] libnvdimm/nsio: differentiate between probe mapping and runtime mapping Aneesh Kumar K.V
2019-10-24 2:06 ` Dan Williams
2019-10-24 9:07 ` Aneesh Kumar K.V
2019-10-30 23:33 ` Dan Williams
2019-10-31 3:55 ` Dan Williams
2019-10-31 4:10 ` Aneesh Kumar K.V [this message]
2019-10-31 4:19 ` Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f24ff341-95bf-4840-535a-d558b4f084f9@linux.ibm.com \
--to=aneesh.kumar@linux.ibm.com \
--cc=dan.j.williams@intel.com \
--cc=linux-nvdimm@lists.01.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox