From: NeilBrown <neilb@suse.com>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] Lustre upstream client TODO list
Date: Mon, 12 Feb 2018 10:54:51 +1100
Message-ID: <87bmgvrsj8.fsf@notabene.neil.brown.name>
In-Reply-To: <alpine.LFD.2.21.1802112308150.3291@casper.infradead.org>
On Sun, Feb 11 2018, James Simmons wrote:
> So I sent a patch upstream that laid out what most needs to be done for
> the linux lustre client to leave staging. I placed the new text here for
> ease of reading so you don't have to go searching for it. Feedback is
> welcome. Hopefully posting it will make it clear what needs to be done.
Thanks so much for putting this together and pushing it out. I really
appreciate it and hope to show that appreciation with patches :-)
NeilBrown
>
> Currently all the work directed toward the lustre upstream client is tracked
> at the following link:
>
> https://jira.hpdd.intel.com/browse/LU-9679
>
> Under this ticket you will see the following work items that need to be
> addressed:
>
> ******************************************************************************
> * libcfs cleanup
> *
> * https://jira.hpdd.intel.com/browse/LU-9859
> *
> * Track all the cleanups and simplification of the libcfs module. Remove
> * functions the kernel provides. Possibly integrate some of the functionality
> * into the kernel proper.
> *
> ******************************************************************************
>
> https://jira.hpdd.intel.com/browse/LU-100086
>
> LNET_MINOR conflicts with USERIO_MINOR
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8130
>
> Fix and simplify libcfs hash handling
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8703
>
> The current way we handle SMP is wrong. Platforms like ARM and KNL can have
> core and NUMA setups with things like NUMA nodes with no cores. We need to
> handle such cases. This work also greatly simplified the lustre SMP code.
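
A rough userspace sketch of the "NUMA nodes with no cores" case mentioned
above. It is not the Lustre CPT code; assign_partitions() and its arguments
are hypothetical names made up for illustration. The idea is only that
partitions are spread round-robin over nodes that actually have CPUs, so a
memory-only node never ends up owning a partition:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from libcfs): give each CPU partition a home
 * NUMA node, cycling only over nodes that have at least one CPU.
 * cpus_per_node[i] is the CPU count of node i (0 for a memory-only node).
 * Returns the number of partitions assigned; 0 if no node has any CPUs.
 */
static size_t assign_partitions(const unsigned int *cpus_per_node,
				size_t nr_nodes,
				size_t *part_node, size_t nr_parts)
{
	size_t assigned = 0;
	size_t node = 0;
	size_t empty_run = 0;	/* consecutive CPU-less nodes seen so far */

	if (nr_nodes == 0)
		return 0;

	/* Stop once a full pass over the nodes finds no CPUs at all. */
	while (assigned < nr_parts && empty_run < nr_nodes) {
		if (cpus_per_node[node] > 0) {
			part_node[assigned++] = node;
			empty_run = 0;
		} else {
			empty_run++;
		}
		node = (node + 1) % nr_nodes;
	}
	return assigned;
}
```

With four nodes where only nodes 0 and 2 have CPUs, four partitions land on
nodes 0, 2, 0, 2, and the empty nodes are skipped.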
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9019
>
> Replace libcfs time API with standard kernel APIs. Also migrate away from
> jiffies. We found jiffies can vary on nodes which can lead to corner cases
> that can break the file system due to nodes having inconsistent behavior.
> So move to time64_t and ktime_t as much as possible.
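
A small userspace illustration of the jiffies concern above, under the
assumption (not stated in the original) that the variation comes from nodes
being built with different CONFIG_HZ values. ticks_to_ns() and
seconds_to_ns() are hypothetical helpers, not libcfs or kernel functions.
The same tick count corresponds to different wall-clock durations on
different nodes, while a value kept in seconds or nanoseconds does not
depend on HZ:

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Wall-clock length, in nanoseconds, of a timeout stored as a tick count.
 * hz must be non-zero; this sketch assumes hz divides NSEC_PER_SEC evenly
 * (true for common values such as 100, 250 and 1000).
 */
static uint64_t ticks_to_ns(uint64_t ticks, unsigned int hz)
{
	return ticks * (NSEC_PER_SEC / hz);
}

/* A timeout stored in seconds means the same thing on every node. */
static uint64_t seconds_to_ns(uint64_t seconds)
{
	return seconds * NSEC_PER_SEC;
}
```

A raw value of 250 ticks is 250 ms at HZ=1000, 1 s at HZ=250 and 2.5 s at
HZ=100, which is the kind of inconsistency that storing time64_t or ktime_t
values avoids.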
>
> ******************************************************************************
> * Proper IB support for ko2iblnd
> ******************************************************************************
> https://jira.hpdd.intel.com/browse/LU-9179
>
> Poor performance for the ko2iblnd driver. This is related to many of the
> patches below that are missing from the linux client.
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9886
>
> Crash in upstream kiblnd_handle_early_rxs()
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10394 / LU-10526 / LU-10089
>
> Default to using MEM_REG
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10459
>
> throttle tx based on queue depth
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9943
>
> correct WR fast reg accounting
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10291
>
> remove concurrent_sends tunable
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10213
>
> calculate qp max_send_wrs properly
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9810
>
> use less CQ entries for each connection
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10129 / LU-9180
>
> rework map_on_demand behavior
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10129
>
> query device capabilities
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10015
>
> fix race at kiblnd_connect_peer
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9983
>
> allow for discontiguous fragments
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9500
>
> Don't Page Align remote_addr with FastReg
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9448
>
> handle empty CPTs
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9507
>
> Don't Assert On Reconnect with MultiQP
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9472
>
> Fix FastReg map/unmap for MLX5
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9425
>
> Turn on 2 sges by default
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8943
>
> Enable Multiple OPA Endpoints between Nodes
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-5718
>
> multiple sges for work request
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9094
>
> kill timedout txs from ibp_tx_queue
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9094
>
> reconnect peer for REJ_INVALID_SERVICE_ID
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8752
>
> Stop MLX5 triggering a dump_cqe
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8874
>
> Move ko2iblnd to latest RDMA changes
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8875 / LU-8874
>
> Change to new RDMA done callback mechanism
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9164 / LU-8874
>
> Incorporate RDMA map/unmap APIs into ko2iblnd
>
> ******************************************************************************
> * sysfs/debugfs fixes
> *
> * https://jira.hpdd.intel.com/browse/LU-8066
> *
> * The original migration to sysfs was done in haste without properly working
> * utilities to test the changes. This covers the work to restore the proper
> * behavior. Huge project to make this right.
> *
> ******************************************************************************
>
> https://jira.hpdd.intel.com/browse/LU-9431
>
> The function class_process_proc_param was used for our mass updates of proc
> tunables. It didn't work with sysfs and it was just ugly so it was removed.
> In the process the ability to mass update thousands of clients was lost. This
> work restores this in a sane way.
>
> ------------------------------------------------------------------------------
> https://jira.hpdd.intel.com/browse/LU-9091
>
> One of the major requests from users is the ability to pass parameters to a
> sysfs file in various units. For example, we can set max_pages_per_rpc,
> but this can vary on platforms due to different platform sizes. So you can
> set this like max_pages_per_rpc=16MiB. The original code to handle this was
> written before the string helpers were created, so the code doesn't follow that
> format, but it would be easy to move to. Currently the string helpers do the
> reverse of what we need, changing bytes to a string. We need to change a string
> to bytes.
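
A minimal userspace sketch of the string-to-bytes direction described above.
parse_size_bytes() and size_to_pages() are hypothetical names, not existing
Lustre or kernel helpers. Two assumptions are made here that the original
does not state: every prefix (K, M, G, T), with or without "B" or "iB", is
treated as a power of two, and the platform difference that matters is the
page size:

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "16MiB", "4k", "512B" or "1048576" into bytes.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
static int parse_size_bytes(const char *str, uint64_t *bytes)
{
	unsigned long long val;
	unsigned int shift = 0;
	const char *unit;
	char *end;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (!str || !bytes || !isdigit((unsigned char)*str))
		return -EINVAL;

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;

	unit = end;
	switch (toupper((unsigned char)*unit)) {
	case 'K': shift = 10; unit++; break;
	case 'M': shift = 20; unit++; break;
	case 'G': shift = 30; unit++; break;
	case 'T': shift = 40; unit++; break;
	default: break;
	}

	/* Optional "iB" after a prefix, or a bare "B". */
	if (shift && unit[0] == 'i' && unit[1] == 'B')
		unit += 2;
	else if (unit[0] == 'B')
		unit++;
	if (*unit != '\0')
		return -EINVAL;

	if (shift && val > (UINT64_MAX >> shift))
		return -ERANGE;

	*bytes = (uint64_t)val << shift;
	return 0;
}

/* Convert a byte count to whole pages for a given page size. */
static uint64_t size_to_pages(uint64_t bytes, uint64_t page_size)
{
	return page_size ? bytes / page_size : 0;
}
```

Under these assumptions "16MiB" is 4096 pages with 4 KiB pages and 256 pages
with 64 KiB pages, so the same setting string works on both platforms.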
>
> ******************************************************************************
> * Proper user land to kernel space interface for Lustre
> *
> * https://jira.hpdd.intel.com/browse/LU-9680
> *
> ******************************************************************************
>
> https://jira.hpdd.intel.com/browse/LU-8915
>
> Don't use linux list structure as user land arguments for lnet selftest.
> This code is pretty poor quality and really needs to be reworked.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8834
>
> The lustre ioctl LL_IOC_FUTIMES_3 is very generic. Need to either work with
> other file systems with similar functionality and make a common syscall
> interface or rework our server code to automagically do it for us.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-6202
>
> Clean up ioctl handling. We have many obsolete ioctls. Also the way we do
> ioctls can be changed over to netlink. This also has the benefit of working
> better with HPC systems that do IO forwarding. Such systems don't like ioctls
> very well.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9667
>
> More cleanups by making our utilities use sysfs instead of ioctls for LNet.
> Also it has been requested to move the remaining ioctls to the netlink API.
>
> ******************************************************************************
> * Misc
> ******************************************************************************
>
> ------------------------------------------------------------------------------
> https://jira.hpdd.intel.com/browse/LU-9855
>
> Clean up obdclass preprocessor code. One of the major eyesores is the various
> pointer redirections and macros used by the obdclass. This makes the code very
> difficult to understand. Al Viro requested that this be cleaned up before
> we leave staging.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9633
>
> Migrate to sphinx kernel-doc style comments. Add documents in Documentation.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-6142
>
> Possible remaining coding style fixes. Remove dead code. Enforce kernel code
> style. Other minor misc cleanups...
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8837
>
> Separate client/server functionality. Functions only used by the server can be
> removed from the client. Most of this has been done but we need to inspect the
> code to make sure.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8964
>
> Lustre client readahead/writeback control needs to better suit what the kernel
> provides. Currently this is being explored. We could end up replacing the CLIO
> readahead abstraction with the kernel proper version.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9862
>
> Patch that landed for LU-7890 leads to static checker errors
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9868
>
> dcache/namei fixes for lustre
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10467
>
> Use standard Linux wait_event macros; work by Neil Brown
>
> ------------------------------------------------------------------------------