From: Kyle Moffett <mrlinuxman@mac.com>
To: Lars Ellenberg <lars@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>, Andrew Morton <akpm@osdl.org>,
lkml <linux-kernel@vger.kernel.org>,
drbd-user@linbit.com
Subject: Re: [DRIVER SUBMISSION] DRBD wants to go mainline
Date: Sun, 22 Jul 2007 21:32:02 -0400 [thread overview]
Message-ID: <20070722213202.1f5d1cab.mrlinuxman@mac.com> (raw)
In-Reply-To: <20070721203819.GA10706@mail.linbit.com>
Ok, I didn't have a chance to get through anywhere near all of it, but
here's my comments so far. I didn't really go through things in any
particular order but most of these comments are about your drbd_int.h
header file. Hopefully a lot of the comments will be useful to fix
similar/identical problems in your C files.
First of all, if you could break this up into chunks (even if they
aren't useful individually) just to make it easier to review.
Divisions like "This code does on-disk bitmap management", and "This
code does network protocol encoding/decoding" would be extremely
helpful when digging through this stuff. I ended up only really doing
a cursory review for low-level/style issues, since without the bigger
picture (and without the fixed style issues and macro cleanups) it's
very hard to really give a good high-level review.
Cheers,
Kyle Moffett
+drbd-objs := drbd_buildtag.o drbd_bitmap.o drbd_proc.o \
+ drbd_worker.o drbd_receiver.o drbd_req.o drbd_actlog.o \
+ lru_cache.o drbd_main.o drbd_strings.o drbd_nl.o
+
+obj-$(CONFIG_BLK_DEV_DRBD) += drbd.o
Don't use foo-objs, use foo-y instead.
+#undef DEVICE_NAME
+#define DEVICE_NAME "drbd"
This is never actually defined/redefined anywhere else. Just hardcode the
"drbd" in your printk strings and save yourself 5-6 characters per use.
+/* I don't remember why XCPU ...
+ * This is used to wake the asender,
+ * and to interrupt sending the sending task
+ * on disconnect.
+ */
+#define DRBD_SIG SIGXCPU
+
+/* This is used to stop/restart our threads.
+ * Cannot use SIGTERM nor SIGKILL, since these
+ * are sent out by init on runlevel changes
+ * I choose SIGHUP for now.
+ *
+ * FIXME btw, we should register some reboot notifier.
+ */
+#define DRBD_SIGKILL SIGHUP
Don't use signals between kernel threads, use proper primitives like
notifiers and waitqueues, which means you should also probably switch away
from kernel_thread() to the kthread_*() APIs. Also you should fix this
FIXME or remove it if it no longer applies:-D.
+#ifdef PARANOIA
+# define PARANOIA_BUG_ON(x) BUG_ON(x)
+#else
+# define PARANOIA_BUG_ON(x)
+#endif
This is only ever used in one place for a simple != test:
+ PARANOIA_BUG_ON(w != &mdev->resync_work);
Just delete PARANOIA_BUG_ON and convert the above to a straight BUG_ON()
+#define STATIC
+#define STATIC static
These two lines are found in different files, but the symbol "STATIC" isn't
used anywhere. Just get rid of it.
+/* handy macro: DUMPP(somepointer) */
+#define DUMPP(A) ERR( #A " = %p in %s:%d\n", (A), __FILE__, __LINE__);
[...]
+/* Info: do not remove the spaces around the "," before ##
+ * Otherwise this is not portable from gcc-2.95 to gcc-3.3 */
+#define PRINTK(level, fmt, args...) \
+ printk(level DEVICE_NAME "%d: " fmt, \
+ mdev->minor , ##args)
+
+#define ALERT(fmt, args...) PRINTK(KERN_ALERT, fmt , ##args)
[...]
No more custom debugging macros please, we have plenty of standardized ones
in the kernel already (and all togther too many nonstandardized ones).
Please take a good look at dev_printk, etc to make some of your printk()s
shorter, but don't add more icky macros. Also gcc < 3.1 is now unsupported,
so please remove gcc-2.95 portability comments/cruft (although in this case
the code itself doesn't need changing, just the comments).
+/* see kernel/printk.c:printk_ratelimit
+ * macro, so it is easy do have independend rate limits at different locations
+ * "initializer element not constant ..." with kernel 2.4 :(
+ * so I initialize toks to something large
+ */
+#define DRBD_ratelimit(ratelimit_jiffies, ratelimit_burst) \
Any particular reason you can't just use printk_ratelimit for this? Also you
should remove any linux 2.4-related code/comments as they won't apply for
code submitted to 2.6 mainline.
+#ifdef DBG_ASSERTS
+extern void drbd_assert_breakpoint(struct drbd_conf *, char *, char *, int );
+# define D_ASSERT(exp) if (!(exp)) \
+ drbd_assert_breakpoint(mdev, #exp, __FILE__, __LINE__)
+#else
+# define D_ASSERT(exp) if (!(exp)) \
+ ERR("ASSERT( " #exp " ) in %s:%d\n", __FILE__, __LINE__)
+#endif
+#define ERR_IF(exp) if (({ \
+ int _b = (exp) != 0; \
+ if (_b) ERR("%s: (" #exp ") in %s:%d\n", \
+ __func__, __FILE__, __LINE__); \
+ _b; \
+ }))
Yuck, more debugging macros.
+/* Defines to control fault insertion */
+enum {
+ DRBD_FAULT_MD_WR = 0, /* meta data write */
+ DRBD_FAULT_MD_RD, /* read */
+ DRBD_FAULT_RS_WR, /* resync */
+ DRBD_FAULT_RS_RD,
+ DRBD_FAULT_DT_WR, /* data */
+ DRBD_FAULT_DT_RD,
+ DRBD_FAULT_DT_RA, /* data read ahead */
+
+ DRBD_FAULT_MAX,
+};
We have some existing failure-injection code, any chance you could rip out
your custom logic and just plug that? I haven't looked over it, though, so
I can't really offer any useful suggestions about it.
+#include <linux/stringify.h>
Put the include at the top of the file, please?
+/* integer division, round _UP_ to the next integer */
+#define div_ceil(A, B) ( (A)/(B) + ((A)%(B) ? 1 : 0) )
+/* usual integer division */
+#define div_floor(A, B) ( (A)/(B) )
Your div_ceil function looks OK but I think we already have a standard one
somewhere. You might remove that div_floor function, though, as it isn't
used anywhere.
+/*
+ * Compatibility Section
+ *************************/
+
+#define LOCK_SIGMASK(task, flags) spin_lock_irqsave(&task->sighand->siglock, flags)
+#define UNLOCK_SIGMASK(task, flags) spin_unlock_irqrestore(&task->sighand->siglock, flags)
+#define RECALC_SIGPENDING() recalc_sigpending();
These aren't used anywhere, remove please?
+#if defined(DBG_SPINLOCKS) && defined(__SMP__)
+# define MUST_HOLD(lock) if (!spin_is_locked(lock)) { ERR("Not holding lock! in %s\n", __FUNCTION__ ); }
+#else
+# define MUST_HOLD(lock)
+#endif
Why not just open-code "WARN_ON_SMP(!spin_is_locked(lock))" in the few places
this is used? Alternatively if you need this to actually BUG(), you might
either add a BUG_ON_SMP() to include/asm-generic/bug.h or add a helper to
include/linux/spinlock.h:
#if defined(CONFIG_SMP) && defined(CONFIG_DEBUG_SPINLOCK)
# define spin_lock_musthold(lock) BUG_ON(!spin_is_locked(lock))
#else
# define spin_lock_musthold(lock) do { } while(0)
#endif
+#define SET_MDEV_MAGIC(x) \
+ ({ typecheck(struct drbd_conf*, x); \
+ (x)->magic = (long)(x) ^ DRBD_MAGIC; })
+#define IS_VALID_MDEV(x) \
+ ( typecheck(struct drbd_conf*, x) && \
+ ((x) ? (((x)->magic ^ DRBD_MAGIC) == (long)(x)):0))
SET_MDEV_MAGIC is only used in one place, can't you open-code it there? Even
if it were used lots of places a little inline function would handle the
"typecheck" for you and be easier to read. The IS_VALID_MDEV() isn't used
anywhere, so delete it.
+enum Drbd_thread_state {
+ None,
+ Running,
+ Exiting,
+ Restarting
+};
+
+struct Drbd_thread {
+ spinlock_t t_lock;
+ struct task_struct *task;
+ struct completion startstop;
+ enum Drbd_thread_state t_state;
+ int (*function) (struct Drbd_thread *);
+ struct drbd_conf *mdev;
+};
As somebody already mentioned, you need to switch from kernel_thread to
kthread_* helpers.
+/*
+ * Having this as the first member of a struct provides sort of "inheritance".
+ * "derived" structs can be "drbd_queue_work()"ed.
+ * The callback should know and cast back to the descendant struct.
+ * drbd_request and Tl_epoch_entry are descendants of drbd_work.
+ */
+struct drbd_work;
+typedef int (*drbd_work_cb)(struct drbd_conf *, struct drbd_work *, int cancel);
+struct drbd_work {
+ struct list_head list;
+ drbd_work_cb cb;
+};
+
+struct drbd_barrier;
+struct drbd_request {
+ struct drbd_work w;
Eww, please don't do opencoded casting of these. You should use the proper
container_of() macro instead:
> struct foo {
> int a;
> };
> struct bar {
> int b;
> struct foo thefoo;
> };
> void mycallback(struct foo *myfoo)
> {
> struct bar *mybar = container_of(myfoo, struct bar, thefoo);
> process_a_bar(mybar);
> }
It's not *much* better typechecking, but it's at least a little. It also
lets you put a struct drbd_work at a non-zero offset inside of some other
struct, as container_of() does internal subtraction of the offsetof().
+/* THINK maybe we actually want to use the default "event/%s" worker threads
+ * or similar in linux 2.6, which uses per cpu data and threads.
+ *
+ * To be general, this might need a spin_lock member.
+ * For now, please use the mdev->req_lock to protect list_head,
+ * see drbd_queue_work below.
+ */
+struct drbd_work_queue {
+ struct list_head q;
+ struct semaphore s; /* producers up it, worker down()s it */
+ spinlock_t q_lock; /* to protect the list. */
+};
Umm, how about fixing this to actually use proper workqueues or something
instead of this open-coded mess?
+/* for sync_conf and other types... */
+#define PACKET(name, number, fields) struct name { fields };
+#define INTEGER(pn, pr, member) int member;
+#define INT64(pn, pr, member) __u64 member;
+#define BIT(pn, pr, member) unsigned member : 1;
+#define STRING(pn, pr, member, len) \
+ unsigned char member[len]; int member ## _len;
+#include "linux/drbd_nl.h"
The only light at the end of this tunnel is the greedily burning fires of
Macro Hell(TM). Isn't there any better way to do this than all those horrid
macros? I can sorta see what drbd_nl.h is trying to do, but the way it's
included multiple times with all those insane macros defined makes it really
hard to mentally parse.
+#ifdef PARANOIA
+ long magic;
+#endif
Any particular reason you can't just unconditionally use a magic number?
It's little more than an extra word per device and a couple simple integer
tests.
+/* returns 1 if it was successfull,
+ * returns 0 if there was no data socket.
+ * so wherever you are going to use the data.socket, e.g. do
+ * if (!drbd_get_data_sock(mdev))
+ * return 0;
+ * CODE();
+ * drbd_put_data_sock(mdev);
+ */
+static inline int drbd_get_data_sock(struct drbd_conf *mdev)
+{
+ down(&mdev->data.mutex);
+ /* drbd_disconnect() could have called drbd_free_sock()
+ * while we were waiting in down()... */
+ if (unlikely(mdev->data.socket == NULL)) {
+ up(&mdev->data.mutex);
+ return 0;
+ }
+ return 1;
+}
Any reason this can't be open-coded? The extra unlock logic should just be
moved into the cleanup handlers for the appropriate function. It looks like
you are using the semaphore as a simple binary mutex, so please switch this
to "struct mutex" as well.
+#if BITS_PER_LONG == 32
+#define LN2_BPL 5
+#define cpu_to_lel(A) cpu_to_le32(A)
+#define lel_to_cpu(A) le32_to_cpu(A)
+#elif BITS_PER_LONG == 64
+#define LN2_BPL 6
+#define cpu_to_lel(A) cpu_to_le64(A)
+#define lel_to_cpu(A) le64_to_cpu(A)
+#else
+#error "LN2 of BITS_PER_LONG unknown!"
+#endif
Hrm, any reason you can't just make the bitmap an array of __u64 and always
use the cpu_to_le64()/le64_to_cpu() functions?
+/* I want the packet to fit within one page
+ * THINK maybe use a special bitmap header,
+ * including offset and compression scheme and whatnot
+ * Do not use PAGE_SIZE here! Use a architecture agnostic constant!
+ */
+#define BM_PACKET_WORDS ((4096-sizeof(struct Drbd_Header))/sizeof(long))
Yuck. Definitely use PAGE_SIZE here, so at least if it's broken on an arch
with multiple page sizes, somebody can grep for PAGE_SIZE to fix it. It also
means that on archs/configs with 8k or 64k pages you won't waste a bunch of
memory.
+/* Dynamic tracing framework */
+#ifdef ENABLE_DYNAMIC_TRACE
+
+extern int trace_type;
+extern int trace_devs;
[...]
AGH!! Not another dynamic tracing framework! Please use one of the existing
solutions like kprobes or something. Maybe define some static tracepoints
if that would be useful?
+static inline void drbd_tcp_cork(struct socket *sock)
+{
+#if 1
+ mm_segment_t oldfs = get_fs();
+ int val = 1;
+
+ set_fs(KERNEL_DS);
+ tcp_setsockopt(sock->sk, SOL_TCP, TCP_CORK, (char *)&val, sizeof(val) );
+ set_fs(oldfs);
+#else
+ tcp_sk(sock->sk)->nonagle |= TCP_NAGLE_CORK;
+#endif
+}
Yuck, why'd you do it this way? Probably because your tcp_sk(sock->sk) stuff
doesn't have proper locking, I'll bet. You can avoid all the extra wrapper
crap by just looking in "do_tcp_setsockopt" and taking the appropriate lock:
static inline void drbd_tcp_cork(struct socket *sock)
{
struct sock *sk = sock->sk;
lock_sock(sk);
tcp_sk(sk)->nonagle |= TCP_NAGLE_CORK;
release-sock(sk);
}
On the other hand, none of the currently in-tree network block devices or
network filesystems seem to feel the need to try to TCP_CORK their output,
why does drbd?
+#define peer_mask role_mask
+#define pdsk_mask disk_mask
+#define susp_mask 1
+#define user_isp_mask 1
+#define aftr_isp_mask 1
+
+#define NS(T, S) \
+ ({ union drbd_state_t mask; mask.i = 0; mask.T = T##_mask; mask; }), \
+ ({ union drbd_state_t val; val.i = 0; val.T = (S); val; })
+#define NS2(T1, S1, T2, S2) \
+ ({ union drbd_state_t mask; mask.i = 0; mask.T1 = T1##_mask; \
+ mask.T2 = T2##_mask; mask; }), \
+ ({ union drbd_state_t val; val.i = 0; val.T1 = (S1); \
+ val.T2 = (S2); val; })
+#define NS3(T1, S1, T2, S2, T3, S3) \
+ ({ union drbd_state_t mask; mask.i = 0; mask.T1 = T1##_mask; \
+ mask.T2 = T2##_mask; mask.T3 = T3##_mask; mask; }), \
+ ({ union drbd_state_t val; val.i = 0; val.T1 = (S1); \
+ val.T2 = (S2); val.T3 = (S3); val; })
+
+#define _NS(D, T, S) \
+ D, ({ union drbd_state_t ns; ns.i = D->state.i; ns.T = (S); ns; })
+#define _NS2(D, T1, S1, T2, S2) \
+ D, ({ union drbd_state_t ns; ns.i = D->state.i; ns.T1 = (S1); \
+ ns.T2 = (S2); ns; })
+#define _NS3(D, T1, S1, T2, S2, T3, S3) \
+ D, ({ union drbd_state_t ns; ns.i = D->state.i; ns.T1 = (S1); \
+ ns.T2 = (S2); ns.T3 = (S3); ns; })
Grumble. When I earlier said I thought I was in macro hell, well, I was
wrong. *THIS* is macro hell. What the fsck is that supposed to do? And it
doesn't even include a *SINGLE* comment!!!
+static inline void drbd_state_lock(struct drbd_conf *mdev)
+{
+ wait_event(mdev->misc_wait,
+ !test_and_set_bit(CLUSTER_ST_CHANGE, &mdev->flags));
+}
Umm, what's wrong with a mutex here?
+static inline int semaphore_is_locked(struct semaphore *s)
+{
+ if (!down_trylock(s)) {
+ up(s);
+ return 0;
+ }
+ return 1;
+}
This would be better implemented in the mutex/semaphore generic code, instead
of stuffed in a network blockdev. On the other hand, since the only user is
this:
+ D_ASSERT(semaphore_is_locked(&mdev->md_io_mutex));
Can't you just either get rid of the check or open-code it? It would also
give you a chance to substitute D_ASSERT(foo) for a proper BUG_ON(!foo) :-D
+static inline void
+_drbd_queue_work(struct drbd_work_queue *q, struct drbd_work *w)
[...]
Yeah, this stuff should really be fixed to use generic workqueues or
something.
+#define ERR_IF_CNT_IS_NEGATIVE(which) \
+ if (atomic_read(&mdev->which) < 0) \
+ ERR("in %s:%d: " #which " = %d < 0 !\n", \
+ __func__ , __LINE__ , \
+ atomic_read(&mdev->which))
Another macro with an implicit parameter (mdev). Please fix all of these
+#define dec_unacked(mdev) do { \
+ typecheck(struct drbd_conf *, mdev); \
+ atomic_dec(&mdev->unacked_cnt); \
+ ERR_IF_CNT_IS_NEGATIVE(unacked_cnt); } while (0)
[...]
More macros that should be inline functions.
+ if (!have_net_conf) dec_net(mdev);
There's lots of coding-style violations like these in there. You should put
the "dec_net(mdev);" indented on the line *AFTER* the if statememt. Please
read linux/Documentation/CodingStyle for full details.
+ int mxb = 1000000; /* arbitrary limit on open requests */
Arbitrary limits are usually bad. Derive it from system properties, make it
user-configurable, etc.
+/* I'd like to use wait_event_lock_irq,
+ * but I'm not sure when it got introduced,
+ * and not sure when it has 3 or 4 arguments */
+static inline void inc_ap_bio(struct drbd_conf *mdev)
[...]
+ spin_lock_irq(&mdev->req_lock);
+ while (!__inc_ap_bio_cond(mdev)) {
+ prepare_to_wait(&mdev->misc_wait, &wait, TASK_UNINTERRUPTIBLE);
+ spin_unlock_irq(&mdev->req_lock);
+ schedule();
+ finish_wait(&mdev->misc_wait, &wait);
+ spin_lock_irq(&mdev->req_lock);
+ }
+ spin_unlock_irq(&mdev->req_lock);
Since you're submitting for upstream inclusion you can just delete the extra
backwards-compatible opencoded stuff and use the generic helper.
+#define ARRY_SIZE(A) (sizeof(A)/sizeof(A[0]))
There's a generic ARRAY_SIZE(foo) somewhere in the kernel sources. Use that
one please.
+/* this is developers aid only! */
+#define PARANOIA_ENTRY() BUG_ON(test_and_set_bit(__LC_PARANOIA, &lc->flags))
+#define PARANOIA_LEAVE() do { clear_bit(__LC_PARANOIA, &lc->flags); smp_mb__after_clear_bit(); } while (0)
+#define RETURN(x...) do { PARANOIA_LEAVE(); return x ; } while (0)
What is the PARANOIA flag intended to verify? This needs better commenting
or removal. Don't use macros which take implicit arguments (IE: the lc
value is used in the macro without it being passed as a parameter). Since
it's debugging code you could also replace the open-coded bit trylock with a
blocking spinlock. If something hangs, just turn on lockdep and it will
report *MUCH* more useful information about the deadlock. Macros which do a
"return" are discouraged, but since this one is called RETURN() it might be
OK. On the other hand, you should probably rename it as PARANOIA_RETURN().
+int _drbd_md_sync_page_io(struct drbd_conf *mdev,
+ struct drbd_backing_dev *bdev,
+ struct page *page, sector_t sector,
+ int rw, int size)
+{
[...]
+ ok = (bio_add_page(bio, page, size, 0) == size);
+ if (!ok) goto out;
Ewwww, can somebody say "CodingStyle"? Don't put an if() and its result
on a single line, and just use a straight if instead:
> if (bio_add_page(bio, page, size, 0) != size)
> goto out;
+ mask = ( hardsect / MD_HARDSECT ) - 1;
+ D_ASSERT( mask == 1 || mask == 3 || mask == 7 );
+ D_ASSERT( hardsect == (mask+1) * MD_HARDSECT );
Problematic spaces and parentheses. Again, see Documentation/CodingStyle
for the preferred forms. On the other hand, sometimes little code-style
things like that are left to the maintainer if they really strongly prefer
it one particular way.
next prev parent reply other threads:[~2007-07-23 1:38 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-21 20:38 [DRIVER SUBMISSION] DRBD wants to go mainline Lars Ellenberg
2007-07-21 21:17 ` Jan Engelhardt
2007-07-21 22:43 ` Lars Ellenberg
2007-07-22 9:06 ` Jan Engelhardt
2007-07-22 14:03 ` Lars Ellenberg
2007-07-27 18:46 ` Pavel Machek
2007-07-30 19:35 ` Lars Ellenberg
2007-07-30 19:41 ` Jan Engelhardt
2007-07-21 21:34 ` Jesper Juhl
2007-07-21 23:50 ` Andi Kleen
2007-07-22 5:52 ` Jens Axboe
2007-07-22 13:58 ` Lars Ellenberg
2007-07-22 14:49 ` Sam Ravnborg
2007-07-22 14:56 ` Andi Kleen
2007-07-22 15:31 ` Satyam Sharma
2007-07-22 15:50 ` Satyam Sharma
2007-07-22 16:13 ` Stefan Richter
2007-07-22 6:09 ` Lars Ellenberg
2007-07-22 8:52 ` Sam Ravnborg
2007-07-22 9:05 ` Pekka Enberg
2007-07-22 9:00 ` Pekka Enberg
2007-07-23 1:32 ` Kyle Moffett [this message]
2007-07-23 8:49 ` Christoph Hellwig
2007-07-23 9:00 ` Sam Ravnborg
2007-07-23 9:01 ` Christoph Hellwig
2007-07-23 9:19 ` Sam Ravnborg
2007-07-23 11:08 ` Lars Ellenberg
2007-07-23 13:32 ` Lars Ellenberg
2007-07-23 13:37 ` Jens Axboe
2007-07-23 21:13 ` Lars Ellenberg
2007-07-23 13:40 ` Satyam Sharma
2007-07-23 21:19 ` Lars Ellenberg
2007-07-24 7:36 ` Jens Axboe
2007-07-24 23:11 ` Satyam Sharma
2007-07-25 9:46 ` Lars Ellenberg
2007-07-25 12:12 ` Satyam Sharma
2007-07-26 2:03 ` david
2007-07-26 3:43 ` Kyle Moffett
2007-07-26 9:17 ` Evgeniy Polyakov
2007-07-24 0:48 ` Kyle Moffett
2007-07-23 20:59 ` Jesper Juhl
-- strict thread matches above, loose matches on Subject: below --
2007-07-22 9:54 Tomasz Chmielewski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070722213202.1f5d1cab.mrlinuxman@mac.com \
--to=mrlinuxman@mac.com \
--cc=akpm@osdl.org \
--cc=axboe@kernel.dk \
--cc=drbd-user@linbit.com \
--cc=lars@linbit.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.