* Current NBD 'stuff'
@ 2001-12-03 18:02 Edward Muller
  2001-12-04 22:26 ` Paul Clements
  2001-12-06 12:54 ` Pavel Machek
  0 siblings, 2 replies; 11+ messages in thread
From: Edward Muller @ 2001-12-03 18:02 UTC
  To: linux-kernel

Not 100% kernel related ... but ...

Anyone know where I can find the latest NBD stuff? Esp. client/server
code?

I looked at Pavel's website and the nbd.14.tar.gz files are from '98 and
'99. Is there anything newer that works with the nbd kernel module?

Oh ... And one more question what's the best 2.4.X kernel to use with
nbd?

-- 
-------------------------------
Edward Muller
Director of IS
973-715-0230 (cell)
212-487-9064 x115 (NYC)
http://www.learningpatterns.com
-------------------------------
* Re: Current NBD 'stuff'
  2001-12-03 18:02 Current NBD 'stuff' Edward Muller
@ 2001-12-04 22:26 ` Paul Clements
  2001-12-04 23:12   ` Edward Muller
  2001-12-06 13:02   ` Pavel Machek
  2001-12-06 12:54 ` Pavel Machek
  1 sibling, 2 replies; 11+ messages in thread
From: Paul Clements @ 2001-12-04 22:26 UTC
  To: Edward Muller; +Cc: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 762 bytes --]

On 3 Dec 2001, Edward Muller wrote:

> Anyone know where I can find the latest NBD stuff? Esp. client/server
> code?

I have the same question. Maybe the user-level stuff is not being
actively maintained?

However, since we couldn't find current versions of this stuff,
my colleagues and I patched nbd-server and the nbd kernel module
to fix a few bugs and to make them a little more robust. I'll
attach my versions of those files (which I think are derived from
Pavel's .14.tar.gz versions).

> Oh ... And one more question what's the best 2.4.X kernel to use with
> nbd?

You'll want at least 2.4.4 (you'll probably want later than that for
other reasons anyway) -- I think before that NBD was badly broken.

--
Paul Clements
Paul.Clements@SteelEye.com

[-- Attachment #2: Type: TEXT/PLAIN, Size: 4466 bytes --]

Index: linux/2.4/drivers/block/nbd.c
diff -u linux/2.4/drivers/block/nbd.c:1.1.1.9 linux/2.4/drivers/block/nbd.c:1.1.1.9.4.3
--- linux/2.4/drivers/block/nbd.c:1.1.1.9 Fri Jun 29 16:31:25 2001
+++ linux/2.4/drivers/block/nbd.c Tue Oct 2 13:34:03 2001
@@ -91,17 +91,18 @@
     int result;
     struct msghdr msg;
     struct iovec iov;
-    unsigned long flags;
-    sigset_t oldset;
+    //unsigned long flags;
+    //sigset_t oldset;
 
     oldfs = get_fs();
     set_fs(get_ds());
 
-    spin_lock_irqsave(&current->sigmask_lock, flags);
-    oldset = current->blocked;
-    sigfillset(&current->blocked);
-    recalc_sigpending(current);
-    spin_unlock_irqrestore(&current->sigmask_lock, flags);
+    // JEJB: Allow signal interception
+    //spin_lock_irqsave(&current->sigmask_lock, flags);
+    //oldset = current->blocked;
+    //sigfillset(&current->blocked);
+    //recalc_sigpending(current);
+    //spin_unlock_irqrestore(&current->sigmask_lock, flags);
 
 
     do {
@@ -122,6 +123,13 @@
         else
             result = sock_recvmsg(sock, &msg, size, 0);
 
+        // JEJB: Detect signal issue here
+        if(signal_pending(current)) {
+            printk(KERN_WARNING "NBD caught signal\n");
+            result = -EINTR;
+            break;
+        }
+
         if (result <= 0) {
 #ifdef PARANOIA
             printk(KERN_ERR "NBD: %s - sock=%ld at buf=%ld, size=%d returned %d.\n",
@@ -133,10 +141,11 @@
         buf += result;
     } while (size > 0);
 
-    spin_lock_irqsave(&current->sigmask_lock, flags);
-    current->blocked = oldset;
-    recalc_sigpending(current);
-    spin_unlock_irqrestore(&current->sigmask_lock, flags);
+    //JEJB: didn't modify signal mask, so no need to restore it
+    //spin_lock_irqsave(&current->sigmask_lock, flags);
+    //current->blocked = oldset;
+    //recalc_sigpending(current);
+    //spin_unlock_irqrestore(&current->sigmask_lock, flags);
 
     set_fs(oldfs);
     return result;
@@ -333,8 +342,27 @@
         spin_unlock_irq(&io_request_lock);
 
         down (&lo->queue_lock);
+        if(!lo->file) {
+            up(&lo->queue_lock);
+            spin_lock_irq(&io_request_lock);
+            printk(KERN_ERR "NBD: FAIL BETWEEN ACCEPT AND SEMAPHORE, FILE LOST\n");
+            req->errors++;
+            nbd_end_request(req);
+            continue;
+        }
+
         list_add(&req->queue, &lo->queue_head);
         nbd_send_req(lo->sock, req); /* Why does this block? */
+        if(req->errors) {
+            printk(KERN_ERR "NBD: NBD_SEND_REQ FAILED\n");
+            list_del(&req->queue);
+
+            up(&lo->queue_lock);
+            spin_lock_irq(&io_request_lock);
+            nbd_end_request(req);
+
+            continue;
+        }
         up (&lo->queue_lock);
 
         spin_lock_irq(&io_request_lock);
@@ -384,12 +412,14 @@
             printk(KERN_ERR "nbd: Some requests are in progress -> can not turn off.\n");
             return -EBUSY;
         }
-        up(&lo->queue_lock);
         file = lo->file;
-        if (!file)
+        if (!file) {
+            up(&lo->queue_lock);
             return -EINVAL;
+        }
         lo->file = NULL;
         lo->sock = NULL;
+        up(&lo->queue_lock);
         fput(file);
         return 0;
     case NBD_SET_SOCK:
@@ -430,9 +460,29 @@
         if (!lo->file)
             return -EINVAL;
         nbd_do_it(lo);
+        /* on return tidy up in case we have a signal */
+        printk(KERN_WARNING "NBD: nbd_do_it returned\n");
+        /* Forcibly shutdown the socket causing all listeners
+         * to error
+         *
+         * FIXME: This code is duplicated from sys_shutdown, but
+         * there should be a more generic interface rather than
+         * calling socket ops directly here */
+        lo->sock->ops->shutdown(lo->sock, 2);
+        down(&lo->queue_lock);
+        printk(KERN_WARNING "NBD: lock acquired\n");
+        nbd_clear_que(lo);
+        file = lo->file;
+        lo->file = NULL;
+        lo->sock = NULL;
+        up(&lo->queue_lock);
+        if(file)
+            fput(file);
         return lo->harderror;
     case NBD_CLEAR_QUE:
+        down(&lo->queue_lock);
         nbd_clear_que(lo);
+        up(&lo->queue_lock);
         return 0;
 #ifdef PARANOIA
     case NBD_PRINT_DEBUG:
@@ -492,7 +542,7 @@
         return -EIO;
     }
 #ifdef MODULE
-    printk("nbd: registered device at major %d\n", MAJOR_NR);
+    printk("nbd: (version Steeleye-8) registered device at major %d\n", MAJOR_NR);
 #endif
     blksize_size[MAJOR_NR] = nbd_blksizes;
     blk_size[MAJOR_NR] = nbd_sizes;
@@ -507,7 +557,7 @@
         init_MUTEX(&nbd_dev[i].queue_lock);
         nbd_blksizes[i] = 1024;
         nbd_blksize_bits[i] = 10;
-        nbd_bytesizes[i] = 0x7ffffc00; /* 2GB */
+        nbd_bytesizes[i] = ((u64)0x7ffffc00) << 10; /* 2TB */
         nbd_sizes[i] = nbd_bytesizes[i] >> BLOCK_SIZE_BITS;
         register_disk(NULL, MKDEV(MAJOR_NR,i), 1, &nbd_fops,
                       nbd_bytesizes[i]>>9);

[-- Attachment #3: Type: TEXT/PLAIN, Size: 9229 bytes --]

/*
 * Network Block Device - server
 *
 * Copyright 1996-1998 Pavel Machek, distribute under GPL
 * <pavel@atrey.karlin.mff.cuni.cz>
 *
 * Version 1.0 - hopefully 64-bit-clean
 * Version 1.1 - merging enhancements from Josh Parsons, <josh@coombs.anu.edu.au>
 * Version 1.2 - autodetect size of block devices, thanx to Peter T. Breuer" <ptb@it.uc3m.es>
 */

#define VERSION "1.3"
#define GIGA (1*1024*1024*1024)

/* use lseek64 exclusively */
#define _LARGEFILE_SOURCE // Some more functions for correct standard I/O.
#define _LARGEFILE64_SOURCE // Additional functionality from LFS for large files

#include <sys/socket.h>
#include <sys/stat.h>
#include <netinet/tcp.h>
#include <netinet/in.h> /* sockaddr_in, htons, in_addr */
#include <netdb.h> /* hostent, gethostby*, getservby* */
#include <syslog.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <arpa/inet.h>

#define _IO(a,b)
#define ISSERVER
#define MY_NAME "nbd_server"
#include "cliserv.h"
#undef _IO
/* Deep magic: ioctl.h defines _IO macro (at least on linux) */

#include <sys/ioctl.h>
#include <sys/mount.h> /* For BLKGETSIZE */

// #define DODBG
// #define DEBUG( a... ) printf( a )
#define DEBUG( a... ) do { } while(0)


inline void readit(int f, void *buf, int len)
{
    int res;
    while (len > 0) {
        DEBUG("*");
        if ((res = read(f, buf, len)) <= 0)
            err("Read failed: %m");
        len -= res;
        buf += res;
    }
}

inline void writeit(int f, void *buf, int len)
{
    int res;
    while (len > 0) {
        DEBUG("+");
        if ((res = write(f, buf, len)) <= 0)
            err("Write failed: %m");
        len -= res;
        buf += res;
    }
}

int port; /* Port I'm listening at */
char *exportname; /* File I'm exporting */
u64 exportsize = ~0, hunksize = ~0; /* ...and its length */
int flags = 0;
int export[1024];
#define F_READONLY 1
#define F_MULTIFILE 2

void cmdline(int argc, char *argv[])
{
    int i;

    if (argc < 3) {
        printf("This is nbd-server version " VERSION "\n");
        printf("Usage: port file_to_export [size][kKmM] [-r]\n"
               " -r read only\n"
               " if port is set to 0, stdin is used (for running from inetd)\n"
               " if file_to_export contains '%%s', it is substituted with IP\n"
               " address of machine trying to connect\n" );
        exit(0);
    }
    port = atoi(argv[1]);
    for (i = 3; i < argc; i++) {
        if (*argv[i] == '-') {
            switch (argv[i][1]) {
            case 'r':
                flags |= F_READONLY;
                break;
            case 'm':
                flags |= F_MULTIFILE;
                hunksize = 1*GIGA;
                break;
            }
        } else {
            u64 es;
            int last = strlen(argv[i])-1;
            char suffix = argv[i][last];
            if (suffix == 'k' || suffix == 'K' ||
                suffix == 'm' || suffix == 'M')
                argv[i][last] = '\0';
            es = (u64)atol(argv[i]);
            switch (suffix) {
            case 'm':
            case 'M': es <<= 10;
            case 'k':
            case 'K': es <<= 10;
            default : break;
            }
            exportsize = es;
        }
    }

    exportname = argv[2];
}

int connectme(int port)
{
    struct sockaddr_in addrin;
    int addrinlen = sizeof(addrin);
    int net, sock;
    int size = 1;

    if ((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
        err("socket: %m");
    // SteelEye change - reuse the port number
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &size, sizeof(int)) < 0)
        err("setsockopt: %m");

    DEBUG("Waiting for connections... bind, ");
    addrin.sin_family = AF_INET;
    addrin.sin_port = htons(port);
    addrin.sin_addr.s_addr = 0;
    if (bind(sock, (struct sockaddr *) &addrin, addrinlen) < 0)
        err("bind: %m");
    DEBUG("listen, ");
    if (listen(sock, 1) < 0)
        err("listen: %m");
    DEBUG("accept, ");
    if ((net = accept(sock, (struct sockaddr *) &addrin, &addrinlen)) < 0)
        err("accept: %m");

    return net;
}

#define SEND writeit( net, &reply, sizeof( reply ));
#define ERROR { reply.error = htonl(-1); SEND; reply.error = 0; lastpoint = -1; }

u64 lastpoint = -1;

void maybeseek(int handle, u64 a)
{
    if (a > exportsize)
        err("Can not happen\n");
    if (lastpoint != a) {
        if (lseek64(handle, a, SEEK_SET) < 0)
            err("Can not seek locally!\n");
        lastpoint = a;
    } else {
        DEBUG("@");
    }
}

int expread(u64 a, char *buf, int len)
{
    maybeseek(export[a/hunksize], a%hunksize);
    return (read(export[a/hunksize], buf, len) != len);
}

int expwrite(u64 a, char *buf, int len)
{
    maybeseek(export[a/hunksize], a%hunksize);
    return (write(export[a/hunksize], buf, len) != len);
}

int mainloop(int net)
{
    struct nbd_request request;
    struct nbd_reply reply;
    char zeros[300];
    int i = 0;
    u64 size_host;

    bzero(zeros, 290);
    if (write(net, INIT_PASSWD, 8) < 0)
        err("Negotiation failed: %m");
    cliserv_magic = htonll(cliserv_magic);
    if (write(net, &cliserv_magic, sizeof(cliserv_magic)) < 0)
        err("Negotiation failed: %m");
    size_host = htonll(exportsize);
    if (write(net, &size_host, 8) < 0)
        err("Negotiation failed: %m");
    if (write(net, zeros, 128) < 0)
        err("Negotiation failed: %m");

    DEBUG("Entering request loop!\n");
    reply.magic = htonl(NBD_REPLY_MAGIC);
    reply.error = 0;
    while (1) {
        /* SteelEye change - need dynamic buffer to work with elevator */
        static long max_nbd_request=131072; /* 128K */
        static char *buf=NULL;
        int len;

#ifdef DODBG
        i++;
        printf("%d: ", i);
#endif

        readit(net, &request, sizeof(request));
        request.from = ntohll(request.from);
        request.type = ntohl(request.type);
        len = ntohl(request.len);

        if (request.magic != htonl(NBD_REQUEST_MAGIC))
            err("Not enough magic.");

        DEBUG("request len: %d bytes\n", len);

        while (len > max_nbd_request || !buf) {
            /* SteelEye change - (re)allocate the buffer */
            if (buf)
                free(buf);
            if (len > max_nbd_request)
                max_nbd_request = len;
            buf=malloc(max_nbd_request);
            if (!buf)
                DEBUG("failed to allocate %d byte buffer\n", max_nbd_request);
        }
#ifdef DODBG
        printf("%s from %d (%d) len %d, ", (request.type ? "WRITE" : "READ"),
               (int) request.from, (int) request.from / 512, len);
#endif
        memcpy(reply.handle, request.handle, sizeof(reply.handle));
        if (((request.from + len) > exportsize) ||
            ((flags & F_READONLY) && request.type)) {
            DEBUG("[RANGE!]");
            ERROR;
            continue;
        }
        if (request.type) { /* WRITE */
            DEBUG("wr: net->buf, ");
            readit(net, buf, len);
            DEBUG("buf->exp, ");
            if (expwrite(request.from, buf, len)) {
                DEBUG("Write failed: %m" );
                ERROR;
                continue;
            }
            lastpoint += len;
            SEND;
            continue;
        }
        /* READ */

        DEBUG("exp->buf, ");
        if (expread(request.from, buf + sizeof(struct nbd_reply), len)) {
            lastpoint = -1;
            DEBUG("Read failed: %m");
            ERROR;
            continue;
        }
        lastpoint += len;

        DEBUG("buf->net, ");
        memcpy(buf, &reply, sizeof(struct nbd_reply));
        writeit(net, buf, len + sizeof(struct nbd_reply));
        DEBUG("OK!\n");
    }
}

char exportname2[1024];

void set_peername(int net)
{
    struct sockaddr_in addrin;
    int addrinlen = sizeof( addrin );
    char *peername;

    if (getpeername( net, (struct sockaddr *) &addrin, &addrinlen ) < 0)
        err("getsockname failed: %m");
    peername = inet_ntoa(addrin.sin_addr);
    sprintf(exportname2, exportname, peername);

    syslog(LOG_INFO, "connect from %s, assigned file is %s", peername, exportname2);
}

u64 size_autodetect(int export)
{
    u64 es;
    DEBUG("looking for export size with lseek SEEK_END\n");
    if ((es = lseek64(export, 0, SEEK_END)) == MINUS_ONE_64 || es == 0) {
        struct stat stat_buf = { 0, } ;
        int error;
        DEBUG("looking for export size with fstat\n");
        if ((error = fstat(export, &stat_buf)) == -1 || stat_buf.st_size == 0 ) {
            DEBUG("looking for export size with ioctl BLKGETSIZE\n");
#ifdef BLKGETSIZE
            if(ioctl(export, BLKGETSIZE, &es) || es == 0) {
#else
            if(1){
#endif
                err("Could not find size of exported block device: %m");
            } else {
                es *= 512; /* assume blocksize 512 */
            }
        } else {
            es = stat_buf.st_size;
        }
    }
    return es;
}

int main(int argc, char *argv[])
{
    int net;
    u64 i;

    if (sizeof( struct nbd_request )!=28)
        err("Bad size of structure. Alignment problems?");

    logging();
    cmdline(argc, argv);

    if (port)
        net = connectme(port);
    else
        net = 0;
    set_peername(net);

    for (i=0; i<exportsize; i+=hunksize) {
        char exportname3[1024];

        sprintf(exportname3, exportname2, i/hunksize);
        printf( "Opening %s\n", exportname3 );
        if ((export[i/hunksize] = open(exportname3, (flags & F_READONLY) ? O_RDONLY : O_RDWR)) == -1)
            err("Could not open exported file: %m");
    }

    if (exportsize == ~0) {
        exportsize = size_autodetect(export[0]);
    }
    if (exportsize > (~0UL >> 1))
        if ((exportsize >> 10) > (~0UL >> 1))
            syslog(LOG_INFO, "size of exported file/device is %luMB",
                   (unsigned long)(exportsize >> 20));
        else
            syslog(LOG_INFO, "size of exported file/device is %luKB",
                   (unsigned long)(exportsize >> 10));
    else
        syslog(LOG_INFO, "size of exported file/device is %lu",
               (unsigned long)exportsize);
    setmysockopt(net);

    mainloop(net);
    return 0;
}

[-- Attachment #4: Type: TEXT/PLAIN, Size: 2417 bytes --]

/* This header file is shared by client & server. They really have
 * something to share...
 * */

/* Client/server protocol is as follows:
   Send INIT_PASSWD
   Send 64-bit cliserv_magic
   Send 64-bit size of exported device
   Send 128 bytes of zeros (reserved for future use)
 */

#include "config.h"
#include <errno.h>
#include <string.h>
#include <netdb.h>

#if SIZEOF_UNSIGNED_SHORT_INT==4
typedef unsigned short u32;
#elif SIZEOF_UNSIGNED_INT==4
typedef unsigned int u32;
#elif SIZEOF_UNSIGNED_LONG_INT==4
typedef unsigned long u32;
#else
#error I need at least some 32-bit type
#endif

#if SIZEOF_UNSIGNED_INT==8
typedef unsigned int u64;
#define MINUS_ONE_64 (-1U)
#elif SIZEOF_UNSIGNED_LONG_INT==8
typedef unsigned long u64;
#define MINUS_ONE_64 (-1UL)
#elif SIZEOF_UNSIGNED_LONG_LONG_INT==8
typedef unsigned long long u64;
#define MINUS_ONE_64 (-1ULL)
#else
#error I need at least some 64-bit type
#endif

#include "nbd.h"

u64 cliserv_magic = 0x00420281861253LL;
#define INIT_PASSWD "NBDMAGIC"

#define INFO(a) do { } while(0)

void setmysockopt(int sock)
{
    int size = 1;
#if 0
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(int)) < 0)
        INFO("(no sockopt/1: %m)");
#endif
    size = 1;
    if (setsockopt(sock, SOL_TCP, TCP_NODELAY, &size, sizeof(int)) < 0)
        INFO("(no sockopt/2: %m)");
#if 0
    size = 1024;
    if (setsockopt(sock, SOL_TCP, TCP_MAXSEG, &size, sizeof(int)) < 0)
        INFO("(no sockopt/3: %m)");
#endif
}

void err(const char *s)
{
    const int maxlen = 150;
    char s1[maxlen], *s2;
    int n = 0;

    strncpy(s1, s, maxlen);
    if (s2 = strstr(s, "%m")) {
        strcpy(s1 + (s2 - s), sys_errlist[errno]);
        s2 += 2;
        strcpy(s1 + strlen(s1), s2);
    }
    else if (s2 = strstr(s, "%h")) {
        strcpy(s1 + (s2 - s), hstrerror(h_errno));
        s2 += 2;
        strcpy(s1 + strlen(s1), s2);
    }

    s1[maxlen-1] = '\0';
#ifdef ISSERVER
    syslog(LOG_ERR, s1);
#else
    fprintf(stderr, "Error: %s\n", s1);
#endif
    exit(1);
}

void logging(void)
{
#ifdef ISSERVER
    openlog(MY_NAME, LOG_PID, LOG_DAEMON);
#endif
    setvbuf(stdout, NULL, _IONBF, 0);
    setvbuf(stderr, NULL, _IONBF, 0);
}

#ifdef WORDS_BIGENDIAN
u64 ntohll(u64 a)
{
    return a;
}
#else
u64 ntohll(u64 a)
{
    u32 lo = a & 0xffffffff;
    u32 hi = a >> 32U;
    lo = ntohl(lo);
    hi = ntohl(hi);
    return ((u64) lo) << 32U | hi;
}
#endif
#define htonll ntohll
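
A minimal sketch of the negotiation that the cliserv.h comment describes and that mainloop() writes to the socket: 8 bytes of "NBDMAGIC", the 64-bit cliserv_magic, the 64-bit export size, then 128 reserved zero bytes, 152 bytes in total. Only the field order, the "NBDMAGIC" string and the magic value are taken from the attachment. The names NBD_HELLO_LEN, put_be64, get_be64 and nbd_build_hello are hypothetical helpers invented for illustration, and the byte-by-byte big-endian packing is an assumed stand-in for the attachment's htonll(), chosen so the sketch does not depend on config.h or host endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constant: 8 (password) + 8 (magic) + 8 (size) + 128 (reserved). */
#define NBD_HELLO_LEN (8 + 8 + 8 + 128)

/* Magic value as it appears in the posted cliserv.h. */
#define NBD_CLISERV_MAGIC 0x00420281861253ULL

/* Store v at p in big-endian (network) order, whatever the host byte order. */
void put_be64(unsigned char *p, uint64_t v)
{
    int i;

    for (i = 7; i >= 0; i--) {
        p[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Read a big-endian 64-bit value back, as a client would after recv(). */
uint64_t get_be64(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Fill out[] with the negotiation header for an export of export_size bytes. */
void nbd_build_hello(unsigned char out[NBD_HELLO_LEN], uint64_t export_size)
{
    assert(out != NULL);
    memset(out, 0, NBD_HELLO_LEN);          /* also covers the 128 reserved bytes */
    memcpy(out, "NBDMAGIC", 8);             /* INIT_PASSWD, no trailing NUL */
    put_be64(out + 8, NBD_CLISERV_MAGIC);   /* 64-bit magic */
    put_be64(out + 16, export_size);        /* 64-bit size of exported device */
}
```

A caller would declare `unsigned char hello[NBD_HELLO_LEN]`, call `nbd_build_hello(hello, size)` and hand the buffer to a write loop such as writeit(). Building the header in one buffer is a design choice of this sketch; the posted server issues four separate write() calls instead.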
* Re: Current NBD 'stuff'
  2001-12-04 22:26 ` Paul Clements
@ 2001-12-04 23:12   ` Edward Muller
  2001-12-05 16:14     ` Paul Clements
  2001-12-06 13:02   ` Pavel Machek
  1 sibling, 1 reply; 11+ messages in thread
From: Edward Muller @ 2001-12-04 23:12 UTC
  To: Paul.Clements; +Cc: linux-kernel

Actually I am playing with ENBD now.

I think ENBD is targeted for inclusion in the kernel in 2.5, but it can
be found separately at http://www.it.uc3m.es/~ptb/nbd/

It looks much better than the nbd stuff that is currently in the kernel.
But that's mostly because Pavel doesn't have much time at the moment for
it AFAIK.

On Tue, 2001-12-04 at 17:26, Paul Clements wrote:
> On 3 Dec 2001, Edward Muller wrote:
>
> > Anyone know where I can find the latest NBD stuff? Esp. client/server
> > code?
>
> I have the same question. Maybe the user-level stuff is not being
> actively maintained?
>
> However, since we couldn't find current versions of this stuff,
> my colleagues and I patched nbd-server and the nbd kernel module
> to fix a few bugs and to make them a little more robust. I'll
> attach my versions of those files (which I think are derived from
> Pavel's .14.tar.gz versions).
>
>
> > Oh ... And one more question what's the best 2.4.X kernel to use with
> > nbd?
>
> You'll want at least 2.4.4 (you'll probably want later than that for
> other reasons anyway) -- I think before that NBD was badly broken.
> > -- > Paul Clements > Paul.Clements@SteelEye.com > > ---- > > Index: linux/2.4/drivers/block/nbd.c > diff -u linux/2.4/drivers/block/nbd.c:1.1.1.9 linux/2.4/drivers/block/nbd.c:1.1.1.9.4.3 > --- linux/2.4/drivers/block/nbd.c:1.1.1.9 Fri Jun 29 16:31:25 2001 > +++ linux/2.4/drivers/block/nbd.c Tue Oct 2 13:34:03 2001 > @@ -91,17 +91,18 @@ > int result; > struct msghdr msg; > struct iovec iov; > - unsigned long flags; > - sigset_t oldset; > + //unsigned long flags; > + //sigset_t oldset; > > oldfs = get_fs(); > set_fs(get_ds()); > > - spin_lock_irqsave(¤t->sigmask_lock, flags); > - oldset = current->blocked; > - sigfillset(¤t->blocked); > - recalc_sigpending(current); > - spin_unlock_irqrestore(¤t->sigmask_lock, flags); > + // JEJB: Allow signal interception > + //spin_lock_irqsave(¤t->sigmask_lock, flags); > + //oldset = current->blocked; > + //sigfillset(¤t->blocked); > + //recalc_sigpending(current); > + //spin_unlock_irqrestore(¤t->sigmask_lock, flags); > > > do { > @@ -122,6 +123,13 @@ > else > result = sock_recvmsg(sock, &msg, size, 0); > > + // JEJB: Detect signal issue here > + if(signal_pending(current)) { > + printk(KERN_WARNING "NBD caught signal\n"); > + result = -EINTR; > + break; > + } > + > if (result <= 0) { > #ifdef PARANOIA > printk(KERN_ERR "NBD: %s - sock=%ld at buf=%ld, size=%d returned %d.\n", > @@ -133,10 +141,11 @@ > buf += result; > } while (size > 0); > > - spin_lock_irqsave(¤t->sigmask_lock, flags); > - current->blocked = oldset; > - recalc_sigpending(current); > - spin_unlock_irqrestore(¤t->sigmask_lock, flags); > + //JEJB: didn't modify signal mask, so no need to restore it > + //spin_lock_irqsave(¤t->sigmask_lock, flags); > + //current->blocked = oldset; > + //recalc_sigpending(current); > + //spin_unlock_irqrestore(¤t->sigmask_lock, flags); > > set_fs(oldfs); > return result; > @@ -333,8 +342,27 @@ > spin_unlock_irq(&io_request_lock); > > down (&lo->queue_lock); > + if(!lo->file) { > + up(&lo->queue_lock); > + 
spin_lock_irq(&io_request_lock); > + printk(KERN_ERR "NBD: FAIL BETWEEN ACCEPT AND SEMAPHORE, FILE LOST\n"); > + req->errors++; > + nbd_end_request(req); > + continue; > + } > + > list_add(&req->queue, &lo->queue_head); > nbd_send_req(lo->sock, req); /* Why does this block? */ > + if(req->errors) { > + printk(KERN_ERR "NBD: NBD_SEND_REQ FAILED\n"); > + list_del(&req->queue); > + > + up(&lo->queue_lock); > + spin_lock_irq(&io_request_lock); > + nbd_end_request(req); > + > + continue; > + } > up (&lo->queue_lock); > > spin_lock_irq(&io_request_lock); > @@ -384,12 +412,14 @@ > printk(KERN_ERR "nbd: Some requests are in progress -> can not turn off.\n"); > return -EBUSY; > } > - up(&lo->queue_lock); > file = lo->file; > - if (!file) > + if (!file) { > + up(&lo->queue_lock); > return -EINVAL; > + } > lo->file = NULL; > lo->sock = NULL; > + up(&lo->queue_lock); > fput(file); > return 0; > case NBD_SET_SOCK: > @@ -430,9 +460,29 @@ > if (!lo->file) > return -EINVAL; > nbd_do_it(lo); > + /* on return tidy up in case we have a signal */ > + printk(KERN_WARNING "NBD: nbd_do_it returned\n"); > + /* Forcibly shutdown the socket causing all listeners > + * to error > + * > + * FIXME: This code is duplicated from sys_shutdown, but > + * there should be a more generic interface rather than > + * calling socket ops directly here */ > + lo->sock->ops->shutdown(lo->sock, 2); > + down(&lo->queue_lock); > + printk(KERN_WARNING "NBD: lock acquired\n"); > + nbd_clear_que(lo); > + file = lo->file; > + lo->file = NULL; > + lo->sock = NULL; > + up(&lo->queue_lock); > + if(file) > + fput(file); > return lo->harderror; > case NBD_CLEAR_QUE: > + down(&lo->queue_lock); > nbd_clear_que(lo); > + up(&lo->queue_lock); > return 0; > #ifdef PARANOIA > case NBD_PRINT_DEBUG: > @@ -492,7 +542,7 @@ > return -EIO; > } > #ifdef MODULE > - printk("nbd: registered device at major %d\n", MAJOR_NR); > + printk("nbd: (version Steeleye-8) registered device at major %d\n", MAJOR_NR); > #endif > 
blksize_size[MAJOR_NR] = nbd_blksizes; > blk_size[MAJOR_NR] = nbd_sizes; > @@ -507,7 +557,7 @@ > init_MUTEX(&nbd_dev[i].queue_lock); > nbd_blksizes[i] = 1024; > nbd_blksize_bits[i] = 10; > - nbd_bytesizes[i] = 0x7ffffc00; /* 2GB */ > + nbd_bytesizes[i] = ((u64)0x7ffffc00) << 10; /* 2TB */ > nbd_sizes[i] = nbd_bytesizes[i] >> BLOCK_SIZE_BITS; > register_disk(NULL, MKDEV(MAJOR_NR,i), 1, &nbd_fops, > nbd_bytesizes[i]>>9); > ---- > > /* > * Network Block Device - server > * > * Copyright 1996-1998 Pavel Machek, distribute under GPL > * <pavel@atrey.karlin.mff.cuni.cz> > * > * Version 1.0 - hopefully 64-bit-clean > * Version 1.1 - merging enhancements from Josh Parsons, <josh@coombs.anu.edu.au> > * Version 1.2 - autodetect size of block devices, thanx to Peter T. Breuer" <ptb@it.uc3m.es> > */ > > #define VERSION "1.3" > #define GIGA (1*1024*1024*1024) > > /* use lseek64 exclusively */ > #define _LARGEFILE_SOURCE // Some more functions for correct standard I/O. > #define _LARGEFILE64_SOURCE // Additional functionality from LFS for large files > > #include <sys/socket.h> > #include <sys/stat.h> > #include <netinet/tcp.h> > #include <netinet/in.h> /* sockaddr_in, htons, in_addr */ > #include <netdb.h> /* hostent, gethostby*, getservby* */ > #include <syslog.h> > #include <unistd.h> > #include <stdio.h> > #include <stdlib.h> > #include <string.h> > #include <fcntl.h> > #include <arpa/inet.h> > > #define _IO(a,b) > #define ISSERVER > #define MY_NAME "nbd_server" > #include "cliserv.h" > #undef _IO > /* Deep magic: ioctl.h defines _IO macro (at least on linux) */ > > #include <sys/ioctl.h> > #include <sys/mount.h> /* For BLKGETSIZE */ > > // #define DODBG > // #define DEBUG( a... ) printf( a ) > #define DEBUG( a... 
) do { } while(0) > > > inline void readit(int f, void *buf, int len) > { > int res; > while (len > 0) { > DEBUG("*"); > if ((res = read(f, buf, len)) <= 0) > err("Read failed: %m"); > len -= res; > buf += res; > } > } > > inline void writeit(int f, void *buf, int len) > { > int res; > while (len > 0) { > DEBUG("+"); > if ((res = write(f, buf, len)) <= 0) > err("Write failed: %m"); > len -= res; > buf += res; > } > } > > int port; /* Port I'm listening at */ > char *exportname; /* File I'm exporting */ > u64 exportsize = ~0, hunksize = ~0; /* ...and its length */ > int flags = 0; > int export[1024]; > #define F_READONLY 1 > #define F_MULTIFILE 2 > > void cmdline(int argc, char *argv[]) > { > int i; > > if (argc < 3) { > printf("This is nbd-server version " VERSION "\n"); > printf("Usage: port file_to_export [size][kKmM] [-r]\n" > " -r read only\n" > " if port is set to 0, stdin is used (for running from inetd)\n" > " if file_to_export contains '%%s', it is substituted with IP\n" > " address of machine trying to connect\n" ); > exit(0); > } > port = atoi(argv[1]); > for (i = 3; i < argc; i++) { > if (*argv[i] == '-') { > switch (argv[i][1]) { > case 'r': > flags |= F_READONLY; > break; > case 'm': > flags |= F_MULTIFILE; > hunksize = 1*GIGA; > break; > } > } else { > u64 es; > int last = strlen(argv[i])-1; > char suffix = argv[i][last]; > if (suffix == 'k' || suffix == 'K' || > suffix == 'm' || suffix == 'M') > argv[i][last] = '\0'; > es = (u64)atol(argv[i]); > switch (suffix) { > case 'm': > case 'M': es <<= 10; > case 'k': > case 'K': es <<= 10; > default : break; > } > exportsize = es; > } > } > > exportname = argv[2]; > } > > int connectme(int port) > { > struct sockaddr_in addrin; > int addrinlen = sizeof(addrin); > int net, sock; > int size = 1; > > if ((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) > err("socket: %m"); > // SteelEye change - reuse the port number > if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &size, sizeof(int)) < 0) > 
err("setsockopt: %m"); > > DEBUG("Waiting for connections... bind, "); > addrin.sin_family = AF_INET; > addrin.sin_port = htons(port); > addrin.sin_addr.s_addr = 0; > if (bind(sock, (struct sockaddr *) &addrin, addrinlen) < 0) > err("bind: %m"); > DEBUG("listen, "); > if (listen(sock, 1) < 0) > err("listen: %m"); > DEBUG("accept, "); > if ((net = accept(sock, (struct sockaddr *) &addrin, &addrinlen)) < 0) > err("accept: %m"); > > return net; > } > > #define SEND writeit( net, &reply, sizeof( reply )); > #define ERROR { reply.error = htonl(-1); SEND; reply.error = 0; lastpoint = -1; } > > u64 lastpoint = -1; > > void maybeseek(int handle, u64 a) > { > if (a > exportsize) > err("Can not happen\n"); > if (lastpoint != a) { > if (lseek64(handle, a, SEEK_SET) < 0) > err("Can not seek locally!\n"); > lastpoint = a; > } else { > DEBUG("@"); > } > } > > int expread(u64 a, char *buf, int len) > { > maybeseek(export[a/hunksize], a%hunksize); > return (read(export[a/hunksize], buf, len) != len); > } > > int expwrite(u64 a, char *buf, int len) > { > maybeseek(export[a/hunksize], a%hunksize); > return (write(export[a/hunksize], buf, len) != len); > } > > int mainloop(int net) > { > struct nbd_request request; > struct nbd_reply reply; > char zeros[300]; > int i = 0; > u64 size_host; > > bzero(zeros, 290); > if (write(net, INIT_PASSWD, 8) < 0) > err("Negotiation failed: %m"); > cliserv_magic = htonll(cliserv_magic); > if (write(net, &cliserv_magic, sizeof(cliserv_magic)) < 0) > err("Negotiation failed: %m"); > size_host = htonll(exportsize); > if (write(net, &size_host, 8) < 0) > err("Negotiation failed: %m"); > if (write(net, zeros, 128) < 0) > err("Negotiation failed: %m"); > > DEBUG("Entering request loop!\n"); > reply.magic = htonl(NBD_REPLY_MAGIC); > reply.error = 0; > while (1) { > /* SteelEye change - need dynamic buffer to work with elevator */ > static long max_nbd_request=131072; /* 128K */ > static char *buf=NULL; > int len; > > #ifdef DODBG > i++; > printf("%d: ", 
i); > #endif > > readit(net, &request, sizeof(request)); > request.from = ntohll(request.from); > request.type = ntohl(request.type); > len = ntohl(request.len); > > if (request.magic != htonl(NBD_REQUEST_MAGIC)) > err("Not enough magic."); > > DEBUG("request len: %d bytes\n", len); > > while (len > max_nbd_request || !buf) { > /* SteelEye change - (re)allocate the buffer */ > if (buf) > free(buf); > if (len > max_nbd_request) > max_nbd_request = len; > buf=malloc(max_nbd_request); > if (!buf) > DEBUG("failed to allocate %d byte buffer\n", max_nbd_request); > } > #ifdef DODBG > printf("%s from %d (%d) len %d, ", (request.type ? "WRITE" : "READ"), > (int) request.from, (int) request.from / 512, len); > #endif > memcpy(reply.handle, request.handle, sizeof(reply.handle)); > if (((request.from + len) > exportsize) || > ((flags & F_READONLY) && request.type)) { > DEBUG("[RANGE!]"); > ERROR; > continue; > } > if (request.type) { /* WRITE */ > DEBUG("wr: net->buf, "); > readit(net, buf, len); > DEBUG("buf->exp, "); > if (expwrite(request.from, buf, len)) { > DEBUG("Write failed: %m" ); > ERROR; > continue; > } > lastpoint += len; > SEND; > continue; > } > /* READ */ > > DEBUG("exp->buf, "); > if (expread(request.from, buf + sizeof(struct nbd_reply), len)) { > lastpoint = -1; > DEBUG("Read failed: %m"); > ERROR; > continue; > } > lastpoint += len; > > DEBUG("buf->net, "); > memcpy(buf, &reply, sizeof(struct nbd_reply)); > writeit(net, buf, len + sizeof(struct nbd_reply)); > DEBUG("OK!\n"); > } > } > > char exportname2[1024]; > > void set_peername(int net) > { > struct sockaddr_in addrin; > int addrinlen = sizeof( addrin ); > char *peername; > > if (getpeername( net, (struct sockaddr *) &addrin, &addrinlen ) < 0) > err("getsockname failed: %m"); > peername = inet_ntoa(addrin.sin_addr); > sprintf(exportname2, exportname, peername); > > syslog(LOG_INFO, "connect from %s, assigned file is %s", peername, exportname2); > } > > u64 size_autodetect(int export) > { > u64 es; > 
> 	DEBUG("looking for export size with lseek SEEK_END\n");
> 	if ((es = lseek64(export, 0, SEEK_END)) == MINUS_ONE_64 || es == 0) {
> 		struct stat stat_buf = { 0, } ;
> 		int error;
> 		DEBUG("looking for export size with fstat\n");
> 		if ((error = fstat(export, &stat_buf)) == -1 || stat_buf.st_size == 0 ) {
> 			DEBUG("looking for export size with ioctl BLKGETSIZE\n");
> #ifdef BLKGETSIZE
> 			if(ioctl(export, BLKGETSIZE, &es) || es == 0) {
> #else
> 			if(1){
> #endif
> 				err("Could not find size of exported block device: %m");
> 			} else {
> 				es *= 512; /* assume blocksize 512 */
> 			}
> 		} else {
> 			es = stat_buf.st_size;
> 		}
> 	}
> 	return es;
> }
>
> int main(int argc, char *argv[])
> {
> 	int net;
> 	u64 i;
>
> 	if (sizeof( struct nbd_request )!=28)
> 		err("Bad size of structure. Alignment problems?");
>
> 	logging();
> 	cmdline(argc, argv);
>
> 	if (port)
> 		net = connectme(port);
> 	else
> 		net = 0;
> 	set_peername(net);
>
> 	for (i=0; i<exportsize; i+=hunksize) {
> 		char exportname3[1024];
>
> 		sprintf(exportname3, exportname2, i/hunksize);
> 		printf( "Opening %s\n", exportname3 );
> 		if ((export[i/hunksize] = open(exportname3, (flags & F_READONLY) ? O_RDONLY : O_RDWR)) == -1)
> 			err("Could not open exported file: %m");
> 	}
>
> 	if (exportsize == ~0) {
> 		exportsize = size_autodetect(export[0]);
> 	}
> 	if (exportsize > (~0UL >> 1))
> 		if ((exportsize >> 10) > (~0UL >> 1))
> 			syslog(LOG_INFO, "size of exported file/device is %luMB",
> 			       (unsigned long)(exportsize >> 20));
> 		else
> 			syslog(LOG_INFO, "size of exported file/device is %luKB",
> 			       (unsigned long)(exportsize >> 10));
> 	else
> 		syslog(LOG_INFO, "size of exported file/device is %lu",
> 		       (unsigned long)exportsize);
> 	setmysockopt(net);
>
> 	mainloop(net);
> 	return 0;
> }
> ----
>
> /* This header file is shared by client & server. They really have
>  * something to share...
>  * */
>
> /* Client/server protocol is as follows:
>    Send INIT_PASSWD
>    Send 64-bit cliserv_magic
>    Send 64-bit size of exported device
>    Send 128 bytes of zeros (reserved for future use)
>  */
>
> #include "config.h"
> #include <errno.h>
> #include <string.h>
> #include <netdb.h>
>
> #if SIZEOF_UNSIGNED_SHORT_INT==4
> typedef unsigned short u32;
> #elif SIZEOF_UNSIGNED_INT==4
> typedef unsigned int u32;
> #elif SIZEOF_UNSIGNED_LONG_INT==4
> typedef unsigned long u32;
> #else
> #error I need at least some 32-bit type
> #endif
>
> #if SIZEOF_UNSIGNED_INT==8
> typedef unsigned int u64;
> #define MINUS_ONE_64 (-1U)
> #elif SIZEOF_UNSIGNED_LONG_INT==8
> typedef unsigned long u64;
> #define MINUS_ONE_64 (-1UL)
> #elif SIZEOF_UNSIGNED_LONG_LONG_INT==8
> typedef unsigned long long u64;
> #define MINUS_ONE_64 (-1ULL)
> #else
> #error I need at least some 64-bit type
> #endif
>
> #include "nbd.h"
>
> u64 cliserv_magic = 0x00420281861253LL;
> #define INIT_PASSWD "NBDMAGIC"
>
> #define INFO(a) do { } while(0)
>
> void setmysockopt(int sock)
> {
> 	int size = 1;
> #if 0
> 	if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(int)) < 0)
> 		INFO("(no sockopt/1: %m)");
> #endif
> 	size = 1;
> 	if (setsockopt(sock, SOL_TCP, TCP_NODELAY, &size, sizeof(int)) < 0)
> 		INFO("(no sockopt/2: %m)");
> #if 0
> 	size = 1024;
> 	if (setsockopt(sock, SOL_TCP, TCP_MAXSEG, &size, sizeof(int)) < 0)
> 		INFO("(no sockopt/3: %m)");
> #endif
> }
>
> void err(const char *s)
> {
> 	const int maxlen = 150;
> 	char s1[maxlen], *s2;
> 	int n = 0;
>
> 	strncpy(s1, s, maxlen);
> 	if (s2 = strstr(s, "%m")) {
> 		strcpy(s1 + (s2 - s), sys_errlist[errno]);
> 		s2 += 2;
> 		strcpy(s1 + strlen(s1), s2);
> 	}
> 	else if (s2 = strstr(s, "%h")) {
> 		strcpy(s1 + (s2 - s), hstrerror(h_errno));
> 		s2 += 2;
> 		strcpy(s1 + strlen(s1), s2);
> 	}
>
> 	s1[maxlen-1] = '\0';
> #ifdef ISSERVER
> 	syslog(LOG_ERR, s1);
> #else
> 	fprintf(stderr, "Error: %s\n", s1);
> #endif
> 	exit(1);
> }
>
> void logging(void)
> {
> #ifdef ISSERVER
> 	openlog(MY_NAME, LOG_PID, LOG_DAEMON);
> #endif
> 	setvbuf(stdout, NULL, _IONBF, 0);
> 	setvbuf(stderr, NULL, _IONBF, 0);
> }
>
> #ifdef WORDS_BIGENDIAN
> u64 ntohll(u64 a)
> {
> 	return a;
> }
> #else
> u64 ntohll(u64 a)
> {
> 	u32 lo = a & 0xffffffff;
> 	u32 hi = a >> 32U;
> 	lo = ntohl(lo);
> 	hi = ntohl(hi);
> 	return ((u64) lo) << 32U | hi;
> }
> #endif
> #define htonll ntohll

--
-------------------------------
Edward Muller
Director of IS

973-715-0230 (cell)
212-487-9064 x115 (NYC)

http://www.learningpatterns.com
-------------------------------

^ permalink raw reply	[flat|nested] 11+ messages in thread
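[Illustrative sketch, not part of the thread.] The header comment quoted above describes the initial greeting as INIT_PASSWD, a 64-bit magic, a 64-bit export size and 128 reserved zero bytes. The following minimal C sketch lays those fields out in a buffer. It assumes the 64-bit fields go on the wire in network (big-endian) byte order, which is what the quoted `htonll` helper suggests; the helper names `put_be64` and `build_handshake` are hypothetical and do not come from nbd-server.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INIT_PASSWD   "NBDMAGIC"           /* value from the quoted header */
#define CLISERV_MAGIC 0x00420281861253ULL  /* value from the quoted header */
#define HANDSHAKE_LEN (8 + 8 + 8 + 128)    /* passwd + magic + size + zeros */

/* Hypothetical helper: store v at dst in big-endian (network) order. */
static void put_be64(unsigned char *dst, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        dst[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: fill buf with the greeting and return its length.
 * The caller must supply a buffer of at least HANDSHAKE_LEN bytes. */
static size_t build_handshake(unsigned char *buf, uint64_t export_size)
{
    assert(buf != NULL);
    memset(buf, 0, HANDSHAKE_LEN);            /* also covers the 128 zeros */
    memcpy(buf, INIT_PASSWD, 8);              /* 8 bytes, no terminator    */
    put_be64(buf + 8, CLISERV_MAGIC);
    put_be64(buf + 16, export_size);
    return HANDSHAKE_LEN;
}
```

A server would write the resulting 152 bytes to the socket right after accepting a connection. Writing bytes explicitly, rather than swapping a `u64` in place, keeps the sketch independent of host endianness.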
* Re: Current NBD 'stuff'
  2001-12-04 23:12   ` Edward Muller
@ 2001-12-05 16:14     ` Paul Clements
  2001-12-05 16:44       ` Peter T. Breuer
  2001-12-05 21:02       ` Peter T. Breuer
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Clements @ 2001-12-05 16:14 UTC (permalink / raw)
  To: Edward Muller; +Cc: Paul.Clements, linux-kernel

On 4 Dec 2001, Edward Muller wrote:

> Actually I am playing with ENBD now.

Yep. I've looked at that too.

> I think ENBD is targeted for inclusion in the kernel in 2.5, but it can
> be found seperatly (sp) at http://www.it.uc3m.es/~ptb/nbd/
>
> It looks much better than the nbd stuff that is currently in the kernel.

A word of caution on this. I played around with ENBD (as well as some
others) about 6 months ago. I also did some performance testing with
the different drivers and user-level utilities. What I found was that
ENBD achieved only about 1/3 ~ 1/4 the throughput of NBD (even with
multiple replication paths and various block sizes). YMMV.

I also looked at DRBD, which performed pretty well (comparable to NBD).

> But that's mostly because Pavel doesn't have much time at the moment for
> it AFAIK.

Yeah. I wish I had the time to develop/maintain a network block
device driver myself...but unfortunately I don't... :/

--
Paul Clements
Paul.Clements@SteelEye.com

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: Current NBD 'stuff'
  2001-12-05 16:14     ` Paul Clements
@ 2001-12-05 16:44       ` Peter T. Breuer
  2001-12-05 21:02       ` Peter T. Breuer
  1 sibling, 0 replies; 11+ messages in thread
From: Peter T. Breuer @ 2001-12-05 16:44 UTC (permalink / raw)
  To: Paul.Clements; +Cc: linux kernel

"Paul Clements wrote:"
> On 4 Dec 2001, Edward Muller wrote:
> > Actually I am playing with ENBD now.
> Yep. I've looked at that too.
> > I think ENBD is targeted for inclusion in the kernel in 2.5, but it can
> > be found seperatly (sp) at http://www.it.uc3m.es/~ptb/nbd/
> >
> > It looks much better than the nbd stuff that is currently in the kernel.
>
> A word of caution on this. I played around with ENBD (as well as some
> others) about 6 months ago. I also did some performance testing with
> the different drivers and user-level utilities. What I found was that
> ENBD achieved only about 1/3 ~ 1/4 the throughput of NBD (even with
> multiple replication paths and various block sizes). YMMV.

It probably is much slower, because it does networking from userspace
(permitting things like ssl channels, automatic reconnects and other
failover-like things). Nevertheless, I get about 18MB/s writing to
localhost on my 366MHz portable, and about 5MB/s doing raid resync
across NBD to scsi devices across 100BT. Come to think of it, maybe
that IS the speed of the scsi devices, they're a raid5 assembly running
as a raid0 component connected by NBD ....

Here's the printout from the device itself in my max speed test
on the portable in my hands. It seems to be still running (kernel 2.4.3
plus xfs) ...

[b] B/s max:    18.6M   (0R+18.6MW)
[b] Spectrum:   70%23 15%102 12%252 ...

(uh, the last line is the size-of-sent-request spectrum, data in 1K
blocks - it looks like I thumped it with 16MB of data to write at once
in the test and varied the max limit continuously as I did so, but
there was only time for three limit changes)

Read is always fast, of course.

> I also looked at DRBD, which performed pretty well (comparable to NBD).

But then you are talking about kernel 2.2, not kernel 2.4, surely?
There is now a -pre for DRBD under kernels 2.4, but I didn't think there
was several months ago.

> > But that's mostly because Pavel doesn't have much time at the moment for
> > it AFAIK.
>
> Yeah. I wish I had the time to develop/maintain a network block
> device driver myself...but unfortunately I don't... :/

The real difficulty is in user space with the reconnect strategies and
what to do with the various weird tcp states you can get into. The
test above was trying to produce a classic deadlock to localhost, but
didn't "succeed".

Peter

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: Current NBD 'stuff'
  2001-12-05 16:14     ` Paul Clements
  2001-12-05 16:44       ` Peter T. Breuer
@ 2001-12-05 21:02       ` Peter T. Breuer
  2001-12-05 22:30         ` Paul Clements
  1 sibling, 1 reply; 11+ messages in thread
From: Peter T. Breuer @ 2001-12-05 21:02 UTC (permalink / raw)
  To: Paul.Clements; +Cc: Edward Muller, linux-kernel

"A month of sundays ago Paul Clements wrote:"
> On 4 Dec 2001, Edward Muller wrote:
>
> A word of caution on this. I played around with ENBD (as well as some
> others) about 6 months ago. I also did some performance testing with
> the different drivers and user-level utilities. What I found was that
> ENBD achieved only about 1/3 ~ 1/4 the throughput of NBD (even with
> multiple replication paths and various block sizes). YMMV.

It strikes me that possibly you were using the 2.2 kernel. My logic
is that (1) nowadays kernels coalesce all requests into large lumps -
limited only by the driver's wishes - before the driver gets them,
(2) I don't think I ever managed to get req merging working in kernel
2.2, but now the kernel does it for free.

When nbd sets the limit as 256KB, it gets 256KB sized requests to
treat. Did you see the req size distribution in my previous reply? It
was flat-out at the size limit every time.

So whatever time is spent in the kernel or in userspace per req
(possibly the context switch is still significant, but make the lumps
bigger then ..) is dwarfed by the time spent in kernel networking,
making the lumps go out and come back from the net. On 100BT we are
talking about 1/4s in networking, per request. If we waste 10ms over
Pavel in actual coding and context switches (ridiculous!) we lose only
4% in speed, not 75%! So I think you could compile in visual basic and
get no variance in speed at the client end, at least.

That leaves the server net-to-disk time to contend with. I don't know
about that end. But Enbd does not do anything different in principle
from what kernel nbd does.

It might well be slower because the code is heavily layered. I suspect
that at that end transfers are done at the local disk blocksize, which
may be small enough to make code differences noticeable.

But in general I find that Enbd goes either at the speed of the net or
at the speed of the remote disk, whichever is slower. It also uses a
trick when writing that usually results in exceeding the cable
bandwidth by a factor of two during raid resyncs over nbd.

> I also looked at DRBD, which performed pretty well (comparable to NBD).
>
> > But that's mostly because Pavel doesn't have much time at the moment for
> > it AFAIK.

Peter

^ permalink raw reply	[flat|nested] 11+ messages in thread
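[Illustrative sketch, not part of the thread.] The back-of-envelope estimate in the message above can be written out as a small C helper. It takes Peter's own figures as inputs (about 0.25 s of network time per request and a hypothetical 10 ms of extra per-request overhead) and does not try to verify them; the function names are made up for this example.

```c
/* Fraction of throughput lost when every request pays overhead_s of
 * extra time on top of transfer_s of unavoidable network time. */
static double throughput_loss(double transfer_s, double overhead_s)
{
    return overhead_s / (transfer_s + overhead_s);
}

/* Extra per-request time that would be needed to lose the given
 * fraction of throughput (0 <= loss < 1). */
static double overhead_for_loss(double transfer_s, double loss)
{
    return transfer_s * loss / (1.0 - loss);
}
```

With those inputs, `throughput_loss(0.25, 0.010)` comes out a little under 4%, in line with the estimate in the message, while `overhead_for_loss(0.25, 0.75)` shows that a 75% drop would need roughly 0.75 s of extra time per request. The conclusion depends entirely on the assumed per-request network time, so a much shorter transfer time would make the same overhead matter more.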
* Re: Current NBD 'stuff'
  2001-12-05 21:02       ` Peter T. Breuer
@ 2001-12-05 22:30         ` Paul Clements
  0 siblings, 0 replies; 11+ messages in thread
From: Paul Clements @ 2001-12-05 22:30 UTC (permalink / raw)
  To: Peter T. Breuer; +Cc: Paul.Clements, Edward Muller, linux-kernel

On Wed, 5 Dec 2001, Peter T. Breuer wrote:

> It strikes me that possibly you were using the 2.2 kernel.

Yes, the performance tests were on 2.2 -- and surely things have
changed in 2.4 -- most notably, as you mention, the request merging
stuff.

> But in general I find that Enbd goes
> either at the speed of the net or at the speed of the remote disk,
> whichever is slower.

Well, that's as good as it gets. I had noticed that NBD on 2.2 was
also capable of writing at speeds just under the TCP bandwidth with
100Mb/s ethernet. (I wonder if it would scale nicely to 1Gb/s?)

--
Paul Clements
Paul.Clements@SteelEye.com

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: Current NBD 'stuff'
  2001-12-04 22:26 ` Paul Clements
  2001-12-04 23:12   ` Edward Muller
@ 2001-12-06 13:02   ` Pavel Machek
  2001-12-06 22:13     ` [PATCH] " Paul Clements
  2001-12-22 19:48     ` David Chow
  1 sibling, 2 replies; 11+ messages in thread
From: Pavel Machek @ 2001-12-06 13:02 UTC (permalink / raw)
  To: Paul.Clements; +Cc: Edward Muller, linux-kernel

Hi

> > Anyone know where I can find the latest NBD stuff? Esp. client/server
> > code?
>
> I have the same question. Maybe the user-level stuff is not being
> actively maintained?
>
> However, since we couldn't find current versions of this stuff,
> my colleagues and I patched nbd-server and the nbd kernel module
> to fix a few bugs and to make them a little more robust. I'll
> attach my versions of those files (which I think are derived from
> Pavel's .14.tar.gz versions).

Do not comment code by //. Kill it if you want to.

You added clean way to stop nbd. Good.

DO NOT USE ALL CAPITALS not even in printks().

Fix those and patch looks like good idea for 2.5.

Look at nbd.sf.net. If you have patches against that, mail them to me. If
you are willing to co-develop stuff at nbd.sf.net, I guess we can arrange
something.
								Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* [PATCH] Re: Current NBD 'stuff'
  2001-12-06 13:02   ` Pavel Machek
@ 2001-12-06 22:13     ` Paul Clements
  2001-12-22 19:48     ` David Chow
  1 sibling, 0 replies; 11+ messages in thread
From: Paul Clements @ 2001-12-06 22:13 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Paul.Clements, Edward Muller, linux-kernel, james.bottomley

[-- Attachment #1: Type: TEXT/PLAIN, Size: 488 bytes --]

Pavel,

Here is the patch against 2.4.16. Please consider for inclusion in 2.4
series kernel. We've been running this code for several months now and
it is working very well.

On Thu, 6 Dec 2001, Pavel Machek wrote:

> Do not comment code by //. Kill it if you want to.
>
> You added clean way to stop nbd. Good.
>
> DO NOT USE ALL CAPITALS not even in printks().
>
> Fix those and patch looks like good idea for 2.5.

OK, done.

Thanks,

--
Paul Clements
Paul.Clements@SteelEye.com

[-- Attachment #2: nbd patch --]
[-- Type: TEXT/PLAIN, Size: 4157 bytes --]

--- linux-2.4.16/drivers/block/nbd.c.orig	Fri Oct 26 18:39:02 2001
+++ linux-2.4.16/drivers/block/nbd.c	Thu Dec  6 17:50:32 2001
@@ -24,7 +24,9 @@
  * 01-3-11 Make nbd work with new Linux block layer code. It now supports
  *   plugging like all the other block devices. Also added in MSG_MORE to
  *   reduce number of partial TCP segments sent. <steve@chygwyn.com>
- *
+ * 01-12-06 Make nbd cleanly killable; fix some locking issues; acknowledge
+ *   and log network errors; make default device size 2TB.
+ *   <James.Bottomley@SteelEye.com> <Paul.Clements@SteelEye.com>
  * possible FIXME: make set_sock / set_blksize / set_size / do_it one syscall
  * why not: would need verify_area and friends, would share yet another
  *          structure with userland
@@ -94,18 +96,10 @@
 	int result;
 	struct msghdr msg;
 	struct iovec iov;
-	unsigned long flags;
-	sigset_t oldset;
 
 	oldfs = get_fs();
 	set_fs(get_ds());
 
-	spin_lock_irqsave(&current->sigmask_lock, flags);
-	oldset = current->blocked;
-	sigfillset(&current->blocked);
-	recalc_sigpending(current);
-	spin_unlock_irqrestore(&current->sigmask_lock, flags);
-
 
 	do {
 		sock->sk->allocation = GFP_NOIO;
@@ -125,6 +119,12 @@
 		else
 			result = sock_recvmsg(sock, &msg, size, 0);
 
+		if(signal_pending(current)) {
+			printk(KERN_WARNING "NBD caught signal\n");
+			result = -EINTR;
+			break;
+		}
+
 		if (result <= 0) {
 #ifdef PARANOIA
 			printk(KERN_ERR "NBD: %s - sock=%ld at buf=%ld, size=%d returned %d.\n",
@@ -136,11 +136,6 @@
 		buf += result;
 	} while (size > 0);
 
-	spin_lock_irqsave(&current->sigmask_lock, flags);
-	current->blocked = oldset;
-	recalc_sigpending(current);
-	spin_unlock_irqrestore(&current->sigmask_lock, flags);
-
 	set_fs(oldfs);
 	return result;
 }
@@ -336,8 +331,27 @@
 		spin_unlock_irq(&io_request_lock);
 
 		down (&lo->queue_lock);
+		if(!lo->file) {
+			up(&lo->queue_lock);
+			spin_lock_irq(&io_request_lock);
+			printk(KERN_ERR "NBD: fail between accept and semaphore, file lost\n");
+			req->errors++;
+			nbd_end_request(req);
+			continue;
+		}
+
+
 		list_add(&req->queue, &lo->queue_head);
 		nbd_send_req(lo->sock, req);	/* Why does this block? */
+		if(req->errors) {
+			printk(KERN_ERR "NBD: nbd_send_req failed\n");
+			list_del(&req->queue);
+
+			up(&lo->queue_lock);
+			spin_lock_irq(&io_request_lock);
+			nbd_end_request(req);
+
+			continue;
+		}
 		up (&lo->queue_lock);
 
 		spin_lock_irq(&io_request_lock);
@@ -387,12 +401,14 @@
 			printk(KERN_ERR "nbd: Some requests are in progress -> can not turn off.\n");
 			return -EBUSY;
 		}
-		up(&lo->queue_lock);
 		file = lo->file;
-		if (!file)
+		if (!file) {
+			up(&lo->queue_lock);
 			return -EINVAL;
+		}
 		lo->file = NULL;
 		lo->sock = NULL;
+		up(&lo->queue_lock);
 		fput(file);
 		return 0;
 	case NBD_SET_SOCK:
@@ -433,9 +449,29 @@
 		if (!lo->file)
 			return -EINVAL;
 		nbd_do_it(lo);
+		/* on return tidy up in case we have a signal */
+		printk(KERN_WARNING "NBD: nbd_do_it returned\n");
+		/* Forcibly shutdown the socket causing all listeners
+		 * to error
+		 *
+		 * FIXME: This code is duplicated from sys_shutdown, but
+		 * there should be a more generic interface rather than
+		 * calling socket ops directly here */
+		lo->sock->ops->shutdown(lo->sock, 2);
+		down(&lo->queue_lock);
+		printk(KERN_WARNING "NBD: lock acquired\n");
+		nbd_clear_que(lo);
+		file = lo->file;
+		lo->file = NULL;
+		lo->sock = NULL;
+		up(&lo->queue_lock);
+		if(file)
+			fput(file);
 		return lo->harderror;
 	case NBD_CLEAR_QUE:
+		down(&lo->queue_lock);
 		nbd_clear_que(lo);
+		up(&lo->queue_lock);
 		return 0;
 #ifdef PARANOIA
 	case NBD_PRINT_DEBUG:
@@ -512,7 +548,7 @@
 		init_MUTEX(&nbd_dev[i].queue_lock);
 		nbd_blksizes[i] = 1024;
 		nbd_blksize_bits[i] = 10;
-		nbd_bytesizes[i] = 0x7ffffc00; /* 2GB */
+		nbd_bytesizes[i] = ((u64)0x7ffffc00) << 10; /* 2TB */
 		nbd_sizes[i] = nbd_bytesizes[i] >> BLOCK_SIZE_BITS;
 		register_disk(NULL, MKDEV(MAJOR_NR,i), 1, &nbd_fops,
 			      nbd_bytesizes[i]>>9);

^ permalink raw reply	[flat|nested] 11+ messages in thread
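[Illustrative sketch, not part of the thread.] The last hunk of the patch above changes the default device size from `0x7ffffc00` bytes to `((u64)0x7ffffc00) << 10`. The userspace C sketch below shows why the widening cast has to happen before the shift. It uses `<stdint.h>` types in place of the kernel's `u64`, and the function names are hypothetical.

```c
#include <stdint.h>

/* Constant from the patch: 2^31 - 1024. */
#define NBD_DEFAULT_BASE 0x7ffffc00u

/* Shift carried out in 32 bits: the upper bits of the result are lost. */
static uint64_t default_bytes_narrow(void)
{
    uint32_t narrow = (uint32_t)(NBD_DEFAULT_BASE << 10);
    return narrow;
}

/* Widen first, as the patch does with its (u64) cast, then shift. */
static uint64_t default_bytes_wide(void)
{
    return ((uint64_t)NBD_DEFAULT_BASE) << 10;
}
```

The widened form yields 2^41 - 2^20 bytes, that is 1 MiB short of 2 TiB, which matches the `/* 2TB */` comment in the patch. The narrow form would wrap around to a value just under 4 GiB.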
* Re: Current NBD 'stuff'
  2001-12-06 13:02   ` Pavel Machek
  2001-12-06 22:13     ` [PATCH] " Paul Clements
@ 2001-12-22 19:48     ` David Chow
  1 sibling, 0 replies; 11+ messages in thread
From: David Chow @ 2001-12-22 19:48 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

Hi,

I am also looking for NBD-related stuff. My first intention is to study
nfsswap, but I've heard NBD is more efficient. Is there any port of
nbd swap to the recent 2.4.1x kernels? I may do an up-port if the code
is old, but hopefully there is something for 2.4, not just 2.0 and 2.2.
Thanks.

regards,
David

Pavel Machek wrote:

>Hi
>
>>>Anyone know where I can find the latest NBD stuff? Esp. client/server
>>>code?
>>>
>>I have the same question. Maybe the user-level stuff is not being
>>actively maintained?
>>
>>However, since we couldn't find current versions of this stuff,
>>my colleagues and I patched nbd-server and the nbd kernel module
>>to fix a few bugs and to make them a little more robust. I'll
>>attach my versions of those files (which I think are derived from
>>Pavel's .14.tar.gz versions).
>>
>
>Do not comment code by //. Kill it if you want to.
>
>You added clean way to stop nbd. Good.
>
>DO NOT USE ALL CAPITALS not even in printks().
>
>Fix those and patch looks like good idea for 2.5.
>
>Look at nbd.sf.net. If you have patches against that, mail them to me. If
>you are willing to co-develop stuff at nbd.sf.net, I guess we can arrange
>something.
>
>								Pavel
>

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: Current NBD 'stuff'
  2001-12-03 18:02 Current NBD 'stuff' Edward Muller
  2001-12-04 22:26 ` Paul Clements
@ 2001-12-06 12:54 ` Pavel Machek
  1 sibling, 0 replies; 11+ messages in thread
From: Pavel Machek @ 2001-12-06 12:54 UTC (permalink / raw)
  To: Edward Muller; +Cc: linux-kernel

Hi!

> Not 100% kernel related ... but ...
>
> Anyone know where I can find the latest NBD stuff? Esp. client/server
> code?
>
> I looked at Pavel's website and the nbd.14.tar.gz file are from '98 and
> '99.

Look at nbd.sf.net.
								Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.

^ permalink raw reply	[flat|nested] 11+ messages in thread