From: Larry McVoy <lm@bitmover.com>
To: linux-kernel@vger.kernel.org
Subject: Re: bkbits.net is down
Date: Sun, 22 Jun 2003 22:37:13 -0700
Message-ID: <20030623053713.GA6715@work.bitmover.com>
In-Reply-To: <20030622002614.GA16225@work.bitmover.com>

It should be back up now. Finally. There may be permission problems
that slipped through the cracks. Send mail to support@bitmover.com
if you can't pull or push.

On Sat, Jun 21, 2003 at 05:26:14PM -0700, Larry McVoy wrote:
> On Sat, Jun 21, 2003 at 12:09:44PM -0700, Larry McVoy wrote:
> > On Sat, Jun 21, 2003 at 06:58:12AM -0700, Larry McVoy wrote:
> > > I'm tracking down the problem; it appears to have been off the air for
> > > most of the night. I may upgrade the processor and motherboard. If I do
> > > that, it will be off the air for a bit longer.
> >
> > I'm still at the office working on this. Both the main and the backup drives
> > are not checking cleanly. It's going to be several more hours before it is
> > back up at this rate.
>
> Still working on it but I'm starting to get somewhere. There is something
> very strange or flakey about Samsung 80GB IDE drives. The symptoms are that
> on some controllers the drive gets all sorts of errors. Running under
> the most recent (on bkbits.net at least) linux-2.5 tree, if I put the drive
> on a Serverworks IDE interface (Tyan dual PIII, I think a 2150?) then the
> drive looks like it is just trashed, lots of fsck errors. I pulled it and
> tried on an ECS el cheapo motherboard and that failed too. I then thought
> that maybe the deal was that I had partitioned it under a 3ware and I was
> trying to fsck it on a normal IDE (hey, I was grabbing for an answer) so
> I stuck a 3ware into the el cheapo box. Still didn't work. OK, I tried
> another ECS el cheapo and this one works.
>
> By the way, "didn't work" meant that it would get through the fsck enough
> to restart the fsck from the beginning and then somewhere along the way
> the fsck would cause the system to reboot. Nice. It took several tries
> to figure that out, I eventually resorted to video taping the screen to
> find out what happened (it takes an hour to fsck this drive so I'd be
> reading mail and looking over my shoulder about every minute trying to
> catch it and of course I always missed it) and all I got was stuff that
> looked like the kernel was hosed, sendmail started crapping out and I
> know it wasn't doing anything. I have the video if someone wants it,
> this was Red Hat 8.0 generic, so 2.4.19 I think.
>
> OK, finally clean fsck on the second el cheapo, move it over to the Tyan
> and try again. Disk drive go kaboom. I'm starting to get pissed, this
> is my bloody Saturday, I promised my kids we'd play together, I'm grumpy,
> my wife keeps calling to ask when we are going to the beach, shit this
> just sucks, I need a sys admin that I trust to do this stuff. It's
> beyond lame that I don't have one. Any good sys admins out there in
> the Bay Area? Call me, now is a good time to negotiate a good package.
> Deep breath, don't get pissed, that's how you make things worse. OK,
> back at it. Pull the drive, plug in a Promise card, stick the drive on
> that and pray that it works. Whoops, didn't compile that in, recompile,
> reboot, man I hate American Megatrends, it takes forever to do a warm
> reboot. Linux BIOS, where are you? OK, it sees it, do the fsck, wait
> an hour, whoohoo! We're clean.
>
> Time to think about what to do. I don't trust the Samsung even though it
> says it is OK, too many problems. Another deep breath, call the local
> suppliers, yeah, they got some 80GB drives, we're at 40GB so that seems
> cool, head off to the store to buy some new drives. Shit. The store
> lied and they were out of stock. Buy some 120GB? Nah, if we get to
> that much data on bkbits.net I want it spread over multiple machines
> so I'm not stuck in the machine room for 40 hours the next time this
> crap happens. It sucks to be me some days, it really does. Go back,
> steal an 80GB Western Digital drive from one of the desktops, stick it in.
> By the way, Western Digital, if you ever want an endorsement I'm your man,
> every other drive company has screwed me at least once and you never have.
> Your drives rock, they behave well under benchmarks and they behave well
> in the real world and I have the data to prove it. And, best of all,
> your drives fail nicely, blocks start going bad but you can get 99.9%
> of the data off, very nice. Good job.
>
> Plug in the WD 80GB, write a script to start cloning the repositories,
> that's easy, it's running, and I'm typing in this mail as something to
> do while it runs. Hence the verboseness. And in spite of that we are
> only up to linux-ajc (who's ajc?). But we're getting there. My guess
> is that this is going to run for a few more hours. I've been here since
> 7am this morning, I'm going out to get plastered and I'll put the rest
> of this back together tomorrow.
>
> I wasn't kidding about that sys admin job, I'd love for this to be
> someone else's problem. In theory I'm supposed to be a CEO who plays
> golf games and cuts multi-million dollar deals for development tools.
> I still need to learn how to play golf so I could use some help, right?
> The problem is that I want things fixed right so that problems don't come
> back and I don't trust lame people to do that so I end up doing a lot of
> stuff myself. If you have an ego that won't quit because you could kick
> my pathetic sys admin ass all over the place, you're who I want to hire.
> Of course you need to be able to take a lot of shit because BK isn't
> politically correct in the open source world :-(
>
> I'm outta here to drink some beer, ETA on bkbits being back online is
> some time tomorrow. Sorry about the delay. For what it is worth, we are
> in the process of setting up an India based development effort which is
> going to take this over and make it work better. We really want to be
> in a place where when something goes wrong we change some DNS entries
> and we're back on line. We're not there yet but we are working on it.
> --
> ---
> Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm