[BUGS] SCSI hardware failure?
jonathan michaels
jlm at caamora.com.au
Mon Jan 14 19:38:11 EST 2008
greetings all,
On Mon, Jan 14, 2008 at 03:04:12PM +1100, Callum Gibson wrote:
> On 14Jan08 14:43, Tong Wang wrote:
=== various bits trimmed (pick a reason) ===
> }I am really trying to get AMANDA emails back. As for the reason why I
> }relate the disk error with this, because I tried to reboot the server, with
> }the first try failed saying:
> }
> }Missing Operating System
not a good sign. these machines have an ms-dos partition to hold
the ms windows toolset that fixes/sets up the machine and
configures the relevant parts .. it looks like this could have
evaporated as part of the bad media report.
i am not sure how freebsd reports that kind of situation ..
any thoughts callum ??
> }On the second try, it halted half way, with the following message:
> }
> }Drive on AIC-7902 B at slot 00, 08:07:01, SCSI ID: 0 has exceeded failure
> }prediction threshold.
>
> That's really bad. You need to replace that quick I reckon. So you have
> rebooted now, at least.
worse, i think .. i haven't looked this up on google (i only go
there if i ABSOLUTELY have to) but from memory this looks
like an embedded scsi chipset and as such means that the
motherboard needs to be replaced .. service contract time,
hopefully
> }This is why I relate these two issues together, maybe I'm wrong though.
>
> Yeah, hard to pinpoint what is stuffed now though. It could just be your
> mail system or just the permissions on /tmp. I still tend to think that
> disk errors should result in a total failure, rather than something odd
> like /tmp changing permission, or strange corruption throughout the
> filesystem, although I did see just that on a Solaris 8 system recently.
> And if you were getting corruption, you might expect whole subsystems to
> fail, for example. In my case, I got core-dumping binaries, libraries
> you couldn't link against, etc.
>
> }df gives the following:
> }
> }Filesystem 1K-blocks Used Avail Capacity Mounted on
> }/dev/mirror/gm0s1a 495726 107454 348614 24% /
> }devfs 1 1 0 100% /dev
> }/dev/mirror/gm0s1d 2026030 78516 1785432 4% /var
> }/dev/mirror/gm0s1e 2026030 997190 866758 53% /tmp
> }/dev/mirror/gm0s1f 10154158 1415242 7926584 15% /usr
> }/dev/stripe/stripe0s1a 800991544 576215642 216765988 73% /export
just a small hint: "df(1) -h" gives more reasonable/readable
output, for example,
Filesystem Size Used Avail Capacity Mounted on
/dev/idad0s1a 248M 35M 193M 15% /
devfs 1.0K 1.0K 0B 100% /dev
/dev/idad0s1d 4.8G 3.0G 1.5G 67% /home
/dev/idad0s1f 1.9G 9.1M 1.8G 1% /tmp
/dev/idad0s1e 15G 2.8G 11G 21% /usr
/dev/idad0s1g 4.8G 19M 4.4G 0% /var
/dev/idad0s1h 4.5G 4.0K 4.2G 0% /var/mail
devfs 1.0K 1.0K 0B 100% /var/named/dev
[caamora] ~>
unless you need to know how many blocks you have on the
platters. the above is a raid'd volume (just raid5; with a tape
drive i do not see the point in raid 1+0) built from a couple of
9 gb drives, then split up by fdisk.
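
if you are stuck with the 1K-blocks form (as in the quoted df output), here is a minimal python sketch of the conversion. the function name humanize_kib is made up for illustration, and the rounding is an assumption: it always keeps one decimal place, so it will not match what df -h itself prints digit for digit.

```python
def humanize_kib(blocks_1k):
    """Convert a count of 1K-blocks to a short size string (powers of 1024).

    Assumption: one decimal place is always kept; df -h uses its own
    rounding rules, so its output can differ slightly.
    """
    size = float(blocks_1k) * 1024  # 1K-blocks -> bytes
    for unit in ("B", "K", "M", "G", "T"):
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


if __name__ == "__main__":
    # Figures taken from the 1K-blocks column of the quoted df output.
    for mount, blocks in (("/", 495726), ("/dev", 1), ("/export", 800991544)):
        print(f"{mount:10s} {humanize_kib(blocks)}")
```

so the 800991544 blocks behind /export work out to roughly 764 gb, which is a lot easier to take in at a glance.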
> Particularly since you're using mirroring/striping, perhaps the real
> drive errors are being hidden under this software and it's incorrectly
> reporting or hiding the failures. I dunno - I haven't used that stuff
the raid subsystem usually has its own reporting system, perhaps
it's time to go looking at the online one (i mean the machine's own
reporting system) as per the proliant with its "surestart"
cdrom environment.
> before (although I use amanda on a 5.4 system regularly, as it happens).
> Perhaps someone with more of a clue will pipe up on this.
i'm planning on using amanda as soon as i can relocate a couple
of 70 gb dlt-7000 tape drives into a smaller box. the 50+ kg
desktop they are in at the moment is a bit awkward to move
around <GRIN> i'm not as strong as i once was .. ah, that's life.
sorry for the potential bad news. it looks like it's time to make
sure the backups are as reliable as all the advertising claims,
been there done that ..
best wishes for the new year and the just past christmas
much kind regards
jonathan
--
================================================================
powered by ..
QNX, OS9 and freeBSD -- http://caamora com au/operating system
==== === appropriate solution in an inappropriate world === ====