[BUGS] ZFS dis-integration

Andrew Sinclair syncman0x at gmail.com
Sun Jul 11 03:42:27 EST 2010


On 10 July 2010 08:10, Peter Jeremy <peterjeremy at acm.org> wrote:
> On 2010-Jul-10 06:01:14 +1000, Andrew Sinclair <syncman0x at gmail.com> wrote:
>>I've heard rumors that ZFS is unstable. It is not my place to say why,
>>or even who would say this but...
>
> I've found the opposite and had more recent problems with UFS than
> ZFS.  That said, my latest scrub reported 3 "unrecoverable" files for
> reasons I don't understand (since corruption should have been
> recoverable via RAIDZ1 redundancy).
>
>>        NAME        STATE     READ WRITE CKSUM
>>        zw0801      ONLINE       0     0     0
>>          da0s1g    ONLINE       0     0     0
>
> Note that there's no redundancy on this pool which restricts ZFS's
> ability to recover from errors.

As I mentioned, this is a RAID enclosure. I am aware that ZFS's
built-in recovery is limited when an enclosure handles the redundancy:
  http://www.addonics.com/products/raid_system/zebra_4sa.asp

I am also aware that RAID-5 suffers from the write hole when power is
dropped to the enclosure; however:
  http://www.addonics.com/products/host_controller/ad4sr5hpmus.asp

It was either this or be at the mercy of ZFS. If you do not limit
yourself to just one aggregated drive interface, ZFS will complain
when disks are reattached in the wrong order. I have already tried
multiple independent devices, and ZFS has no way that I can see to
safely change a device without a complete resilver. I just happened to
be lucky that USB devices are numbered sequentially; I would not
expect end users to know this.


>
>>all I wish to say here is this: it was a, "Dam Site," easier to
>>recover from this, than in UFS. In UFS I have to map each I-node to
>>its corresponding file to know the content. This is just the way I see
>>FSCK working. In ZFS, a specific cp and generic scrub are the pair of
>>commands that fix this.
>
> Yes, being faced with a long list of inode numbers and having to
> work out what files are affected is a real PITA.
>
>>and yes, I keep backups but upgrading is a pain as it is, with all the
>>manual diff/merges one is required to do. If I were not stonewalled
>>out of my own trade, I'd contribute a Subversion based /etc in the
>>base install, at my own expense, with pleasure.
>
> Can you explain more about how your proposed approach is an improvement
> over mergemaster?  The underlying issue is that /etc contains files
> whose content depends on both the version of FreeBSD and the specific
> host configuration.  Therefore any solution needs to support 3-way
> merging - which is messy.
>
This is the trouble with competition. I am not suggesting that
mergemaster be dropped. In fact, I prefer mergemaster to the manual
diff-by-diff work I've had to do since the 3.x series. What mergemaster
currently does not give me is time to properly research my changes. I
can read /usr/src/UPDATING all I want, plus the mailing lists, and I am
still caught by surprise by changes I did not expect. I am not talking
about /etc/group or /etc/aliases, which I can rebuild.

1) The first time I run mergemaster, I can expect changes I missed in
the mailing lists. The system I am upgrading is generally performing a
service, so I cannot keep it down for too long.
  I have kept records of my old /etc in the past. I am aware this is
my fault, but those records were disjointed and there were many
differences to keep track of. Most of these differences were vendor
supplied, and some reflected states that were no longer relevant to my
environment.
  The tags-->branch hierarchy in Subversion is very interesting to me
at this point. If I can maintain a couple of branches, each branch
reflecting a major release, and I am preparing to upgrade a production
system, the branch feature should theoretically aid the three-way
merge. That help would then be available days before the upgrade is
due. If mergemaster integrated this, no question should come up during
the live upgrade that has not already been answered.
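
  A minimal sketch of the three-way merge idea, assuming /etc settings
can be reduced to simple key=value pairs (as in rc.conf). The function
name and the sample values are hypothetical; this is neither what
mergemaster nor what Subversion actually does internally. "base" stands
for the old release's stock file, "theirs" for the new release's stock
file, and "mine" for the locally edited copy:

```python
def three_way_merge(base, mine, theirs):
    """Merge local and vendor key=value settings against their common ancestor.

    Returns (merged, conflicts). A key is a conflict when both sides
    changed it in different ways; the local value is kept for review.
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m              # both sides agree (or both removed it)
        elif m == b:
            value = t              # only the vendor changed it
        elif t == b:
            value = m              # only the local admin changed it
        else:
            conflicts.append(key)  # both changed it differently
            value = m              # keep the local value pending review
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Hypothetical sample data, for illustration only.
base = {"sshd_enable": "NO", "named_enable": "NO", "hostname": "example"}
mine = {"sshd_enable": "YES", "named_enable": "NO", "hostname": "prod1"}
theirs = {"sshd_enable": "NO", "named_enable": "YES",
          "hostname": "amnesiac", "new_knob": "1"}

merged, conflicts = three_way_merge(base, mine, theirs)
print(merged)
print(conflicts)
```

  Only the keys reported as conflicts need a human decision, and with
the release branches available in advance that decision can be made
before the production system is touched.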

2) On subsequent invocations of mergemaster, I will be upgrading
production systems. I've found Bochs very helpful for this task;
however, it is still a completely different system as far as
configuration is concerned. Custom kernels just fail to make sense
within it. Networking can never be identical. Rolling back just /etc
is not helpful if the entire userland has changed dramatically. The
useful life of emulators ends at major release upgrades.
  I still depend on a router in my place of work to provide internal
DNS. The reason I do this is because I am not willing to risk BIND
breaking down upon upgrade. Rolling back to a prior version of FreeBSD
is not something I can do; not for one program.

  My boss was, rightly, using a six-year-old handover from his previous
company until very recently. He was maintaining PHP code on his own
company site before I applied for the job.
  FreeBSD is thus encouraged, not just welcomed. In a sense I owe it
to him not to let a system upgrade take more than one hour of staff
time.

