While ZEO is great, it is still just a fancy, distributed caching mechanism for a storage. To achieve complete redundancy, it is necessary to replicate the main storage.
I've started this Wiki page so that we can start collecting ideas on how to do it. Sorry for the ad-hoc nature of it.
- I've got some good news for you. :) We've recently made progress on a quorum-based replication protocol that provides:
- Transaction consistency, and possibly
- Distribution of load over multiple servers
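To make the quorum idea concrete, here is a minimal sketch of majority-quorum reads and writes. It is not the protocol mentioned above, whose details are not given on this page; `Replica`, `quorum_write`, and `quorum_read` are hypothetical names invented for illustration, and the replicas are plain in-memory objects rather than real storages. A failed write is not rolled back here, which a real protocol would have to handle.

```python
class Replica:
    """Hypothetical in-memory stand-in for one storage server."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}  # oid -> (serial, value)

    def store(self, oid, serial, value):
        if not self.up:
            raise ConnectionError(self.name)
        self.data[oid] = (serial, value)

    def load(self, oid):
        if not self.up:
            raise ConnectionError(self.name)
        # Unknown objects are reported with serial 0.
        return self.data.get(oid, (0, None))


def quorum_write(replicas, oid, serial, value):
    """Return True if a strict majority of replicas accepted the write."""
    needed = len(replicas) // 2 + 1
    acks = 0
    for replica in replicas:
        try:
            replica.store(oid, serial, value)
            acks += 1
        except ConnectionError:
            pass  # an unreachable replica simply does not vote
    return acks >= needed


def quorum_read(replicas, oid):
    """Ask all reachable replicas and return the newest (serial, value)."""
    needed = len(replicas) // 2 + 1
    answers = []
    for replica in replicas:
        try:
            answers.append(replica.load(oid))
        except ConnectionError:
            pass
    if len(answers) < needed:
        raise RuntimeError("no read quorum")
    # Any two majorities overlap, so the newest serial is among the answers.
    return max(answers, key=lambda answer: answer[0])
```

Because every read majority overlaps every write majority in at least one replica, a read sees the latest successfully written serial even when a different server is down at read time than at write time. Reads can also be spread over whichever servers are reachable, which is one way load might be distributed.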
Idea 1: implement a ReplicatedStorage that will let us have many main storages, replicated and synchronized. Synchronization can be automatic or manual, atomic or object-by-object, and locking can be optimistic or pessimistic.
- Do you have a protocol in mind for automatic synchronization?
What will you do if there are conflicting changes that can't be resolved during manual synchronization?
This can be used to implement a fault-tolerant production system (automatic replication, pessimistic locking) and to support the process of staging a system (manual replication, optimistic locking). The latter is probably better done by using mountable storages: first replicate to a new storage, and then change the mount point to start using the new storage.
- Why must you be pessimistic to be fault-tolerant?
What you really need is to maintain transactional consistency.
You can be optimistic and still do this.
The staging server approach seems to be excessively coarse-grained.
-- 2000-07-06 BjornStabell
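The point raised in the comments above, that an optimistic scheme can still maintain transactional consistency, can be illustrated with a small sketch. It assumes each object carries a serial number that is checked at commit time; `OptimisticStore` and its local `ConflictError` are hypothetical names for this example, not an existing storage API.

```python
class ConflictError(Exception):
    """Raised when an object changed after the transaction read it."""


class OptimisticStore:
    """Hypothetical store that validates serials at commit, with no locks held while working."""

    def __init__(self):
        self._data = {}  # oid -> (serial, value)

    def load(self, oid):
        # Unknown objects are reported with serial 0.
        return self._data.get(oid, (0, None))

    def commit(self, changes):
        """Apply changes atomically.

        changes maps oid -> (serial_seen, new_value), where serial_seen
        is the serial the transaction observed when it loaded the object.
        """
        # Phase 1: validate everything before touching any state.
        for oid, (serial_seen, _value) in changes.items():
            current_serial, _ = self.load(oid)
            if current_serial != serial_seen:
                raise ConflictError(oid)
        # Phase 2: all checks passed, so apply every change.
        for oid, (serial_seen, value) in changes.items():
            self._data[oid] = (serial_seen + 1, value)


store = OptimisticStore()
store.commit({"a": (0, "initial")})

# Two transactions read the same state without taking any lock.
serial_1, _ = store.load("a")
serial_2, _ = store.load("a")

store.commit({"a": (serial_1, "first writer")})
try:
    # The second writer touches "a" and "b"; neither change is applied.
    store.commit({"a": (serial_2, "second writer"), "b": (0, "extra")})
except ConflictError:
    pass  # the caller would typically retry against fresh state
```

Nothing is locked while the transactions run; consistency comes from validating all serials before applying any change, so a conflicting transaction is rejected as a whole instead of being partially written.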