80 lines
2.2 KiB
Plaintext
80 lines
2.2 KiB
Plaintext
The FM cannot init groups/resources until the API is online.
|
|
|
|
The DM expects FM Phase1 init to create the quorum resource.
|
|
|
|
What happens if remote node goes down at various points along this path?
|
|
|
|
Order of events is...
|
|
|
|
Phase 1:
|
|
Petition Other Node to Join
|
|
|
|
NmJoin happens
|
|
|
|
DmJoin happens
|
|
- Registers with the gum and syncs database
|
|
|
|
ApiInitPhase 1
|
|
- Read-only access allowed to DM after this point
|
|
|
|
|
|
FmInitPhase1
|
|
- Performs synchronized join with GUM layer
|
|
- Must build the quorum resource for DM layer below
|
|
- Builds skeletal groups/resources (especially quorum resource)
|
|
- Gets current state (owner and state) about all groups/resources
|
|
|
|
DmUpdateJoin
|
|
- hooks in callbacks for quorum notification and node events.
|
|
|
|
NmJoinComplete
|
|
- The node is now fully online
|
|
|
|
Phase 2:
|
|
ApiInitPhase2
|
|
- API is now fully online
|
|
|
|
FmInitPhase2
|
|
- Complete building all Groups/Resources
|
|
- Setup all state info retrieved from FmInitPhase1
|
|
- FM is now officially online
|
|
- Pull all groups that should be run on the local node
|
|
- Signal events for all groups/resources
|
|
|
|
|
|
FORM
|
|
Phase 1
|
|
|
|
DmForm -
|
|
- Registers with gum.
|
|
|
|
FmGetQuorumResource
|
|
- gets the quorum resource and arbitrates for it.
|
|
|
|
DmUpdateFormNewCluter-
|
|
hooks in callbacks for quorum notifications,node events. This must
|
|
be done before FmInitPhase1.
|
|
|
|
FmInitPhase1
|
|
- Performs synchronized join with GUM layer
|
|
- Must build the quorum resource for DM layer below
|
|
- Builds skeletal groups/resources (especially quorum resource)
|
|
- Gets current state (owner and state) about all groups/resources
|
|
|
|
DmRollChanges
|
|
- quorum resource is brought online
|
|
- dm layer waits on notification, and when online,
|
|
changes in the quorum logs are applied.. for logging purposes.
|
|
- Must be done after FmInitPhase1.
|
|
|
|
Problem:
|
|
|
|
- if a node crashes after FmInitPhase1 but before FmInitPhase2 then the
|
|
state info retrieved in Phase1 is stale
|
|
|
|
- Resources must be cleaned appropriately no matter where the failure occurs.
|
|
|
|
- A join must convert to form appropriately.
|
|
|
|
|