Distributed Systems Election Algorithms Explained
Distributed Systems Useful Books
Election Algorithms in Distributed Systems Explained
Here is some basic, information regarding election algorithms in distrubuted systems and distributed computing.
Election algorithms are a fundamental part of distributed systems and used in all distributed systems concepts and designs.
They are used in replicated services where co-ordination is necessary and when continuation of functionality is vital (even if failures oocur). They are also used for self-organising when 'actors' need to know their status relative to the other 'actors'
Election Algorithms often have only two states.
- By default processing nodes are set to be slaves
- Only a single processing node can be the elected master
The requirements of election algorithms
These apply to all election algorithms that have been designed for distributed systems in distributed computing.
They are Scalability, Robustness, Low as possible latency, Efficiency and Fault Tolerance. An election algorithm in distributed systems should also find a replacement master extremely quickly if the present master is removed. The time that a single master node is the master should be very high.
In addition there must not be more than one master, if this does occur it must be rectifed quickly. Also, master node elections must not happen unless the master has actually failed.
Now onto the different election algorithms themselves....
The bully election algorithm
- With the bully election algorithm each node is allocated a numerical unique identifier. This is because during elections of a new master the node with the higest number will become the new master.
- When the master fails, to elect a new master node, the other slaves communicate in 'rounds'
- Slaves are eliminated from the election if they receie a message from another node with a higher unique identifier.
- Therefore, with the bully algoirithm the slave with the highest unique indentifier will then become the master node.
The ring election algorithm (fairly similar to the bully algorithm)
- In ring election algorithm the nodes (the slaves and the master) are arranged in a logical ring in which they can only communicate with their logical neighbours.
- If a master fails or removed only the direct neighbours of the master will realise.
- The slave nodes then pass messages around the ring of slaves to see which one has the highest unique identifier (the same as the bully algorithm) and then it becomes the new master.
The pre-election of a master/leader (The leader election algorithm)
- With the leader election algorithm a backup master is chosen while the distributed system is functioning.
- If the master fails or is removed the backup master takes over once it has detected that this is the case.
- After this has happened a new backup will need to be chosen for the new master.
Examples of distributed systems in a real world environment
Probably the best real world example of a distrbuted system is a colony/group of ants. They are made up of a large numbers of entities that carry out basic and simple behaviours and only have limited local knowledge of what is happening around them. They also have a master, if this is removed or fails (dies) then a new master needs to be found.
However, each ants basic behaviour makes a big difference when viewed collectively. In addition this example of distributed 'operating' system is scalable, efficient and robust. <-- The requirements of an election algorithm in a distributed system.
I realise some of this may be slighlty confusing and new to some people not familiar with different election algorithms and distributed systems, so if you have any questions, don't hesitate to ask or comment.