A Spanning tree is a tree that includes every vertex in the graph. A spanning tree is a special form of a tree that
Every spanning tree connects n vertices with (n-1) number of edges
A spanning tree can be made by using the edges during DFS and BFS. For example :
depth_first_search(v) :
Mark v as visited;
for (all u ∈ (vertices connected to v) ) do
if(u is not visited yet)
then show thant edge (v,u) is a part of the spanning tree
depth_first_search(u);
Spanning tree application : Used in construction of communication network (normally when the least number of edges are required)
MST is a tree that considers the weight of each edge. It is a spanning tree that uses the edges that have the smalles total weight
Normally Kruskal's MST algorithm and Prim's algorithm is most famous to get the MST. Conditions for MST :
Kruskal's algorithm mainly uses the greedy method (one of the algorithm construction techniques) : Selecting the best option in current status to reach the final answer
However it is never confirmed that, selecting the best option in that situation leads to optimum solution. Therefore you have to assess if the answer is actually the optimum solution for greedy method. For Kruskal, it has been proved that it always gives the optimum solution.
Algorithm :
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If cycle is not formed, include this edge. Else, discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.
Pseudo code :
//E : set of edges, G : graph structure
//Et : Set of edges in the MST
kruskal(G) :
E is ordered in decreasing order such that w(e1)<=...w(e2)
Et = NULL, int ecounter = 0;
int k=0;
while(ecounter < (n-1))
k = k+1;
if Et U {e(k)} does not form a cycle
then include the e(k) to Et set
return Et
An important part of the algorithm is to check whether the edge considered will form a cycle with the edges in the set. Therefore Unifon Find algorithm is used inside the Kruskal's algorithm
Main concept
Cycle formed : The 2 vertices of the connecting edge both come from the same set of vertices
Cycle X formed : The 2 vertices of the connecting edge come from different set of vertices
We use two additional helping functions for the algorithm : union(x,y) and find(x)
union(x,y) : Input the 2 sets that element x and y are associated in, and combine the 2 sets into 1 set
find(x) : Return the set that element x is a memeber of
Implementation of algorithm
To track the sets of each element, 1D array is used. In 1D array, the index of each element's parent node is stored. If elements have equal parent node ID, it means that the elements are in same set

For the following example,
1) When initially there is no link all value of array is initialized to -1 (->means that there is only 1 item in every subset)
0 1 2
-1 -1 -1
2) Then edge 0-1 is formed : Find the subset in which vertices 0 and 1 are. Since they are in different subsets, we unify them.
For the union, we can take either 0 or 1 as the parent node and it does not matter (in this case assume 0 is parent node)
0 1 2
-1 0 -1
The index of the parent node(0) is stored in the element 1 position. Since 0 is the parent node, its value in array is -1
Code :
//Array to store the parent index of each vertex
int parent[MAX_VERTICES];
//Function that returns the representative node (root node) of the set that curr is a member of
int set_find(int curr) {
if(parent[curr] == -1) //The following node is the root node
return curr;
//If current node is not parent, moves to the parent node of the current node
//Repeat moving up to the parent node till it reaches the root node
while(parrent[curr] != -1) curr = parent[curr];
return curr;
}
//Combine the to sets together
void set_union(int a, int b) {
int root1 = set_find(a); //Find root node of a
int root2 = set_find(b); //Find root node of b
//The set where root 1 was the root node now has root 2 as the root node
//The root 1 now points root 2 to be its parent node
if(root1!=root2)
parent[root1] = root2;
}
In Kruskal's algorithm we would have to sort the edges and therefore Graph is represented as a set of edges only. (GraphType only storees edges)
//Structure to represent a weighted edge in graph
struct Edge {
int start, end, weight;
};
//Structure to represent the graph
struct GraphType {
int n; //Number of edges
struct Edge *edge; //The graph is represented as an array of edges
}
//Creates the graph with number of edges initialized
//Use dynamic memory allocation to efficiently use memory
struct GraphType* createGraph(int E)
{
struct GraphType* graph = (struct GraphType*)(malloc(sizeof(struct Graph)));
graph->n = E;
//Create an array with size of the number of edges input
graph->edge = (struct Edge*)malloc(sizeof(struct Edge)*E);
return graph;
}
// Compare two edges according to their weights.
// Used in qsort() for sorting an array of edges
int myComp(const void* a, const void* b)
{
struct Edge* a1 = (struct Edge*)a;
struct Edge* b1 = (struct Edge*)b;
return a1->weight > b1->weight;
}
//Main function to construct MST using Kruskal's algorithm
//Assume the graph is all set with all of the nodes connected to each other
void KrsukalMST(struct GraphType *g) {
int edge_accepted = 0; //Number of edges inside the MST
int uset, vset; //Root node for set u and v
struct Edge e;
int i=0; //Index variable used for sorted edges
//Step 1 : Sort all the edges in non-decreasing order of their weight
qsort(g->edges, g->n, sizeof(struct Edge), compare);
while(edge_accepted < (g->n - 1)) { //Repeat till num of edges < (n-1)
//Step 2 : Pick the smallest edge and increment the index for next iteration
e = g->edges[i++];
//Find the set(root node) that each vertex of edge is associated to
//The root node index is stored in each variable
uset = set_find(e.start);
vset = set_find(e.end);
//If they both are from different set, unify the two into 1 uniton
if(uset != vset) {
printf("Edge (%d, %d) selected", e.start, e.end);
edge_accepted++;
//Unify the two sets together
set_union(uset,vset);
}
}
}
Time complexity for the following algorithm depends on the time to sort the edges :O(ElogE)
Prim's algorithm basically starts from a node and expands the set of nodes consecutively. At the start, only 1 node is included in the set.
Prim's algorithm selects the closest node that is directly linked to the nodes in the set, and adds it to the set. This process is repeated till the tree has (n-1) edges.
Comparison with Kruskal's algorithm :
Basic algorithm for Prim's MST algorithm :
1. Create a set that keeps track of vertices already included in MST
2. Asign a key value to all vertices in the input graph. Initially all key values are set to INFINITE, except for the first vertex (key value:0) so that it is picked first
3. While the set doesn't include all vertices repeat :
The key values represent the minimum weight edges connecting them to the set of vertices included in MST
Code :
int distance[MAX_VERTICES];
int mstSet[MAX_VERTICES];
int parent[MAX_VERTICES];
typedef struct GraphType {
int n;
int weight[MAX_VERTICES][MAX_VERTICES];
} GraphType;
//Return the index of vertex with the smalles dist[v]
int get_min_vertex(int n) {
//Initialize min value
int min = INT_MAX;
int min_index
for (int v=0; v<n; v++) {
//Compare the distance of selected vertex with all the other vertices that are not in the set.
//Find vertex with smallest distance and return position index
if(mstSet[v]==false && key[v] < min) {
min = key[v], min_index = v;
}
}
return min_index;
}
//Function to construct MST graph
void primMST (GraphyType *g) {
//Initialize all the distance to INFINITE
//Initialize set so that there is no vertex included initially
for (int i=0; i<g->n; i++) {
distance[i] = INT_MAX;
mstSet[i] = false;
}
//1st node in initially included in the set
//The vertex with position index of 0 is added
distance[0] = 0;
parent[0]=-1; //First node is always root of MST
for(int i=0; i<g->n; i++) {
//Pick the minimum distance vertex from the set of vertices not in the MST
int y = get_min_vertex(g->n);
//Add the picked vertex to the MST set
mstSet[u] = true;
//Update the distance value and parent index of the adjacent vertices of the vertices not included in MST
for(int v=0; v<g->n; v++) {
//graph[u][v] is non zero only for adjacent vertices of u
//mstSet[v] is false for vertices not included in MST
//distance[v] represents the distance between v and the node that was originally closest to v that is included in MST
//Update distance[v] only if the newly added u node has a closer distance to v
if(g->weight[u][v] && g->weight[u][v] < distance[v]) {
parent[v] = u;
distance[v] = g->weight[u][v];
}
}
}
}
The main iterative function in Prim's has a double loop, therefore has a time complexity of O(n2)
Therefore Kruskals algorithm : Appropriate for sparse graph,
Prims algorithm : Appropriate for dense graph
The shortest path question is to find the shortest path that connects two vertices i and j. Unlike MST, it focuses on finding the path therefore does not require all vertices to be included in the path. There are 2 algorithms to find the shortest path.
Objective : Finds the shortest path's distance from one ceratin vertext to every other vertices.
The algorithm creates the tree of the shortest paths from the starting source vertex from all other points in the graph. We first start with an empty set S={}. Starting from the source node, nodes are added in order based on the distance between node and source node.
distance of source vertex as 0 and all ogher vertices as infinitydistance, then replace it otherwise ignoredistance value and go back to step 4distance[]distance[] :distance[], it is replaceddistance[w] = min(distance[w], distance[u]+weight[u][w])When the dikstra's algorithm is done the distance[] would store the minimum value of the distance between the source vertex to each vertices. The path is not directly stored therefore an additional array is required to store the path
We create a sepreate array parent[] to store the parent index to each element. Initially the array is blank, and the source node has a parent[] value of -1 to represent it as the ultimate source node.
but everytime the distance[] value is updated for a certain vertex, the following parent[] value is also upated to point to the new node added to the set to be the parent.
To print the path, we use recursive function till the parent[] value reaches -1.
void printPath(int parent[], int j)
{
// Base Case : If j is source
if (parent[j]==-1)
return;
printPath(parent, parent[j]);
printf("%d ", j);
}
//Return the vertex with minimum distance value within vertices not yet processed(visited)
int minDistance(int distance[], int found[]) {
int min = INT_MAX;
int min_index = -1;
for(int i=0; i<g->n; i++) {
if(distance[i] < min && !found[i]) {
min = distance[i];
min_index = i;
}
}
return min_index;
}
// Function that implements Dijkstra's single source shortest path algorithm
void dijkstra(GraphType *g, int src)
{
// The output array. dist[i]
// Will hold the shortesㅅ distance from src to i
int distance[MAX_VERTICES];
// Represent if vertex was been visited
bool found[MAX_VERTICES];
// Parent array to store parent index
int parent[MAX_VERTICES];
// Initialize all distance and found array values
for (int i = 0; i < V; i++)
{
parent[0] = -1;
distance[i] = INT_MAX;
found[i] = false;
}
// Distance of source vertex from itself is always 0
distance[src] = 0;
// Find shortest path for all vertices
for (int count = 0; count < g->n-1 ; count++)
{
// Pick the minimum distance vertex from the set of vertices not yet processed.
// u is always equal to src in first iteration.
int u = minDistance(distance, found);
// Mark the picked vertex as processed
found[u] = true;
// Update distance value of the adjacent vertices of the picked vertex.
for (int v = 0; v < V; v++)
// Update distance[v] only if is
// not in found and
// There is an edge from u to v, and total weight of path from src to v through u is smaller than current value of dist[v]
if (!found[v] && graph[u][v] && distance[u] + graph[u][v] < distance[v])
{
parent[v] = u;
distance[v] = distance[u] + graph[u][v];
}
}
// print the constructed distance array
printSolution(dist, V, parent);
}
Objective : Unlike Dikstra's algorithm, floyd's algorithm finds the shortest path between all of the vertices to each other. A 2x2 array with all of the shortest distance between all vertices are stored.
The algorithm uses 3 nested loops. Initially, the solution matrix is initiailized to the graph itself. The main idea is to pick a certain vertex and update all shortest path between the original one and the one that includes the picked vertex as the intermediate vertex.
Ak[i][j] represents the minimum distance between vertex i to j by using 0~k as the intermediate vertex. The key feature of Floyd's algorithm is to calculate the distance in following order :
A-1, A0, A1, ... , An-1
When calculating the Ak, we consider that we have already considered cases on when the vertex 0~k-1 is an intermeidate vertex and have found the optimal solution.
For every pair (i,j) of the source and destination vertices respectively, there are 2 possible cases
1) When k(the intermediate vertex) is not included in the path from i to j : Min distance remains to be Ak-1[i][j]
2) When k(the intermediate vertex) is included in path from i to j : Distance could be the result of (Ak-1[i][k] + Ak-1[k][j])
(Addition of min distance between i-k and j-k)
=> Therefore conclusively, the min distance would be the smalles one between the result of (1) and (2)
Note : Since if there is no link between i-j and k, it is represented by infinity(very large number), there is no error.
Following matrix shows the shortest distances between every pair of vertices
0 5 8 9
INF 0 3 4
INF INF 0 1
INF INF INF 0
It shows every shortest distance between any two vertices in the graph.
void floyd(GraphType *g) {
//Initialize the answer 2x2 array
for(int i=0; i<g->n; i++) {
for(int j=0; j<g->n; j++) {
A[i][j] = g->weight[i][j];
}
}
//Find the min distance
for(int k=0; k<g->n; k++) {
for(int i=0; i<g->n; i++) {
for(int k=0; k<g->n; k++) {
if(A[i][k] + A[k][j] < A[i][j])
A[i][j] = A[i][k] + A[k][j];
}
}
}
}
Algorithm to decide the order of tasks that has a specific hierarchy. Some tasks should always come infront of other certain tasks due to relationship.
Ex) The process of university tasks are shown

You would not be able to pass the test until you become a senior. However passing the test and signing in the department website is independent and therefore the order does not matter.
A topology sort allgins every task(represented as nodes) in the graph, without violating the precedence order of each task.
For a single graph, there can be several topology sorts as there are tasks that have same hierarchy.
Note that if a graph has a cycle, Topology sort is impossible and would cause an error.
1) Select a node that has no entering edges
2) Delete the following node and all of the links connected to the node
3) Repeate the process till all of the nodes have been deleted
topo_sort(G)
for(i<-0 to n-1)
if(Every node has a preceding node)
then cycle exists and error;
Select a node v that does not have a preceiding node
print v
delete v and every link that is connected to v
in_degree[] to record the number of edges pointing to the node, for every nodein_degree[i] == 0 it becomes the next candidate to be deletedin_degree[] value decrease by 1in_degree==0)in_degree[] value decreases to 0, it is added to the stackint todo_sort(GraphType *g) {
int i;
StackType s;
GraphNode *node;
//Calculate the number of edges entering the vertex for every vertex
//Dynamically allocate an array to store number of entering edges
int *in_degree = (int*)malloc(g->n * sizeof(int));
for(int i=0; i< g->n; i++) {
in_degree[i] = 0;
}
for(int i=0; i<g->n; i++) {
//In each list, the nodes pointed by the head node of each list exists
GraphNode *node = g->adj_list[i];
while(node!=NULL) {
//Get position index of the pointed vertex and increment count by 1
in_degree[node->vertex]++;
node=node->link;
}
}
//Add the node in which in_degree is 0 to stack
init(&s); //Initialize stack
for(int i=0; i<g->n; i++) {
//Push the candidate vertices into the stack
if(in_degree[i]==0) push(&s, i);
}
//Generate topology sort
while(!is_empty(&s)) {
int w;
//w stores entering node's position index
w = pop(&s);
printf("Vertex %d ->", w);
//Get the head node of the list
//Decrease the number of entering index
while(node != NULL) {
int u = node->vertex;
in_degree[u]--;
if(in_degree[u] == 0) push(&s, u);
node=node->link;
}
}
free(in_degree);
}