Saturday, October 24, 2015

2-3 Trees and 2-3-4 Trees

2-3 Trees 


A 2-3 Tree is a data(tree) structure where every bode with children has either two children and two data elements 

2-3 Tree Structure

Properties 

  1. Every internal node is a 2-node or a 3-node 
  2. All leaves are at the same level 
  3. All data is kept in sorted order.
Operations

1. Search 
               Similar to BST, since the data elements in each node is ordered 
2. Insertion
              Works by searching for the proper location of the key and adds it there. If the node becomes a 4-node, then the split from the 2-nodes and the middle key is moved up to the parent.
3. Deletion 
            Reverse of insertion


2-3-4 trees 

2-3-4 trees are self balancing data structure. The time complexity for the trees involve O(log(n))

Properties: 
  • Every node is a 2-, 3- or a 4- node and hold 1,2,or 3 data elements respectuvely 
  • All leaves are kept at same depth 
  • All data is kept in sorted order.
Operations and working functions are similar to 2-3 Trees as explained above.


Blogs to refer: 1,2 

AVL Trees

AVL Tree is a self balancing Binary Search tree named after Soviet inventors George Adelson Velsky and Evangi Landis.

A height balanced BST is known as Avl which can have at most 1 difference with children.Height at leaves usually start with 0 but for AVL trees we do it from 1(for the ease of understanding).Here the difference between the left and right subtrees cannot be more than one for all nodes.Operations like Lookup, insertion and deleting all take  O(log(n)) time in both the average and worst case.

Operations:

Searching: Process is similar to search in ordinary Binary search tree.

Insertion: To make sure given tree remains AVL after every insertion. 

So we must rebalance, without violating the BST rule. The Insert operation involves two rotation 

  1. Left Rotation 
  2. Right Rotation
Steps to follow for Insertion:
  1. Perform standard BST insert
  2. Starting from w, find first unbalanced node
  3. Rebalance the tree in the following ways accordingly 
    1. Left Left case (Right Rotation)
    2. Left right case (Left and right rotation)
    3. Right Right case (Left Rotate)
    4. Right left case (Right and Left Rotation)



AVL Tree Rebalancing


Algorithm




Deletion: Deletion follows the same procedure as Insertion except in place of insert we delete.


Blogs to follow: 1, 2, 3

Sunday, October 18, 2015

Binary Search Trees (BST)

Binary Search Tree is a node based binary tree data structure which has the following properties:

  • The left subtree of a node contains only nodes with key's less than the node's key
  • The right subtree of a node contains only nodes with keys greater than node's key
  • The left and right subtree each must also be a binary search tree
  • The tree must not have no duplicate nodes
The above properties provide an ordering among keys such that operations like 
  1. Search
  2. Min 
  3. Max 
can be done fast enough.The tree actions are 
  1. Search 
  2. Insert 
  3. Delete
There are many types of binary trees. Few among them are self balancing trees like 
  1. AVL Trees
  2. Red Black Trees
  3. Splay Trees
All operations take time directly proportional to the height of the tree. Complexity at worst crosses O(h).

Optimal Binary Search Tree

If we do not plan to modify a search tree and if we know how exactly often each item will be accessed.We can construct an optimal binary search tree (with an average cost of looking up an item).

Wednesday, October 7, 2015

Trees: An Introduction

Trees are very flexible, versatile and powerful non-liner data structure that can be
used to represent data items possessing hierarchical relationship between the grand father
and his children and grand children as so on.


The topmost node is called root of the tree. The elements that are directly under an element are called its children. The element directly above something is called its parent. For example, a is a child of f and f is the parent of a. Finally, elements with no children are called leaves.

Advantages of Trees

  1. Trees can serve as a hierarchy 
  2. Trees provide moderate access/search
  3. Trees provide Moderate/insertion Deletion
  4. Have no upper limit 

Main applications of trees include:

  1. Manipulate hierarchical data.
  2. Make information easy to search .
  3. Manipulate sorted lists of data.
  4. Router algorithms
  5.  Form of a multi-stage decision-making.

A Tree Classification





Binary Tree

A tree whose elements have at most 2 children is called a binary tree. Since each element in a binary tree can have only 2 children, we typically name them the left and right child.

It can be represented as:  
  1. Data 
  2. Left pointer or Child
  3. Right pointer or Child


Properties of a Binary tree


1) The maximum number of nodes at level ‘l’ of a binary tree is 2^(L-1).
Here level is number of nodes on path from root to the node (including root and node). Level of root is 1. This can be proved by induction.For root, L = 1, number of nodes = 2^(L-1) = 1.Assume that maximum number of nodes on level l is 2^(L-1).Since in Binary tree every node has at most 2 children, next level would have twice nodes, i.e. 2 * 2^(L-1)


2) Maximum number of nodes in a binary tree of height ‘h’ is 2^h – 1.
Here height of a tree is maximum number of nodes on root to leaf path. Height of a leaf node is considered as 1.
This result can be derived from point 2 above. A tree has maximum nodes if all levels have maximum nodes. So maximum number of nodes in a binary tree of height h is 1 + 2 + 4 + .. + 2^h-1. This is a simple geometric series with h terms and sum of this series is 2^h – 1.


3) In a Binary Tree with N nodes, minimum possible height or minimum number of levels is  ⌈Log(base2)(N+1)].
This can be directly derived from point 2 above. If we consider the convention where height of a leaf node is considered as 0, then above formula for minimum possible height becomes   
⌈ Log(base2)(N+1)⌉ – 1


4) A Binary Tree with L leaves has at least   ⌈ Log2L ⌉ + 1   levels
A Binary tree has maximum number of leaves when all levels are fully filled. Let all leaves be at level l, then below is true for number of leaves L.
   L   <=  2l-1  [From Point 1]
   Log2L <=  l-1
   l >=   ⌈ Log2L ⌉ + 1  


5) In Binary tree, number of leaf nodes is always one more than nodes with two children.
   L = T + 1
Where L = Number of leaf nodes
      T = Number of internal nodes with two children

Types of Binary trees:


  1. Full Binary Tree ; A full binary tree is a binary tree which has exactly has either 2 children or  no children. (No. of leaf nodes=No, of internal nodes+1)
  2. Complete Binary Tree: A complete Binary tree is a binary tree excepting the bottom which is filled from left to right.Its the best binary tree as it provides the best possible ration between the number of nodes and the height. O(log N) height h and nodes N. 2^(h+1)-1.
  3. Strictly Binary tree: A strictly binary tree is a binary tree which has 2N -1 nodes i.e., no children or two children where N stands for degree.
  4. Perfect Binary Tree:A perfect binary tree is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level.They are ambiguously also known as complete or full binary tree.
  5. Balanced binary Tree:A balanced binary tree has the minimum possible maximum height (a.k.a. depth) for the leaf nodes, because for any given number of leaf nodes the leaf nodes are placed at the greatest height possible.

Tree Traversal 

Unlike other linear data structures, trees can be traversed in different ways.The ways to traverse trees are : 
  1. Depth First Search 
    1. Inorder Traversal     (Lc R Rc) //Where Lc=Left Child Rc= Right Child R=Root
    2. Preorder Traversal   (R Lc Rc) //                   "
    3. Postorder Traversal (Lc Rc R) //                   " 
  2. Breadth first Search

More tree explanations shall be done with programs.

If you are given two traversal sequences, can you construct the binary tree?
It depends on what traversals are given. If one of the traversal methods is Inorder then the tree can be constructed, otherwise not.
Therefore, following combination can uniquely identify a tree.
Inorder and Preorder.
Inorder and Postorder.
Inorder and Level-order.

And following do not.

Postorder and Preorder.
Preorder and Level-order.
Postorder and Level-order.
For example, Preorder, Level-order and Postorder traversals are same for the trees given in above diagram.
Preorder Traversal = AB
Postorder Traversal = BA
Level-Order Traversal = AB
So, even if three of them (Pre, Post and Level) are given, the tree can not be constructed.

Monday, October 5, 2015

Queue

Queue is a linear data structure which follows a FIFO(First In First Out) order.Classic example of queue can be a line in front of a bank cashier to deposit or withdraw money.

The primary difference between a stack and queue is that Stack is (Last In First Out) piling of data. Whereas as Queue is  First In First Out.

Queue data structure has two pointers.One being the front and other pointing the rear.

Operations of Queue:
  1. Enqueue
  2. Dequeue





Queue can be implemented using :
  1. Array 
  2. Linked list
Both have their own advantages or disadvantages.


Applications of Queue:
  1. CPU scheduling & Disk Scheduling
  2. IO buffers or asynchronous transfer 
More Links to follow.

Tuesday, September 29, 2015

Circularly Linked List

Circularly linked list is a linked list where all nodes are connected to form a circle.There is no end or rear or null pointer.



Circularly Linked List

Circular linked list can be implemented in both singly linked list as well as doubly linked list.
The only difference between implementation of a circularly linked list and other linked list is that, Circularly linked list doesn't have a last element instead points to the head.Whereas other linked list points to null in place of the head.

To traverse through all the elements in the circularly linked list, we need to use two loops 
  1. One which checks the condition that the head reference is not equal to null
  2. Other loop which checks if the loop doesn't repeat. i.e temp variable doesn't equal to head.

Function to push an element into the Circularly linked list 





Source: Geeksforgeeks
Also Refer to Tortoise-Hare Algorithm

Doubly Linked list

A doubly linked list contains an extra pointer which points to the previous element of a list and which does not exist in a singly linked list.This makes easy traversal of a linked list to the previous elements.


Algorithm for insertion 

of an element in a Doubly Linked List . Insertion can be classified into 3 different methods.
  1. Insertion at the Head
  2. Insertion at a given postion 
  3. Insertion at the end of the Linked List 

Insertion at the head



Insertion after a given position


Insertion of a node towards the end



Algorithm for deletion of a node



Algorithm for reversing a Doubly Linked List


Source: GeeksforGeeks

Advantages over Singly Linked list

  1. Can be traversed front and back from any location
  2. Less ambiguity in deleting an element from the linked list.

Disadvantages over a Singly Linked list

  1. Every Node takes in an extra pointer when a linked list could be created with a single pointer.
  2. Extra maintenance.
Important Links to Follow
Doubly Linked List(Geeksforgeeks)
Code for doubly Linked List

Thursday, September 24, 2015

Tortoise & Hare Algorithm

Tortoise & Hare algorithm is also known as Floyd's Cycle-Finding Algorithm.This is a technique for detecting infinite loops inside a linked list, symlinks, state machines etc.

To detect a loop inside a linked list
Source
  • Make a pointer to start of the linked list and call it Hare 
  • Make a pointer to the head of the linked list and call it Tortoise 
  • advance the Hare by two and the tortoise by one for every interation.
  • There's a loop if the Tortoise and Hare meet
  • There's no loop if the Hare reaches the end.
Algorithm is intersting as it takes O(n) time and O(1) space.

Python code to understand Tortoise and Hare algorithms
Source




With this Algorithm we detect the loops and iterations in a circularly linked list as suggested by the name Floyd's Cycle-finding Algorithm.

More Reference: Coding Freak

Sunday, September 20, 2015

Stack

Stack is a linear data structure which follows a particular order in which the operations are performed. Order is LIFO(Last In First Out) or FILO(First In Last Out)

Three main operations of a stack are :

  1. Push
  2. Pop
  3. Peek or Top
Stack


Push
The Push Operation pushes an element into the stack.

Pop
Pop operation removes the top most element in the stack.

Peek
Peek or top operation returns the top most value of the stack.


To understand stack, you can assume a pile of books placed one over the other.To get the last book in the stack, one needs to take away all the books before the last book.


Applications of Stack
  1. Recursion
  2. Tower of Hanoi
  3. Converting Infix to Postfix.


Stack can be implemented in two different ways: 
  1. Stack Implementation using Arrays 
  2. Stack Implementation using Linked list
The implemented codes are carried out into separate blogs in the above given links.

More stack related blogs


Stack Implementation Using Linked list

Implementation of a stack using linked list is dynamic and flexible. Yet the space consumed is high and adding or manipulating the linked list is complex.
Refer to the code for more details. The Code language is C++.

Code:



Output:




Stack Implementation Using Array

This blog will deal with implementation of stack using arrays.The implementation is simple yet we need to allocate a pre-defined size.The allocated size may or may not be utilized completely. This makes the array implementation non-flexible.

Using Array





Output





Linked List v/s Arrays

Linked list is a replacement to the arrays. Since arrays take in a block of memory so the flexibility is very less.It has a defined structure which uses a limited memory and uses only when the memory is required. Though adding more elements is a cumbersome task.

Each node in a list consists of at least two parts:
  1. Data
  2. Pointer to the next node





A tabular comparison of Arrays v/s Linked List:

S. No
Parameter
Arrays
Linked list
1
Cost
Constant time O(1)
O(n)
2
Memory usage
Fixed Size
Contiguous Blocks of memory
No unused memory,
Extra memory for pointer variables
3
Cost of inserting and Deleting elements
At the beginning
O(n)
Shifting one index at a time to the right
Creating a new node and head pointer
O(1)
At the end
If Array not full O(1) else O(n)
Traversing a new node & End
At the nth position
O(n)
O(n)
4
Ease of Use
Very Easy
Complicated especially in C/C++

Advantages over arrays

  1. Dynamic size
  2. Ease of insertion/deletion

Drawbacks:
  1. Random access is not allowed. We have to access elements sequentially starting from the first node. So we cannot do binary search with linked lists.
  2. Extra memory space for a pointer is required with each element of the list.

Linked List

linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of data and a reference  to the next node in the sequence.

                                   

Types of Linked List: 

Based on the working of a data structure, we list only a few important linked list models.
  1. Singly Linked list
  2. Doubly Linked list 
  3. Circular Linked list

Others forms of linked lists include multiply linked list, sentinal nodes etc.

Advantages

  • Linked lists are a dynamic data structure, allocating the needed memory while the program is running.
  • Insertion and deletion node operations are easily implemented in a linked list.
  • Linear data structures such as stacks and queues are easily executed with a linked list.
  • They can reduce access time and may expand in real time without memory overhead.

Disadvantages


  • They have a tendency to use more memory due to pointers requiring extra storage space.
  • Nodes in a linked list must be read in order from the beginning as linked lists are inherently sequential access.
  • Nodes are stored incontiguously, greatly increasing the time required to access individual elements within the list.
  • Difficulties arise in linked lists when it comes to reverse traversing. For instance, singly linked lists are cumbersome to navigate backwards and while doubly linked lists are somewhat easier to read, memory is wasted in allocating space for a back pointer.

Abstract Data Types(ADT)

An abstract data type (ADT) is a mathematical model for data types where a data type is defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations.This contrasts with data structures, which are concrete representations of data, and are the point of view of an implementer, not a user.

ADT is a theoretical concept in computer science, used in the design and analysis of algorithms, data structures, and software systems, and do not correspond to specific features of any computer languages.

In ADTs,inferences are derived using a mathematical or a logical module.It is very much equivalent to writing a set of points understandable to a user.Lets assume a list with user input data which can be implemented in two ways:
  1. Using Arrays
  2. Using Linked Lists
Both have their own set of advantages and disadvantages,But it is made sure that advantages are used to make our understanding ends meet and the idea to implement looks concrete.

Each data structure has its limitations. One such data structure is a Linked list which is followed in the next chapter.