Huffman Coding


  • Proposed by Dr. David A. Huffman in 1952 in his paper
        “A Method for the Construction of Minimum-Redundancy Codes”

  • Applicable to many forms of data transmission
               
Algorithm for Huffman Tree
  1. Construct a set of single-node trees, one per symbol, whose root holds the symbol and its weight (frequency).
  2. Place the set of trees into a priority queue ordered by weight.
  3. While the priority queue holds more than one tree, repeat steps 4 to 6:
  4. Remove the two trees with the smallest weights.
  5. Combine them into a new binary tree whose root weight is the sum of the weights of its two children.
  6. Insert the newly created tree back into the priority queue.
  7. The single tree left in the priority queue is the Huffman tree.
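
The steps above can be sketched with the standard library's std::priority_queue. This is only an illustration, not the author's implementation: the names TreeNode, buildHuffmanTree and codeLengths are made up for this sketch, and nodes are kept in a vector and linked by index so that no manual memory management is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical node type for this sketch: children are indices into the
// node vector, -1 marks "no child" (a leaf).
struct TreeNode {
    long weight;
    int left;
    int right;
};

// Builds the tree for the given symbol weights. Leaves occupy indices
// 0..weights.size()-1 in input order; the root is the last node.
std::vector<TreeNode> buildHuffmanTree(const std::vector<long>& weights) {
    assert(!weights.empty());
    std::vector<TreeNode> nodes;
    using Entry = std::pair<long, int>;  // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;

    // Steps 1 and 2: one single-node tree per symbol, all placed in the queue.
    for (long w : weights) {
        nodes.push_back({w, -1, -1});
        pq.push({w, static_cast<int>(nodes.size()) - 1});
    }
    // Steps 3 to 6: merge the two lightest trees until one tree remains.
    while (pq.size() > 1) {
        Entry a = pq.top(); pq.pop();
        Entry b = pq.top(); pq.pop();
        nodes.push_back({a.first + b.first, a.second, b.second});
        pq.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    return nodes;
}

// Code length of each symbol = depth of its leaf in the tree.
std::vector<int> codeLengths(const std::vector<TreeNode>& nodes,
                             std::size_t leafCount) {
    std::vector<int> depth(nodes.size(), 0);
    // A parent is always created after its children, so walking the indices
    // downward visits every parent before its children.
    for (std::size_t i = nodes.size(); i-- > 0;) {
        if (nodes[i].left >= 0) {
            depth[nodes[i].left] = depth[i] + 1;
            depth[nodes[i].right] = depth[i] + 1;
        }
    }
    depth.resize(leafCount);
    return depth;
}
```

For example, the weights 45, 30, 15, 10 produce a tree of seven nodes with root weight 100 and code lengths 1, 2, 3, 3. A single-symbol input yields a one-node tree and a code length of 0, which a real encoder would have to treat as a special case.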

  • Huffman coding is a lossless technique used to compress files for storage or transmission
  • Uses statistical coding:
        the more frequently a symbol occurs, the shorter its code word
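
To see why shorter code words for frequent symbols pay off, the following sketch compares the total number of bits under a variable-length code and under a fixed-length code. The function names and the four-symbol example (frequencies 45, 30, 15, 10 with code lengths 1, 2, 3, 3) are assumptions chosen for illustration, not data from this post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Total bits when symbol i occurs freq[i] times and its code word is
// codeLen[i] bits long.
long variableLengthBits(const std::vector<long>& freq,
                        const std::vector<int>& codeLen) {
    assert(freq.size() == codeLen.size());
    long bits = 0;
    for (std::size_t i = 0; i < freq.size(); ++i) {
        bits += freq[i] * codeLen[i];
    }
    return bits;
}

// Total bits when every symbol gets the same width: the smallest number of
// bits that can distinguish freq.size() different symbols.
long fixedLengthBits(const std::vector<long>& freq) {
    int width = 0;
    while ((std::size_t{1} << width) < freq.size()) {
        ++width;
    }
    long total = 0;
    for (long f : freq) {
        total += f;
    }
    return total * width;
}
```

With the example values, the variable-length code needs 45*1 + 30*2 + 15*3 + 10*3 = 180 bits, while a fixed 2-bit code needs 100*2 = 200 bits.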


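  • Works well for text and fax transmissions
  • An application that uses several data structures: the C++ program below combines a binary tree with an array-based min-heap that serves as the priority queue
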
#include<iostream>
#include<string>
#include<iomanip>
#define MAX 26

using namespace std;

typedef struct HuffTree
{
 char sym;
 int weight;
 struct HuffTree *left;
 struct HuffTree *right;
}Node;

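// Insert p into the min-heap stored in Heap[]. On entry, Index is the
// position of the last occupied slot (-1 for an empty heap).
void Min_Heap(HuffTree *Heap[],int Index, HuffTree *p)
{
 int par;

 Index = Index+1; // first free slot; p starts here and moves up
 while(Index > 0)
 {
  par = (Index-1)/2; // parent of the current slot
  if(p->weight >= Heap[par]->weight)
  {
   Heap[Index] = p; // heap order holds, p stays here
   return;
  }
  Heap[Index] = Heap[par]; // move the heavier parent down one level
  Index = par;
 }
 Heap[0] = p; // p is the new minimum

}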

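// Remove and return the minimum-weight tree. *n is the index of the last
// occupied slot and is decremented by one.
Node* Delete_Heap(Node *hp[],int *n)
{
 Node *ptr = hp[0];   // the minimum, returned to the caller
 Node *last = hp[*n]; // last element, re-inserted from the root downward
 hp[*n] = NULL;
 *n = *n-1;
 int k = 0, left = 1, right = 2;
 while(right <= *n) // slot k has two children
 {
  if((last->weight <= hp[left]->weight) && (last->weight <= hp[right]->weight))
  {
   hp[k] = last; // last is no heavier than both children
   return(ptr);
  }
  // otherwise move the lighter child up and continue below it
  if(hp[left]->weight <= hp[right]->weight)
  {
   hp[k] = hp[left];
   k = left;
  }
  else
  {
   hp[k] = hp[right];
   k = right;
  }
  left = 2*k+1;
  right = left+1;
 }
 // slot k may still have a single (left) child
 if((left == *n) && (last->weight > hp[left]->weight))
 {
  hp[k] = hp[left];
  k = left;
 }
 hp[k] = last;

 return(ptr);
}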

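// Build the Huffman tree from a min-heap that holds n single-node trees
// and return its root.
Node* HuffmanTree(Node *hp[],int n)
{
 n--; // from here on, n is the index of the last occupied heap slot
 while(n) // more than one tree left
 {
  Node *p1 = Delete_Heap(hp,&n); // smallest weight
  Node *p2 = Delete_Heap(hp,&n); // second smallest weight
  Node *p = new Node;

  p->sym = '*'; // marker for internal nodes
  p->weight = p1->weight + p2->weight;
  p->left = p1;
  p->right = p2;

  Min_Heap(hp,n++,p); // insert the merged tree; the heap grows by one slot
 }
 return(hp[0]);
}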

// Walk the tree and print the code of every leaf: '0' for a left branch,
// '1' for a right branch. i is the depth of root, i.e. the number of bits
// currently stored in s.
void Generate_Codes(HuffTree *root,int i)
{
 static char s[MAX]; // bits on the path from the tree root to this node

 if(root->left)
 {
  s[i] = '0';
  Generate_Codes(root->left,i+1);
 }
 if(root->right)
 {
  s[i] = '1';
  Generate_Codes(root->right,i+1);
 }
 if(root->left==NULL && root->right==NULL)
 {
  cout<<" "<<root->sym<<"->";
  for(int j=0;j<i;j++)
   cout<<" "<<s[j];

  cout<<endl;
 }
}


int main()
{
 char symb[MAX+1] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','\0'};
 int freq[MAX] = {200,106,170,64,63,57,57,51,48,47,32,32,23,22,21,20,18,16,15,15,13,8,5,1,1,1};

 // Alternative data set with 11 symbols (requires MAX to be 11):
 //char symb[MAX+1] = {'E','e','r','i','y','s','n','a','l','k','.','\0'};
 //int freq[MAX] = {1,8,2,1,1,2,2,2,1,1,1};

 Node *heap[MAX];

 // Build the min-heap one leaf at a time. The second argument of Min_Heap
 // is the index of the last occupied slot, -1 while the heap is empty.
 for(int i=0;i<MAX;i++)
 {
  Node *p = new Node;
  p->sym = symb[i];
  p->weight = freq[i];
  p->left = p->right = NULL;

  Min_Heap(heap,i-1,p);
 }

 Node *root = HuffmanTree(heap,MAX);

 Generate_Codes(root,0);
 return 0;
}
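
The program prints a code table but does not go on to use it. The sketch below shows one way such a table might be applied to encode and decode text. CodeTable, encode and decode are names invented for this sketch, bits are kept as '0'/'1' characters for readability rather than packed into bytes, and the example table is hypothetical, not output of the program above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical code table type: symbol -> code word as a string of '0'/'1'.
using CodeTable = std::map<char, std::string>;

// Concatenate the code words of all symbols in text.
std::string encode(const std::string& text, const CodeTable& table) {
    std::string bits;
    for (char c : text) {
        auto it = table.find(c);
        assert(it != table.end());  // every symbol must have a code word
        bits += it->second;
    }
    return bits;
}

// Read bits one at a time and emit a symbol as soon as the bits collected
// so far match a code word. This works because no Huffman code word is a
// prefix of another one.
std::string decode(const std::string& bits, const CodeTable& table) {
    std::map<std::string, char> reverse;
    for (const auto& kv : table) {
        reverse[kv.second] = kv.first;
    }
    std::string text;
    std::string current;
    for (char b : bits) {
        current += b;
        auto it = reverse.find(current);
        if (it != reverse.end()) {
            text += it->second;
            current.clear();
        }
    }
    return text;
}
```

With the table a -> 0, b -> 10, c -> 110, d -> 111, the text "abad" encodes to 0100111, and decoding that bit string returns "abad". A production decoder would more likely walk the Huffman tree bit by bit instead of looking up strings in a map.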
