Punjabi to Hindi Machine Translation System

Punjabi and Hindi are closely related languages: both descend from a common origin and share many syntactic and semantic similarities. These similarities make direct translation an obvious choice for the Punjabi-Hindi language pair. The proposed system for Punjabi to Hindi translation has been implemented using various research techniques based on the Direct MT architecture and a language corpus. The accuracy of the system was found to be 90.67%.

Lexical Resources
The system uses the following resources:
Root word lexicon: A bilingual dictionary that contains each Punjabi word, its lexical category (noun, verb, adjective, etc.) and the corresponding Hindi word. It also records gender for nouns and type (transitive or intransitive) for verbs. This dictionary contains about 33,000 entries covering almost all the root words of Punjabi.
Inflectional form lexicon: It contains every inflectional form together with its root word and the corresponding Hindi word. Ambiguous words have the entry “amb” in the Hindi word field. It contains about 90,000 entries.
Ambiguous word lexicon: It contains about 1,000 entries covering all the ambiguous words with their most frequent meaning.
Bigram table: Used for resolving ambiguity, this table contains Punjabi bigrams along with their Hindi meanings. The bigrams were created from a corpus of 7 million words.
Trigram table: Like the bigram table, but it contains Punjabi trigrams used for resolving ambiguity. It was created from a corpus of 7 million words.
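
To make the role of these resources concrete, here is a minimal sketch of how a lookup could combine them. It is an illustration only and not the system's actual code: the function names, the order of the checks (trigram, then bigram, then most frequent meaning), the pass-through of unknown words, and all table entries are assumptions. The tokens are invented placeholders, not real Punjabi or Hindi words.

```python
AMBIGUOUS = "amb"  # marker stored in the Hindi field of ambiguous entries

# Toy tables. Every entry is a made-up placeholder (assumption),
# not data from the real lexicons.
inflectional_lexicon = {
    "p_boy": "h_boy",
    "p_went": "h_went",
    "p_x": AMBIGUOUS,
}
ambiguous_lexicon = {"p_x": "h_x_frequent"}  # most frequent meaning
bigram_table = {("p_x", "p_went"): "h_x_rare"}
trigram_table = {("p_boy", "p_x", "p_went"): "h_x_context"}


def lookup_ngram(tokens, i, n, table):
    """Return the meaning for any n-gram window that covers position i."""
    for start in range(i - n + 1, i + 1):
        if start >= 0 and start + n <= len(tokens):
            key = tuple(tokens[start:start + n])
            if key in table:
                return table[key]
    return None


def disambiguate(tokens, i):
    """Assumed order: trigram, then bigram, then most frequent meaning."""
    meaning = lookup_ngram(tokens, i, 3, trigram_table)
    if meaning is None:
        meaning = lookup_ngram(tokens, i, 2, bigram_table)
    if meaning is None:
        meaning = ambiguous_lexicon.get(tokens[i], tokens[i])
    return meaning


def translate(tokens):
    """Word-by-word direct translation over the toy tables."""
    output = []
    for i, word in enumerate(tokens):
        hindi = inflectional_lexicon.get(word)
        if hindi is None:
            output.append(word)  # unknown word: passed through unchanged
        elif hindi == AMBIGUOUS:
            output.append(disambiguate(tokens, i))
        else:
            output.append(hindi)
    return output
```

For example, `translate(["p_boy", "p_x", "p_went"])` resolves `p_x` through the trigram table, while `translate(["p_x"])` falls back to the most frequent meaning from the ambiguous word lexicon.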

System Architecture

For further reading about this system, see: http://dl.acm.org/citation.cfm?id=1599292

The system is freely available to use: http://www.learnpunjabi.org/p2h/default.aspx

The Punjabi To Hindi Machine Translation system has been developed by GS Josan and GS Lehal.
