Approaches of POS Tagging Algorithm for Bangla Corpus

Paper Details

Subject:	Computer Science and Engineering
Paper ID:	UIJRTV1I70005
Volume:	01
Issue:	07
Pages:	29-35
Date:	May 2020
ISSN:	2582-6832

Cite this

Sultana, M. and Balazon, F.G., 2020. Approaches of POS Tagging Algorithm for Bangla Corpus. United International Journal for Research & Technology (UIJRT), 1(7), pp.29-35.

Abstract

Part-of-speech (POS) tagging is the process of classifying the words of a natural-language sentence into lexical categories and labeling them accordingly, so that each word can be identified as a noun, verb, adjective, and so on. POS tagging is an essential part of building lemmatizers, which reduce a word to its root form in natural language processing. It is also an initial stage in NLP applications such as text analysis, machine translation, information retrieval, and text-to-speech synthesis. Various approaches have been proposed for implementing POS taggers. In this paper, trigram and HMM methods are used to develop a tagger following the general statistical approach. The paper gives a clear account of these algorithms and presents a tag set, used with an Indian corpus, for tagging Bangla text in order to measure the accuracy of the tagger's output. It also presents the various developments in POS taggers and POS tag sets for the Bangla language, which are important computational linguistic tools needed for natural language processing (NLP).

Keywords: Tag-set, Ambiguity, Trigram, HMM, NLP, Token, Corpus, Bangla Language.