The Use of MSVM and HMM for Sentence Alignment

dc.Affiliation: October University for Modern Sciences and Arts (MSA)
dc.contributor.author: Fattah, Mohamed Abdel
dc.date.accessioned: 2020-01-01T08:23:24Z
dc.date.available: 2020-01-01T08:23:24Z
dc.date.issued: 2012-06
dc.description: Accession Number: WOS:000420351000006 (en_US)
dc.description.abstract: In this paper, two new approaches to aligning English-Arabic sentences in bilingual parallel corpora are presented, based on the Multi-Class Support Vector Machine (MSVM) and Hidden Markov Model (HMM) classifiers. A feature vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was used to train the Multi-Class Support Vector Machine and the Hidden Markov Model. Another set of data was used for testing. The MSVM and HMM approaches outperform the length-based approach. Moreover, these new approaches apply to any language pair and are quite flexible, since the feature vector may contain fewer, more, or different features than the ones used in the current research, such as a lexical matching feature or Hanzi characters in Japanese-Chinese texts. (en_US)
dc.description.uri: https://www.scimagojr.com/journalsearch.php?q=21100203301&tip=sid&clean=0
dc.identifier.doi: https://doi.org/10.3745/JIPS.2012.8.2.301
dc.identifier.issn: 1976-913X
dc.identifier.other: https://doi.org/10.3745/JIPS.2012.8.2.301
dc.identifier.uri: https://t.ly/GJk0z
dc.language.iso: en_US (en_US)
dc.publisher: KOREA INFORMATION PROCESSING SOC (en_US)
dc.relation.ispartofseries: JOURNAL OF INFORMATION PROCESSING SYSTEMS; Volume: 8 Issue: 2 Pages: 301-314
dc.subject: Hidden Markov Model (en_US)
dc.subject: Multi-Class Support Vector Machine (en_US)
dc.subject: Machine Translation (en_US)
dc.subject: Parallel Corpora (en_US)
dc.subject: English/Arabic Parallel Corpus (en_US)
dc.subject: Sentence Alignment (en_US)
dc.title: The Use of MSVM and HMM for Sentence Alignment (en_US)
dc.type: Article (en_US)
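
The abstract mentions a feature vector built from length, punctuation score, and cognate score values for each candidate sentence pair. The record does not give the paper's formulas, so the following is only a minimal Python sketch under stated assumptions: the function names, the Arabic-to-Latin punctuation mapping, and the choice of shared numbers as "cognates" are all hypothetical illustrations, not the author's definitions.

```python
import re
from collections import Counter

# Assumption: map common Arabic punctuation marks to Latin equivalents
# so that marks on both sides can be compared directly.
ARABIC_TO_LATIN = str.maketrans({"،": ",", "؛": ";", "؟": "?"})
PUNCT = set(".,;:?!")


def length_score(en: str, ar: str) -> float:
    """Ratio of the shorter to the longer sentence, in characters."""
    longer = max(len(en), len(ar))
    return min(len(en), len(ar)) / longer if longer else 0.0


def punctuation_score(en: str, ar: str) -> float:
    """Fraction of punctuation marks that can be matched across the pair."""
    en_marks = Counter(ch for ch in en if ch in PUNCT)
    ar_marks = Counter(ch for ch in ar.translate(ARABIC_TO_LATIN) if ch in PUNCT)
    matched = sum((en_marks & ar_marks).values())  # multiset intersection
    total = max(sum(en_marks.values()), sum(ar_marks.values()))
    return matched / total if total else 0.0


def cognate_score(en: str, ar: str) -> float:
    """Assumption: treat numbers appearing on both sides as cognates."""
    # int() accepts Arabic-Indic digits, so both digit systems normalize alike.
    en_nums = {int(tok) for tok in re.findall(r"\d+", en)}
    ar_nums = {int(tok) for tok in re.findall(r"\d+", ar)}
    total = max(len(en_nums), len(ar_nums))
    return len(en_nums & ar_nums) / total if total else 0.0


def feature_vector(en: str, ar: str) -> list[float]:
    """Feature vector for one candidate English-Arabic sentence pair."""
    return [length_score(en, ar), punctuation_score(en, ar), cognate_score(en, ar)]


if __name__ == "__main__":
    en = "The report was issued in 2012, was it not?"
    ar = "صدر التقرير في عام 2012، أليس كذلك؟"
    print(feature_vector(en, ar))
```

Vectors of this kind, computed for labeled sentence pairs, could then be passed to a multi-class SVM or an HMM as training input; further features such as a lexical matching score would be added as extra entries in the list.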
