Book, English, 220 pages, Format (W × H): 161 mm × 240 mm, Weight: 499 g
ISBN: 978-0-367-56199-4
Publisher: Chapman and Hall/CRC
Machine Translation and Transliteration involving Related, Low-resource Languages discusses an important aspect of natural language processing that has received less attention: translation and transliteration involving related languages in a low-resource setting. This is a very relevant real-world scenario for people living in neighbouring states/provinces/countries who speak similar languages and need to communicate with each other, but for whom the training data needed to build supporting MT systems is limited. The book discusses different characteristics of related languages with rich examples and draws connections between two problems: translation for related languages and transliteration. It shows how linguistic similarities can be utilized to learn MT systems for related languages with limited data. It comprehensively discusses the use of subword-level models and multilinguality to utilize these linguistic similarities. The second part of the book explores methods for machine transliteration involving related languages based on multilingual and unsupervised approaches. Through extensive experiments over a wide variety of languages, the efficacy of these methods is established.
Features
- Novel methods for machine translation and transliteration between related languages, supported with experiments on a wide variety of languages.
- An overview of past literature on machine translation for related languages.
- A case study on machine translation between 10 major related languages of India, one of the most linguistically diverse countries in the world.
The book presents important concepts and methods for machine translation involving related languages. In general, it serves as a good reference to NLP for related languages. It is intended for students, researchers and professionals interested in Machine Translation, Translation Studies, Multilingual Computing Machine and Natural Language Processing. It can be used as reference reading for courses in NLP and machine translation.
Anoop Kunchukuttan is a Senior Applied Researcher at Microsoft India. His research spans various areas of multilingual and low-resource NLP. Pushpak Bhattacharyya is a Professor at the Department of Computer Science, IIT Bombay. His research areas are Natural Language Processing, Machine Learning and AI (NLP-ML-AI). Prof. Bhattacharyya has published more than 350 research papers in various areas of NLP.
Target audience
Academic, Postgraduate, Professional, and Undergraduate Advanced
Authors/Editors
Subject areas
- Mathematics | Computer Science > Computing | Computer Science > Computer Science > Theoretical Computer Science
- Mathematics | Computer Science > Computing | Computer Science > Computing & Computer Science, General
- Economics > Business Administration > Business Mathematics and Statistics
- Mathematics | Computer Science > Computing | Computer Science > Computer Science > Data / Databases > Data Mining
- Mathematics | Computer Science > Computing | Computer Science > Computer Science > Artificial Intelligence
- Mathematics | Computer Science > Mathematics > Algebra > Number Theory
- Mathematics | Computer Science > Computing | Computer Science > Programming | Software Development > Game Programming, Rendering, Animation
- Economics > Political Economy > Political Economy, General > Economic Statistics, Demography
Further Information & Material
Preface. Introduction. Past Work on MT for Related Languages. I Machine Translation. Utilizing Lexical Similarity by using Subword Translation Units. Improving Subword-level Translation Quality. Subword-level Pivot-based SMT. A Case Study on Indic Language Translation. II Machine Transliteration. Utilizing Orthographic Similarity for Unsupervised Transliteration. Multilingual Neural Transliteration. Conclusion and Future Directions. Appendices. A Extended ITRANS Romanization Scheme. B Software and Data Resources. C Conferences/Workshops for Translation between Related Languages. Bibliography.