Investigating Forms and Functions of Characters used in the Modern Bengali Script
312 pages
11 hours of reading
Focusing on modern Bengali script, this book offers an in-depth descriptive study based on extensive empirical data from a large corpus of written texts. It systematically explores the formal and functional aspects of Bengali orthographic symbols, including spelling and punctuation. The author analyzes the shapes, sizes, and usage patterns of characters, as well as the factors that influence their formation and change. This work serves as a valuable resource for learners, educators, researchers, and linguists involved in Bengali linguistics.
Focusing on the methodologies for annotating and processing language corpora, this book delves into challenges across various language proficiency levels. It combines theoretical background with empirical data, illuminating the complexities of corpus annotation and text processing. Additionally, it demonstrates how linguistic elements are analyzed to enhance language technology systems. This resource is particularly beneficial for researchers, educators, and students in the fields of linguistics and language technology, offering valuable insights into practical applications.
This book discusses some of the basic issues relating to corpus generation and the methods normally used to generate a corpus. Since corpus-related research goes beyond corpus generation, the book also addresses other major topics connected with the use and application of language corpora, namely, corpus readiness in the context of corpus sanitation and pre-editing of corpus texts; the application of statistical methods; and various text processing techniques. Importantly, it explores how corpora can be used as a primary or secondary resource in English language teaching, in creating dictionaries, in word sense disambiguation, in various language technologies, and in other branches of linguistics. Lastly, the book sheds light on the status quo of corpus generation in Indian languages and identifies current and future needs. Discussing various technical issues in the field in a lucid manner, providing extensive new diagrams and charts for easy comprehension, and using simplified English, the book is an ideal resource for non-native English readers. Written by academics with many years of experience teaching and researching corpus linguistics, its focus on Indian languages and on English corpora makes it applicable to graduate and postgraduate students of applied linguistics, computational linguistics and language processing in South Asia and across countries where English is spoken as a first or second language.