Unicode, UTF-8 and multilingual text: An introduction



This article introduces several OpenType and Unicode-related topics. It starts with a discussion of what is meant by a “character”, then moves on to scripts and languages, Unicode encoding and UTF-8, along with an example of working with a multilingual text file containing English and Arabic text. The objective is to introduce some key terms and piece together a basic framework showing how those topics are related, giving users of LaTeX some helpful background information.

Screenshot showing a multilingual UTF-8 text file open in a hex editor
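
For readers without a hex editor to hand, the relationship between characters, Unicode code points and UTF-8 bytes can be sketched in a few lines of Python. This is only an illustration: the sample string (the English word "Hi" followed by the Arabic word "salam") and the helper name `describe` are assumptions made for this example, not the contents of the file shown in the screenshot.

```python
import unicodedata


def describe(text):
    """Return (character, code point, UTF-8 bytes as hex) for each character."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"                  # e.g. U+0041
        utf8_hex = ch.encode("utf-8").hex(" ").upper()   # e.g. "D8 B3"
        rows.append((ch, code_point, utf8_hex))
    return rows


# Hypothetical sample text: "Hi " followed by four Arabic letters,
# written with escapes so the source file itself stays pure ASCII.
sample = "Hi \u0633\u0644\u0627\u0645"

for ch, code_point, utf8_hex in describe(sample):
    # unicodedata.name gives the official Unicode character name.
    print(code_point, utf8_hex, unicodedata.name(ch, "<unnamed>"))

# Total size of the text once encoded as UTF-8.
print(len(sample.encode("utf-8")), "bytes")
```

Running this shows that each of the three ASCII characters is stored as a single byte, while each Arabic letter is stored as two bytes, which is the same pattern a hex editor reveals when a multilingual UTF-8 file is opened.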
