Unicode 5.0 Tutorial                    ASMUS, Inc.  

The Unicode 5.0 Tutorial


Part I - Characters in Action


Part I of the Unicode 5.0 Tutorial was developed by ASMUS, Inc. to serve as an accessible and entertaining way of visualizing the core concepts of the Unicode standard. It answers these questions:


• What is a Unicode character?
• How are Unicode characters represented?
• How do Unicode character codes fit into a modern computing environment?
• How are Unicode characters entered into a computer?
• How are Unicode characters displayed on a computer?
• How are Unicode characters interchanged?
• What is the interaction between Unicode and rich text (markup)?
• How do end-users experience Unicode?

[Sample slide: Encoding Characters]

Throughout Part I, the Unicode 5.0 Tutorial gives typical examples of how the Unicode Standard interacts with the other elements of an internationalized software architecture. Through concrete scenarios for the use of Unicode characters, you will become familiar with the role of the Unicode Standard and the benefits of supporting it.


Part I of the tutorial provides a concrete context to which the more systematic and detailed treatment of the features of the Unicode Standard presented in Part II and Part III can be related.


Part II - Fundamental Specifications


Part II of the Unicode 5.0 Tutorial builds on the concepts introduced in Part I and systematically presents the details of the fundamental specifications that are part of the Unicode Standard. It addresses these questions:


• What is the organization of the Unicode code space?
• What principles are used to allocate and unify characters?
• What is a Unicode encoding form?
• What are the definitions of UTF-8, UTF-16, UTF-32?
• What is a byte order mark and how do I use it?
• Which encoding form should I select?
• What are security considerations when using Unicode?
• What are Unicode Character Properties?
• What is the relationship between Unicode and plain text?
• What are combining characters?
• When are code sequences equivalent?
• What are special characters such as format characters?
• What types of code points exist?
• Where are all the pieces of the Unicode Standard?
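
As a small illustration of the questions above about encoding forms and the byte order mark, the following Python sketch compares how many bytes the same four characters occupy in UTF-8, UTF-16, and UTF-32. It is not taken from the tutorial; the sample string and the helper name encoded_sizes are assumptions chosen for illustration, and Python's codecs follow a later version of the standard than Unicode 5.0.

```python
import codecs

# Sample text (an assumption for illustration): A, e-acute, euro sign,
# and MUSICAL SYMBOL G CLEF, which lies outside the Basic Multilingual Plane.
text = "A\u00e9\u20ac\U0001D11E"

def encoded_sizes(s):
    """Return the byte length of s in each of the three encoding forms."""
    forms = ("utf-8", "utf-16-le", "utf-32-le")  # explicit byte order, so no BOM
    return {form: len(s.encode(form)) for form in forms}

sizes = encoded_sizes(text)
for form, size in sizes.items():
    print(f"{form}: {size} bytes for {len(text)} code points")

# The byte order mark is the code point U+FEFF placed at the start of a stream.
bom_utf8 = "\ufeff".encode("utf-8")

# Python's plain "utf-16" codec prepends a BOM in the platform's native order.
with_bom = text.encode("utf-16")
print("UTF-16 output starts with a BOM:", with_bom[:2] == codecs.BOM_UTF16)
```

UTF-8 uses one to four bytes per code point, UTF-16 uses two or four, and UTF-32 always uses four, which is why the choice of encoding form depends on the kind of text being stored or interchanged.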

[Sample slide: Combining Characters]


Part II of the Unicode tutorial is recommended for anyone interested in a systematic overview of the key aspects of the standard. Detailed technical or programming experience is not required. The descriptions of the technical concepts have been aligned with the revised text for Unicode 5.0 and anticipated future developments of the standard.


Part III - Unicode Algorithms


The Unicode Standard and related specifications by the Unicode Consortium specify a number of algorithms. Part III of the Unicode 5.0 Tutorial surveys the algorithms specified in the Unicode Standard, and covers many general aspects of Unicode algorithms:


• What is a Unicode Algorithm?
• How is an abstract algorithm different from an actual implementation?
• How does it relate to Unicode Character Properties?
• What are efficient techniques for storing and accessing character properties?
• What is Unicode Normalization and what requirements does it address?
• What is a Unicode Normalization form?
• What are the different normalization forms (NFC, NFD, NFKC, NFKD)?
• What do I need to know in applying normalization?
• How does Normalization interact with the web?
• What is the Unicode Bidirectional Algorithm?
• How is it defined and how does it interact with other text layout tasks?
• When do I need to support it?
• How do I determine text boundaries and line breaks?
• What are character foldings?
• How does character transformation interact with Normalization?

[Sample slide: Steps of the Algorithm]
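
To make the questions above about normalization forms and character foldings concrete, here is a minimal Python sketch using the standard unicodedata module. It is not part of the tutorial; the helper name canonically_equivalent and the sample strings are assumptions for illustration, and the module implements a later version of the standard than Unicode 5.0.

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single precomposed code point
decomposed = "e\u0301"   # letter e followed by COMBINING ACUTE ACCENT

def canonically_equivalent(a, b):
    """Compare two strings after bringing both into the same form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The two spellings differ code point by code point but are equivalent.
print(composed == decomposed)                        # False
print(canonically_equivalent(composed, decomposed))  # True

# NFD decomposes, NFC recomposes.
nfd = unicodedata.normalize("NFD", composed)
nfc = unicodedata.normalize("NFC", decomposed)

# The compatibility forms also replace characters such as the "fi" ligature.
ligature = "\ufb01"
nfkc = unicodedata.normalize("NFKC", ligature)

# Case folding is one example of a character folding.
folded = "Stra\u00dfe".casefold()
```

Comparing strings only after normalizing both sides to the same form is the usual way to treat equivalent code sequences as equal; which form to use depends on the application.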


Part III of the Unicode 5.0 Tutorial is more detailed than Parts I and II. It covers the description of algorithms and other material that may require some familiarity with technical concepts. The descriptions of the technical concepts have been aligned with the revised text for Unicode 5.0 as well as anticipated future developments of the standard.


Additional Information


The Author


Asmus Freytag, Ph.D., is president of ASMUS, Inc., a Seattle-based company specializing in consulting services and seminars on topics ranging from software internationalization to implementing Unicode.

He has been a contributor to the Unicode Standard since before the inception of the Unicode Consortium and a co-author of the Unicode Standard up to and including Unicode 5.0. He has written or contributed to several Unicode Technical Reports and Standards. He is a technical vice-president emeritus of the Unicode Consortium and for many years he represented the consortium in other standards bodies such as NCITS/L2 and ISO/IEC JTC1/SC2/WG2.

The Handouts


The tutorial is accompanied by extensive handouts. Each page of the handouts reproduces the slide together with the complete accompanying text, as well as additional graphics or marginal notes as needed.



For more information or to schedule a presentation of the tutorial please e-mail: asmus@unicode.org

[Sample handout page: Unicode Design Principles]


Copyright © 2006-2007 ASMUS, Inc. All rights reserved.