|
||||||
Access to the Unicode™ Standard |
|||||||
The Poll |
|||||||
|
Format: E-mail questionnaire
Questions: Open ended with some
Target: Two large e-mail lists devoted
Date: 2005-10-07
Duration: 72 hours
Number of responses: 99 |
Background
The Unicode Standard has been in publication since 1991. It is now in version 4.1. Originally, the only documentation was the book, later augmented by a CD-ROM. The CD-ROM contains a copy of the Unicode Character Database, which is also available online. Since Unicode 3.0 there have been online PDF files covering the text of the standard, as well as the code charts. Since 1996 there have been online editions of Unicode Technical Reports, and later Unicode Technical Standards as well as Unicode Standard Annexes. In parallel, the Unicode website provides an ever-growing set of additional information about the standard.
The Poll
Web statistics can be used to discern usage patterns for online access to the standard, and the book sales reported by the publisher allow some conclusions about the scale and trends in the use of the hardcopy book. However, the information provided by these two sources is at a different level of detail and, more importantly, allows no conclusions about the motivations of the user.
Therefore, we conducted an informal poll to obtain additional information about how people access the Unicode Standard. The poll is an extension of our informal practice of asking participants in our Unicode Tutorials about their sources of information about the Unicode Standard. While such information can be helpful in designing tutorials, we realized that the same information collected from a wider circle of respondents could be used to improve the authoritative sources of information as well.
The Sample
The sample reported here consists of 99 self-selected respondents from two large mailing lists focused on Unicode and internationalization issues. The sample is clearly not representative in that it tends to underrepresent the more occasional users and tends to overrepresent people knowledgeable about the standard.
However, reading the detailed answers, it becomes clear that the sample successfully represents a wide range of strategies for accessing the standard and that makes the results useful for the intended purpose. |
||||||
Modes of Accessing the Unicode Standard |
|||||||
The first part of the poll
addressed the users' primary choice of mode of access to information about the
standard. The tabulated results (at right) suggest a fairly balanced
picture overall.
|
If you need to look up something in the Unicode
Standard
|
||||||
Access to the Unicode Standard as a Book |
|||||||
|
Do you have access to a hard copy (book edition)
|
The Unicode Standard and "the book" are not identical. The
book represents a coherent snapshot of a particular major version of the
standard — with the Unicode Annexes and Unicode Character Database as
softcopy on the CD-ROM. As soon as errata or update versions are available
online, the information contained in the book is no longer guaranteed to
be the most recent. However, comments indicated that rather than looking
(only) for the latest information, people conceive of the book as a
reference that can serve to get a grasp on the whole. The poll
asked whether respondents had access to a book:
|
||||||
Versions of the Unicode Standard in Book Form |
|||||||
The intent of this question was simply to find out which versions of the
Unicode Standard were accessible to respondents in book form. Some
people responded as intended, citing the latest version in their
possession. Many others listed several or even all the versions that they
have access to. In many cases, detailed comments explained their usage
patterns.
|
Which version of the Unicode Standard do you have access to?
|
||||||
Reason to get the Latest Book |
|||||||
|
The Unicode Standard continues to be extended to cover new scripts, and
to complete the repertoire of existing scripts and symbol collections.
Each version adds additional information to the Unicode Character
Database, whether in corrections or improvements of character property
assignments, introduction of new properties, or extension of existing
properties to cover newly added characters. The text of the standard,
including the UAXs, is continually revised, both for clarity and to cover
additional topics. Approximately every three years, a new edition of the
book is released.
It is in the nature of things that the most widely used characters have been encoded early on and more recent additions have focused on more obscure usage. Equally, the most fundamental aspects of character behavior are already described in the earliest versions. Later improvements often had the character of detail fixes, even though sometimes a more thorough restatement has introduced additional clarity.
In this context we asked the open-ended question:
What would be a compelling reason to upgrade? |
The comments about what would make respondents choose to purchase a
new edition of a book they already own were very much in line with what one would expect from people
making the decision of whether to buy a revised version of a popular
programming book.
|
||||||
Access to Online Information about the Unicode Standard |
|||||||
The questions in the next part of the poll were designed to gather
information about which of the online resources users typically access,
and to what degree. Also included was a question about the CD-ROM that
has been included in every copy of the book since Unicode 2.0.
CD-ROM
The CD-ROM essentially provides a snapshot of certain parts of the Unicode website at the time of the publication of the book. The drawback of this is that the information is stale the moment the standard is updated, which tends to occur two or three times between major versions. Nearly 75% of all users never use the CD-ROM or used it only once to check its content. However, the 10-15% who seriously use the CD-ROM do so primarily to cope with limited connectivity that makes ongoing use of the online information impractical, expensive or impossible. 2% suggested that the CD-ROM serves a useful archival function.
Unicode Character Database and Online Charts
These are among the most consistently used parts of the online information about the Unicode Standard, with the charts being used more often online while many users reported that they use a local copy of the UCD. Note that the questionnaire allowed both yes/no answers and more specific answers about the degree of usage. From the way the answers were formulated and from the comments, we get the impression that users who used a resource "often" were highly motivated to disclose that fact. Therefore, we show the "yes" answer between the "often" and "sometimes" answers in the bar chart on the right.
Unicode Standard Annexes as well as
|
Parts of the Unicode Standard (Source: Unicode 4.1 Tutorial)
|
||||||
Poll conducted and interpreted by Asmus Freytag
Note: all comments have been summarized or edited. Unicode is a trademark of the Unicode Consortium. |
|||||||
|
||||||
|
Copyright © 2005 ASMUS, Inc. All rights reserved. |
|||||||