The History of "Å"

The letter "Å" was officially introduced into Norwegian orthography in 1917, but was not recognized internationally until the 1960s. Representing the letter in text posed several problems, so it was often described simply as an "A with a ring above". The ring is an example of a diacritical mark, i.e. an accent near or through an orthographic or phonetic character or combination of characters, indicating a phonetic value different from that given the unmarked or otherwise marked element.

ASCII (American Standard Code for Information Interchange) is a standard data-transmission code that is used to represent both textual data (letters, numbers, and punctuation marks) and noninput-device commands (control characters). Like other coding systems, it converts information into standardized digital formats that allow computers to communicate with each other. ASCII was introduced in 1963 as a 7-bit code with 128 different code positions. Alas, ASCII only covered the English alphabet, which led to the exclusion of languages that use the Latin alphabet with some additional letters. Hence the creation of national variants of ISO 646, among them the Norwegian one (referred to here as ISO 646-60, after its registration number 60). That variant is quite similar to ASCII, but sacrifices the signs [ \ ] and { | } to make room for the letters Æ Ø Å and æ ø å. "]" was replaced by "Å" and "}" was replaced by "å". "Å" thus has the code 1011101 in 7-bit binary representation (01011101 when padded to eight bits) and 5D in hexadecimal representation.
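The reuse of one code position for two different signs can be illustrated with a short Python sketch. Python has no built-in codec for the Norwegian 7-bit variant as far as this example assumes, so the table NORWEGIAN_REPLACEMENTS and the function decode_norwegian_7bit are hand-written, hypothetical helpers that cover only the six letter positions mentioned above.

```python
# Hypothetical, hand-written table (not a standard-library codec): the six
# letter positions where the Norwegian 7-bit variant departs from ASCII.
NORWEGIAN_REPLACEMENTS = {
    0x5B: "Æ", 0x5C: "Ø", 0x5D: "Å",
    0x7B: "æ", 0x7C: "ø", 0x7D: "å",
}


def decode_norwegian_7bit(data: bytes) -> str:
    """Decode 7-bit data, applying the national replacements above."""
    if any(b > 0x7F for b in data):
        raise ValueError("not 7-bit data")
    return "".join(NORWEGIAN_REPLACEMENTS.get(b, chr(b)) for b in data)


code = 0x5D
ascii_view = bytes([code]).decode("ascii")              # plain ASCII reading
norwegian_view = decode_norwegian_7bit(bytes([code]))   # national reading
seven_bits = format(code, "07b")                        # 7-bit binary form
eight_bits = format(code, "08b")                        # padded to 8 bits

print(ascii_view, norwegian_view, seven_bits, eight_bits, format(code, "X"))
print(decode_norwegian_7bit(b"bl}b{r"))
```

The same byte is a bracket on an ASCII system and a letter on a Norwegian one, which is why text exchanged between the two could come out with brackets and braces in place of letters.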

The 7-bit standards were gradually superseded once 8-bit standards were introduced. ISO 8859 (first published in 1987) defines different parts for different alphabets. Norwegian is covered by ISO 8859-1, which is shared by most Western European languages, and there the letter "Å" received its own code: C5 in hexadecimal (197 in decimal), with "å" at E5 (229).

Windows code page 1252 did not revolutionize the standards, nor did it change the code of the letter "Å". It differs from ISO 8859-1 only in that it fills positions that ISO 8859-1 leaves without printable characters (the range 80 to 9F in hexadecimal) with additional signs.
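The relationship between the two 8-bit sets can be checked with Python's built-in "latin-1" and "cp1252" codecs. This is a minimal sketch; the choice of byte 80 (hexadecimal) as the example position is an assumption made for illustration.

```python
letter = "\u00c5"  # the letter Å

latin1_bytes = letter.encode("latin-1")  # ISO 8859-1
cp1252_bytes = letter.encode("cp1252")   # Windows code page 1252

# One of the positions where the two code pages differ.
byte = bytes([0x80])
as_cp1252 = byte.decode("cp1252")   # a printable sign in CP 1252
as_latin1 = byte.decode("latin-1")  # a control code in ISO 8859-1

print(latin1_bytes.hex(), cp1252_bytes.hex())
print(repr(as_cp1252), repr(as_latin1))
```

Both codecs give the same single byte for "Å", while byte 80 is read as the euro sign in one and as a non-printing control code in the other.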

When the Unicode Standard was introduced, it seemed to be the solution to every problem. The Unicode Standard is a character coding system designed to support the worldwide interchange, processing, and display of the written texts of the diverse languages and technical disciplines of the modern world. In addition, it supports classical and historical texts of many written languages.

The Unicode Standard provides 1,114,112 code points, most of which are available for encoding of characters. The majority of the common characters used in the major languages of the world are encoded in the first 65,536 code points, also known as the Basic Multilingual Plane (BMP). The overall capacity for more than a million characters is more than sufficient for all known character encoding requirements, including full coverage of all minority and historic scripts of the world. Unicode provides for three encoding forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit form (UTF-8). Different encoding forms of Unicode are useful in different system environments. For example, UTF-32 is somewhat simpler in usage than UTF-16, but in almost all cases occupies twice the storage. A common strategy is to have internal string storage use UTF-16 or UTF-8, but to use UTF-32 for individual character datatypes. The code for "Å" in Unicode is identical to its code in ISO 8859-1: 197 in decimal, written U+00C5. If you need the letter "Å" in HTML, for example, you can write either the decimal character reference &#197; or the named entity &Aring;, whose name abbreviates "A with a ring".
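How the three encoding forms store the same letter can be seen in a small Python sketch using the built-in codecs. The last two lines assume that the "decimal code" mentioned above means an HTML character reference; html.unescape from the standard library resolves both the numeric and the named form.

```python
import html

letter = "\u00c5"         # the letter Å
code_point = ord(letter)  # the Unicode code point as an integer

# Encode the same letter in ISO 8859-1 and in the three Unicode forms.
encoded = {
    name: letter.encode(name)
    for name in ("latin-1", "utf-8", "utf-16-be", "utf-32-be")
}
for name, data in encoded.items():
    print(f"{name:10} {data.hex():8} {len(data)} byte(s)")

# Assumed HTML context: numeric and named character references.
from_decimal = html.unescape("&#197;")
from_named = html.unescape("&Aring;")
print(code_point, from_decimal == letter, from_named == letter)
```

The code point is the same in every form; only the number of bytes used to store it changes, from one byte in ISO 8859-1 to four in UTF-32.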

Angstrom is a unit of length used chiefly in measuring wavelengths of light, equal to 10^-10 metre. It is named for the 19th-century Swedish physicist Anders Jonas Ångström. The symbol is Å. The angstrom and multiples of it, the micron and the millimicron, are also used to measure such quantities as molecular diameters and the thickness of films on liquids. Hence, "Å" became an international symbol, even though the angstrom is not an SI unit. In Unicode the angstrom sign has its own code point, different from that of the letter "Å": 8491 in decimal, 212B in hexadecimal and 0010000100101011 in binary. ISO 8859-1 has no separate angstrom sign, so the code of the letter is used there. Unicode treats the sign and the letter as canonically equivalent.
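The difference between the unit sign and the letter can be examined with Python's standard unicodedata module. This is a sketch for illustration; the constant names LETTER, ANGSTROM and COMBINED are made up for this example.

```python
import unicodedata

LETTER = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
ANGSTROM = "\u212b"   # ANGSTROM SIGN
COMBINED = "A\u030a"  # "A" followed by COMBINING RING ABOVE

angstrom_decimal = ord(ANGSTROM)
angstrom_hex = format(ord(ANGSTROM), "X")
angstrom_binary = format(ord(ANGSTROM), "016b")

# The raw strings differ, although they look alike on screen.
same_before = LETTER == ANGSTROM

# Normalization to NFC maps all three spellings to a single form.
nfc_forms = {unicodedata.normalize("NFC", s) for s in (LETTER, ANGSTROM, COMBINED)}

print(angstrom_decimal, angstrom_hex, angstrom_binary)
print(same_before, [unicodedata.name(ch) for ch in nfc_forms])
```

A program that compares or searches text containing "Å" may therefore want to normalize its input first, since visually identical strings can hold different code points.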

Representing national characters can still cause difficulties today. Several standards are in use around the world: some systems still use 8-bit encodings, whilst others use Unicode encodings of up to 32 bits per character. A common problem arises when text is read with a different encoding than it was written in, or when Unicode text must be shown on a system whose 8-bit character set lacks a character such as "Å". The character is then replaced, and the text is misrepresented. There are many national standards, but few international ones. Unicode's main aim is to solve problems of this nature. Its code space spans 21 bits and provides 1,114,112 code points, divided into planes of 65,536 code points each. Three planes are described here. Plane 0 is the BMP, the Basic Multilingual Plane, which contains the most common characters and signs from around the world. Plane 1 is the SMP, the Supplementary Multilingual Plane, which covers historic scripts, musical notation etc. Plane 2 is the SIP, the Supplementary Ideographic Plane, which is still not widely used and holds rare and historic Chinese characters. Of course, another problem surfaces: encoding such a large repertoire requires more bits per character, which means additional storage is required.
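The figures and failure modes in this paragraph can be reproduced with a short Python sketch. The helper plane_of and the sample characters (a musical clef for plane 1, an ideograph for plane 2) are assumptions chosen for illustration.

```python
PLANE_SIZE = 0x10000       # 65,536 code points per plane
MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point


def plane_of(ch: str) -> int:
    """Return the number of the plane that holds the character."""
    return ord(ch) // PLANE_SIZE


total_code_points = MAX_CODE_POINT + 1
bits_needed = MAX_CODE_POINT.bit_length()

planes = {
    "letter A with ring": plane_of("\u00c5"),
    "musical G clef": plane_of("\U0001d11e"),
    "ideograph U+20000": plane_of("\U00020000"),
}

# UTF-8 bytes read by a system that assumes ISO 8859-1: wrong characters.
misread = "\u00c5".encode("utf-8").decode("latin-1")

# A character missing from the 8-bit set is replaced when encoding.
replaced = "\u212b".encode("latin-1", errors="replace")

print(total_code_points, bits_needed, planes)
print(repr(misread), replaced)
```

The misread case yields two unrelated characters instead of one letter, and the replaced case loses the original character entirely, which are the two kinds of misrepresentation described above.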

Sources: http://www.ifi.uio.no/~inf1040/foiler2003/tekst.pdf

Encyclopædia Britannica

Wikipedia

http://www.unicode.org

http://www.w3.org