You can define which document type is used as a default document for a site.
For example, if most pages in your site are of a specific file type (such as ColdFusion, HTML, or ASP documents), you can set document preferences that automatically create new documents of the specified file type.
If you select Unicode (UTF‑8) as the document encoding, entity encoding is not necessary because UTF‑8 can safely represent all characters. If you select another document encoding, entity encoding may be necessary to represent certain characters. For more information on character entities, see www.w3.org/TR/REC-html40/sgml/entities.html.
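The difference can be illustrated outside the application with a minimal Python sketch. It is only an illustration of the principle, not a description of how the application itself encodes documents; the sample string and the choice of ISO-8859-1 as the "other" encoding are assumptions made for the example.

```python
# Sample text containing a character (the euro sign) that ISO-8859-1 lacks.
text = "caf\u00e9 \u00eb \u20ac"

# UTF-8 can encode every character directly, so no entities are needed.
utf8_bytes = text.encode("utf-8")

# ISO-8859-1 has no euro sign. The "xmlcharrefreplace" error handler
# substitutes a numeric character reference for any character the
# target encoding cannot represent.
latin1_bytes = text.encode("iso-8859-1", errors="xmlcharrefreplace")

print(utf8_bytes)
print(latin1_bytes)
```

In the ISO-8859-1 result, the accented letters are stored as single bytes, while the euro sign is written as the numeric entity &#8364;.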
If you select Unicode (UTF‑8) as a default encoding, you can include a Byte Order Mark (BOM) in the document by selecting the Include Unicode Signature (BOM) option.
A BOM is a sequence of 2 to 4 bytes at the beginning of a text file that identifies the file as Unicode and indicates the byte order of the bytes that follow. Because UTF‑8 has no byte order, adding a UTF‑8 BOM is optional. For UTF‑16 and UTF‑32, a BOM is needed whenever the byte order is not declared in some other way.
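As a rough sketch of what these signature bytes look like, the following Python example checks the start of a byte string against the BOM constants in the standard codecs module. The function name detect_bom and the label strings are hypothetical, chosen only for this illustration.

```python
import codecs

# BOM byte sequences from the standard library. The UTF-32 entries come
# first because the UTF-32 LE BOM (FF FE 00 00) begins with the same two
# bytes as the UTF-16 LE BOM (FF FE).
BOMS = [
    ("UTF-32 LE", codecs.BOM_UTF32_LE),
    ("UTF-32 BE", codecs.BOM_UTF32_BE),
    ("UTF-8", codecs.BOM_UTF8),
    ("UTF-16 LE", codecs.BOM_UTF16_LE),
    ("UTF-16 BE", codecs.BOM_UTF16_BE),
]


def detect_bom(data: bytes):
    """Return a label for the BOM at the start of data, or None."""
    for label, bom in BOMS:
        if data.startswith(bom):
            return label
    return None


# The "utf-8-sig" codec writes the 3-byte UTF-8 signature; "utf-8" does not.
with_bom = "hello".encode("utf-8-sig")
without_bom = "hello".encode("utf-8")

print(detect_bom(with_bom))
print(detect_bom(without_bom))
```

The UTF‑8 signature is 3 bytes long, the UTF‑16 signatures are 2 bytes, and the UTF‑32 signatures are 4 bytes.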
There are four Unicode Normalization Forms (C, D, KC, and KD). The most important is Normalization Form C because it is the form most commonly used in the Character Model for the World Wide Web. Adobe provides the other three Unicode Normalization Forms for completeness.
In Unicode, some characters look the same but can be stored within the document in different ways. For example, “ë” (e‑umlaut) can be represented as a single character, “e‑umlaut,” or as two characters, “regular Latin e” + “combining umlaut.” A Unicode combining character is one that attaches to the preceding character, so the umlaut appears above the “Latin e.” Both forms result in the same visual typography, but what is saved in the file is different for each form.
Normalization is the process of making sure that every character that can be saved in different forms is saved using the same form. That is, all “ë” characters in a document are saved either as single “e‑umlaut” or as “e” + “combining umlaut,” and not as both forms in one document.
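The two stored forms of “ë” and the effect of normalization can be sketched in Python with the standard unicodedata module. This is an illustration of the concept only; the variable names are chosen for the example.

```python
import unicodedata

# Single code point: LATIN SMALL LETTER E WITH DIAERESIS (U+00EB).
composed = "\u00eb"

# Two code points: "e" followed by COMBINING DIAERESIS (U+0308).
decomposed = "e\u0308"

# The strings render identically but are stored differently.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 1 2

# Normalization Form C composes characters into the single-character form.
nfc = unicodedata.normalize("NFC", decomposed)

# Normalization Form D decomposes them into base + combining character.
nfd = unicodedata.normalize("NFD", composed)

print(nfc == composed)    # True
print(nfd == decomposed)  # True
```

After normalizing to one form, a plain comparison or search treats every “ë” in the document the same way, whichever form it was originally typed or pasted in.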
For more information on Unicode Normalization and the specific forms that can be used, see the Unicode website at www.unicode.org/reports/tr15.