|
马上注册,结交更多好友,享用更多功能,让你轻松玩转社区。
您需要 登录 才可以下载或查看,没有账号?注册
×
附录
A. 参考文献
A.1 正式参考文献
A.2 其他参考文献
B. 字符的分类
C. XML 和 SGML(非正式)
D. 实体和字符引用的展开(非正式)
E. 确定型内容模型(非正式)
F. 字符编码的自动检测(非正式)
F.1 无外部编码信息时的检测
F.2 有外部编码信息时的优先级
G. W3C XML 工作组(非正式)
H. W3C XML 核心工作组(非正式)
I. 文档制作说明(非正式)
A. 参考文献
A.1 正式参考文献
IANA-CHARSETS
(Internet Assigned Numbers Authority) Official Names for Character Sets, ed. Keld Simonsen et al. See ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets.
IETF RFC 1766
IETF (Internet Engineering Task Force). RFC 1766: Tags for the Identification of Languages, ed. H. Alvestrand. 1995.
ISO/IEC 10646
ISO (International Organization for Standardization). ISO/IEC 10646-1993 (E). Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane. [Geneva]: International Organization for Standardization, 1993 (plus amendments AM 1 through AM 7).
ISO/IEC 10646-2000
ISO (International Organization for Standardization). ISO/IEC 10646-1:2000. Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane. [Geneva]: International Organization for Standardization, 2000.
Unicode
The Unicode Consortium. The Unicode Standard, Version 2.0. Reading, Mass.: Addison-Wesley Developers Press, 1996.
Unicode3
The Unicode Consortium. The Unicode Standard, Version 3.0. Reading, Mass.: Addison-Wesley Developers Press, 2000. ISBN 0-201-61633-5.
A.2 其他参考文献
Aho/Ullman
Aho, Alfred V., Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Reading: Addison-Wesley, 1986, rpt. corr. 1988.
Berners-Lee et al.
Berners-Lee, T., R. Fielding, and L. Masinter. Uniform Resource Identifiers (URI): Generic Syntax and Semantics. 1997. (Work in progress; see updates to RFC1738.)
Br黦gemann-Klein
Br黦gemann-Klein, Anne. Formal Models in Document Processing. Habilitationsschrift. Faculty of Mathematics at the University of Freiburg, 1993. (See ftp://ftp.informatik.uni-freibur ... brueggem/habil.ps.)
Br黦gemann-Klein and Wood
Br黦gemann-Klein, Anne, and Derick Wood. Deterministic Regular Languages. Universit鋞 Freiburg, Institut f黵 Informatik, Bericht 38, Oktober 1991. Extended abstract in A. Finkel, M. Jantzen, Hrsg., STACS 1992, S. 173-184. Springer-Verlag, Berlin 1992. Lecture Notes in Computer Science 577. Full version titled One-Unambiguous Regular Languages in Information and Computation 140 (2): 229-253, February 1998.
Clark
James Clark. Comparison of SGML and XML. See http://www.w3.org/TR/NOTE-sgml-xml-971215.
IANA-LANGCODES
(Internet Assigned Numbers Authority) Registry of Language Tags, ed. Keld Simonsen et al. (See http://www.isi.edu/in-notes/iana/assignments/languages/.)
IETF RFC2141
IETF (Internet Engineering Task Force). RFC 2141: URN Syntax, ed. R. Moats. 1997.
IETF RFC 2279
IETF (Internet Engineering Task Force). RFC 2279: UTF-8, a transformation format of ISO 10646, ed. F. Yergeau, 1998. (See http://www.ietf.org/rfc/rfc2279.txt.)
IETF RFC 2376
IETF (Internet Engineering Task Force). RFC 2376: XML Media Types. ed. E. Whitehead, M. Murata. 1998. (See http://www.ietf.org/rfc/rfc2376.txt.)
IETF RFC 2396
IETF (Internet Engineering Task Force). RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax. T. Berners-Lee, R. Fielding, L. Masinter. 1998. (See http://www.ietf.org/rfc/rfc2396.txt.)
IETF RFC 2732
IETF (Internet Engineering Task Force). RFC 2732: Format for Literal IPv6 Addresses in URL's. R. Hinden, B. Carpenter, L. Masinter. 1999. (See http://www.ietf.org/rfc/rfc2732.txt.)
IETF RFC 2781
IETF (Internet Engineering Task Force). RFC 2781: UTF-16, an encoding of ISO 10646, ed. P. Hoffman, F. Yergeau. 2000. (See http://www.ietf.org/rfc/rfc2781.txt.)
ISO 639
(International Organization for Standardization). ISO 639:1988 (E). Code for the representation of names of languages. [Geneva]: International Organization for Standardization, 1988.
ISO 3166
(International Organization for Standardization). ISO 3166-1:1997 (E). Codes for the representation of names of countries and their subdivisions -- Part 1: Country codes [Geneva]: International Organization for Standardization, 1997.
ISO 8879
ISO (International Organization for Standardization). ISO 8879:1986(E). Information processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML). First edition -- 1986-10-15. [Geneva]: International Organization for Standardization, 1986.
ISO/IEC 10744
ISO (International Organization for Standardization). ISO/IEC 10744-1992 (E). Information technology -- Hypermedia/Time-based Structuring Language (HyTime). [Geneva]: International Organization for Standardization, 1992. Extended Facilities Annexe. [Geneva]: International Organization for Standardization, 1996.
WEBSGML
ISO (International Organization for Standardization). ISO 8879:1986 TC2. Information technology -- Document Description and Processing Languages. [Geneva]: International Organization for Standardization, 1998. (See http://www.sgmlsource.com/8879rev/n0029.htm.)
XML Names
Tim Bray, Dave Hollander, and Andrew Layman, editors. Namespaces in XML. Textuality, Hewlett-Packard, and Microsoft. World Wide Web Consortium, 1999. (See http://www.w3.org/TR/REC-xml-names/.)
B. 字符的分类(Character Classes)
根据 Unicode 标准中定义的特征,字符被分为基字符(其中包含了拉丁字母),表意字符和组合字符(其中包含了大多数的变音符)。数字和扩展符(extender)也各自被分成类。 |
|