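Contents
  1. 2 Common infrastructure
    1. 2.1 Terminology
      1. 2.1.1 Resources
      2. 2.1.2 XML
      3. 2.1.3 DOM trees
      4. 2.1.4 Scripting
      5. 2.1.5 Plugins
      6. 2.1.6 Character encodings
    2. 2.2 Conformance requirements
      1. 2.2.1 Dependencies
      2. 2.2.2 Extensibility
    3. 2.3 Case-sensitivity and string comparison
    4. 2.4 UTF-8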

2 Common infrastructure

2.1 Terminology

This specification refers to both HTML and XML attributes and IDL attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and IDL attributes for those defined on IDL interfaces. Similarly, the term "properties" is used for both JavaScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.

Generally, when the specification states that a feature applies to the HTML syntax or the XHTML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".

This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.

For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.

When an algorithm B says to return to another algorithm A, it implies that A called B. Upon returning to A, the implementation must continue from where it left off in calling B.

2.1.1 Resources

The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.

For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.

An MPEG4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.

What some specifications, in particular the HTTP and URI specifications, refer to as a representation is referred to in this specification as a resource. [HTTP] [RFC3986]

The term MIME type is used to refer to what is sometimes called an Internet media type in protocol literature. The term media type in this specification is used to refer to the type of media intended for presentation, as used by the CSS specifications. [RFC2046] [MQ]

A string is a valid MIME type if it matches the media-type rule defined in section 3.7 "Media Types" of RFC 2616. In particular, a valid MIME type may include MIME type parameters. [HTTP]

A string is a valid MIME type with no parameters if it matches the media-type rule defined in section 3.7 "Media Types" of RFC 2616, but does not contain any U+003B SEMICOLON characters (;). In other words, if it consists only of a type and subtype, with no MIME type parameters. [HTTP]
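The two definitions above can be illustrated with a small Python sketch. It approximates the RFC 2616 media-type rule with a regular expression (token characters, optional parameters, simplified quoted strings); the function names are hypothetical and the handling of white space around ";" is an assumption, so treat it as an illustration rather than a complete validator.

```python
import re

# RFC 2616 "token": printable ASCII except separators and control characters.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Simplified quoted-string: anything but a bare quote, with backslash escapes.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# One parameter: ";" attribute "=" value (optional spaces around ";" assumed).
PARAM = rf"[ \t]*;[ \t]*{TOKEN}=(?:{TOKEN}|{QUOTED})"
MEDIA_TYPE = re.compile(rf"{TOKEN}/{TOKEN}(?:{PARAM})*")


def is_valid_mime_type(value: str) -> bool:
    """True if the whole string matches the approximated media-type rule."""
    return MEDIA_TYPE.fullmatch(value) is not None


def is_valid_mime_type_no_params(value: str) -> bool:
    """True if the string is a valid MIME type and contains no semicolon."""
    return is_valid_mime_type(value) and ";" not in value
```

Under these assumptions, "text/html; charset=utf-8" is a valid MIME type but not a valid MIME type with no parameters, while "text/html" is both.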

The term HTML MIME type is used to refer to the MIME types text/html and text/html-sandboxed.

A resource's critical subresources are those that the resource needs to have available to be correctly processed. Which resources are considered critical or not is defined by the specification that defines the resource's format. For CSS resources, only @import rules introduce critical subresources; other resources, e.g. fonts or backgrounds, are not.

The term data: URL refers to URLs that use the data: scheme. [RFC2397]

2.1.2 XML

To ease migration from HTML to XHTML, UAs conforming to this specification will place elements in HTML in the http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and CSS. The term "HTML elements", when used in this specification, refers to any element in that namespace, and thus refers to both HTML and XHTML elements.

Except where otherwise stated, all elements defined or mentioned in this specification are in the http://www.w3.org/1999/xhtml namespace, and all attributes defined or mentioned in this specification have no namespace.

Attribute names are said to be XML-compatible if they match the Name production defined in XML, they contain no U+003A COLON characters (:), and their first three characters are not an ASCII case-insensitive match for the string "xml". [XML]
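As a rough illustration of the three conditions, the following Python sketch checks a candidate attribute name. The character ranges are taken from the Name production of XML 1.0 (Fifth Edition), which is an assumption about the XML version in use; the function name is hypothetical.

```python
import re

# NameStartChar ranges, assumed from XML 1.0 Fifth Edition.
_START = (":A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
          "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
          "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF")
# NameChar adds "-", ".", digits and a few combining/extender characters.
_REST = _START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"
_NAME = re.compile(f"[{_START}][{_REST}]*")


def is_xml_compatible(name: str) -> bool:
    """Check the three conditions for an XML-compatible attribute name."""
    return (
        _NAME.fullmatch(name) is not None   # matches the Name production
        and ":" not in name                 # no U+003A COLON
        and name[:3].lower() != "xml"       # does not start with "xml"
    )
```

For instance, "data-foo" passes all three checks, while "xml:lang" fails both the colon check and the prefix check.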

The term XML MIME type is used to refer to the MIME types text/xml, application/xml, and any MIME type whose subtype ends with the four characters "+xml". [RFC3023]
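A minimal Python sketch of this classification follows. Stripping parameters and lowercasing before comparison are assumptions made for the example, and the function name is hypothetical.

```python
def is_xml_mime_type(mime_type: str) -> bool:
    """True for text/xml, application/xml, and any subtype ending in '+xml'."""
    # Assumption: ignore parameters and compare case-insensitively.
    essence = mime_type.split(";", 1)[0].strip().lower()
    if essence in ("text/xml", "application/xml"):
        return True
    _type, slash, subtype = essence.partition("/")
    return bool(slash) and subtype.endswith("+xml")
```

With this sketch, application/xhtml+xml and image/svg+xml are classified as XML MIME types, and text/html is not.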

2.1.3 DOM trees

The term root element, when not explicitly qualified as referring to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if it has no ancestors. When the node is a part of the document, then the node's root element is indeed the document's root element; however, if the node is not currently part of the document tree, the root element will be an orphaned node.
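The definition of root element can be sketched in Python with a toy node type. The Node class, its fields, and root_element are hypothetical stand-ins for the DOM, used only to show the walk up the ancestor chain.

```python
class Node:
    """Toy tree node; is_element is False for a document-like container."""

    def __init__(self, name, is_element=True):
        self.name = name
        self.is_element = is_element
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def root_element(node):
    """Return the furthest ancestor element, or the node itself if none."""
    root = node
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.is_element:
            root = ancestor  # remember the furthest element seen so far
        ancestor = ancestor.parent
    return root
```

For a node inside a document the walk ends at the document's root element; for a detached subtree it ends at the orphaned top-most node.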

When an element's root element is the root element of a Document, it is said to be in a Document. An element is said to have been inserted into a document when its root element changes and is now the document's root element. Analogously, an element is said to have been removed from a document when its root element changes from being the document's root element to being another element.

A node's home subtree is the subtree rooted at that node's root element. When a node is in a Document, its home subtree is that Document's tree.

The Document of a Node (such as an element) is the Document that the Node's ownerDocument IDL attribute returns. When a Node is in a Document then that Document is always the Node's Document, and the Node's ownerDocument IDL attribute thus always returns that Document.

The term tree order means a pre-order, depth-first traversal of DOM nodes involved (through the parentNode/childNodes relationship).
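Tree order can be illustrated with a short Python generator. The TreeNode class is a hypothetical stand-in for a DOM node, with children playing the role of childNodes.

```python
class TreeNode:
    """Toy node with an ordered list of children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def tree_order(node):
    """Yield nodes in pre-order, depth-first order: a node, then its subtrees."""
    yield node
    for child in node.children:
        yield from tree_order(child)
```

Each node is visited before any of its descendants, and a node's whole subtree is visited before its next sibling.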

When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.

The term text node refers to any Text node, including CDATASection nodes; specifically, any Node with node type TEXT_NODE (3) or CDATA_SECTION_NODE (4). [DOMCORE]

A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.

The term empty, when used of an attribute value, text node, or string, means that the length of the text is zero (i.e. not even containing spaces or control characters).

2.1.4 Scripting

The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".

An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.

If a DOM object is said to be live, then any attribute returning that object must always return the same object (not a new object each time), and the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.

The terms fire and dispatch are used interchangeably in the context of events, as in the DOM Events specifications. The term trusted event is used as defined by the DOM Events specification. [DOMEVENTS]

2.1.5 Plugins

The term plugin refers to a user-agent defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.

Typically such content handlers are provided by third parties, though a user agent can also designate built-in content handlers as plugins.

A user agent must not consider the types text/plain and application/octet-stream as having a registered plugin.

One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separate from the user agent (as opposed to using the same interface) is not a plugin by this definition.

This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. [NPAPI]

Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.

2.1.6 Character encodings

ISSUE-101 (us-ascii-ref) blocks progress to Last Call

The preferred MIME name of a character encoding is the name or alias labeled as "preferred MIME name" in the IANA Character Sets registry, if there is one, or the encoding's name, if none of the aliases are so labeled. [IANACHARSET]

An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in ANSI_X3.4-1968 (US-ASCII). [RFC1345]

This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes such encodings as UTF-7, UTF-16, GSM03.38, and EBCDIC variants.
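The byte list in the definition can be probed with Python's built-in codecs, as in the sketch below. It only tests each listed byte in isolation, so it is a necessary-condition check and not the full definition: a stateful encoding such as UTF-7 can pass this probe even though it is excluded above. The function and constant names are hypothetical, and the codec names are Python's, not IANA's.

```python
# The bytes named in the definition of an ASCII-compatible encoding.
ASCII_PROBE_BYTES = (
    [0x09, 0x0A, 0x0C, 0x0D, 0x26, 0x27]
    + list(range(0x20, 0x23))   # 0x20 - 0x22
    + list(range(0x2C, 0x40))   # 0x2C - 0x3F
    + list(range(0x41, 0x5B))   # 0x41 - 0x5A
    + list(range(0x61, 0x7B))   # 0x61 - 0x7A
)


def passes_single_byte_probe(encoding: str) -> bool:
    """True if every probe byte, decoded alone, yields the same US-ASCII character."""
    for byte in ASCII_PROBE_BYTES:
        try:
            decoded = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            return False  # the byte is not a complete sequence on its own
        if decoded != chr(byte):
            return False  # the byte maps to a different character
    return True
```

With Python's codecs, "utf-8", "latin-1" and "shift_jis" pass the probe, while "utf-16" and the EBCDIC code page "cp037" do not.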

The term Unicode character is used to mean a Unicode scalar value (i.e. any Unicode code point that is not a surrogate code point). [UNICODE]
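Expressed as code, the definition is a range check on the code point. The function name below is hypothetical; the ranges (code points up to U+10FFFF, surrogates U+D800 to U+DFFF) come from the Unicode standard.

```python
def is_unicode_character(code_point: int) -> bool:
    """True if code_point is a Unicode scalar value (not a surrogate)."""
    in_code_space = 0 <= code_point <= 0x10FFFF
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return in_code_space and not is_surrogate
```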

2.2 Conformance requirements

All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.

The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

This specification describes the conformance criteria for user agents (relevant to implementors) and documents (relevant to authors and authoring tool implementors).

Conforming documents are those that comply with all the conformance criteria for documents. For readability, some of these conformance requirements are phrased as conformance requirements on authors; such requirements are implicitly requirements on documents: by definition, all documents are assumed to have had an author. (In some cases, that author may itself be a user agent; such user agents are subject to additional rules, as explained below.)

For example, if a requirement states that "authors must not use the foobar element", it would imply that documents are not allowed to contain elements named foobar.

User agents fall into several (overlapping) categories with different conformance requirements.

Web browsers and other interactive user agents

Web browsers that support the XHTML syntax must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.

A conforming XHTML processor would, upon finding an XHTML script element in an XML document, execute the script contained in that element. However, if the element is found within a transformation expressed in XSLT (assuming the user agent also supports XSLT), then the processor would instead treat the script element as an opaque element that forms part of the transform.

Web browsers that support the HTML syntax must process documents labeled with an HTML MIME type as described in this specification, so that users can interact with them.

User agents that support scripting must also be conforming implementations of the IDL fragments in this specification, as described in the Web IDL specification. [WEBIDL]

Unless explicitly stated, specifications that override the semantics of HTML elements do not override the requirements on DOM objects representing those elements. For example, the script element in the example above would still implement the HTMLScriptElement interface.

Non-interactive presentation user agents

User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply with the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.

Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.

A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.

User agents with no scripting support

Implementations that do not support scripting (or which have their scripting features disabled entirely) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.

Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.

Conformance checkers

Conformance checkers must verify that a document conforms to the applicable conformance criteria described in this specification. Automated conformance checkers are exempt from detecting errors that require interpretation of the author's intent (for example, while a document is non-conforming if the content of a blockquote element is not a quote, conformance checkers running without the input of human judgement do not have to check that blockquote elements only contain quoted material).

Conformance checkers must check that the input document conforms when parsed without a browsing context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and should also check that the input document conforms when parsed with a browsing context in which scripts execute, and that the scripts never cause non-conforming states to occur other than transiently during script execution itself. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])

The term "HTML5 validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.

XML DTDs cannot express all the conformance requirements of this specification. Therefore, a validating XML processor and a DTD cannot constitute a conformance checker. Also, since neither of the two authoring formats defined in this specification are applications of SGML, a validating SGML system cannot constitute a conformance checker either.

To put it another way, there are three types of conformance criteria:

  1. Criteria that can be expressed in a DTD.
  2. Criteria that cannot be expressed by a DTD, but can still be checked by a machine.
  3. Criteria that can only be checked by a human.

A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.

Data mining tools

Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.

A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.

Authoring tools and markup generators

Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.

Authoring tools are exempt from the strict requirements of using elements only for their specified purpose, but only to the extent that authoring tools are not yet able to determine author intent. However, authoring tools must not automatically misuse elements or encourage their users to do so.

For example, it is not conforming to use an address element for arbitrary contact information; that element can only be used for marking up contact information for the author of the document or section. However, since an authoring tool is likely unable to determine the difference, an authoring tool is exempt from that requirement. This does not mean, though, that authoring tools can use address elements for any block of italics text (for instance); it just means that the authoring tool doesn't have to verify that when the user uses a tool for inserting contact information for a section, that the user really is doing that and not inserting something else instead.

In terms of conformance checking, an editor has to output documents that conform to the same extent that a conformance checker will verify.

When an authoring tool is used to edit a non-conforming document, it may preserve the conformance errors in sections of the document that were not edited during the editing session (i.e. an editing tool is allowed to round-trip erroneous content). However, an authoring tool must not claim that the output is conformant if errors have been so preserved.

Authoring tools are expected to come in two broad varieties: tools that work from structure or semantic data, and tools that work on a What-You-See-Is-What-You-Get media-specific editing basis (WYSIWYG).

The former is the preferred mechanism for tools that author HTML, since the structure in the source information can be used to make informed choices regarding which HTML elements and attributes are most appropriate.

However, WYSIWYG tools are legitimate. WYSIWYG tools should use elements they know are appropriate, and should not use elements that they do not know to be appropriate. This might in certain extreme cases mean limiting the use of flow elements to just a few elements, like div, b, i, and span and making liberal use of the style attribute.

All authoring tools, whether WYSIWYG or not, should make a best effort attempt at enabling users to create well-structured, semantically rich, media-independent content.

Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents. Similarly, some conformance requirements are phrased as requirements on authors; such requirements are to be interpreted as conformance requirements on the documents that authors produce. (In other words, this specification does not distinguish between conformance criteria on authors and conformance criteria on documents.)

Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.

There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.

For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as the XHTML syntax), and one using a custom format inspired by SGML (referred to as the HTML syntax). Implementations must support at least one of these two formats, although supporting both is encouraged.

The language in this specification assumes that the user agent expands all entity references, and therefore does not include entity reference nodes in the DOM. If user agents do include entity reference nodes in the DOM, then user agents must handle them as if they were fully expanded when implementing this specification. For example, if a requirement talks about an element's child text nodes, then any text nodes that are children of an entity reference that is a child of that element would be used as well. Entity references to unknown entities must be treated as if they contained just an empty text node for the purposes of the algorithms defined in this specification.

2.2.1 Dependencies

This specification relies on several other underlying specifications.

XML

Implementations that support the XHTML syntax must support some version of XML, as well as its corresponding namespaces specification, because that syntax uses an XML serialization with namespaces. [XML] [XMLNS]

DOM

The Document Object Model (DOM) is a representation — a model — of a document and its content. The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM. [DOMCORE]

Implementations must support some version of DOM Core and DOM Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM Core interfaces. [DOMCORE] [DOMEVENTS]

In particular, the following features are defined in the DOM Core specification: [DOMCORE]

  • Attr interface
  • CDATASection interface
  • Comment interface
  • DOMImplementation interface
  • Document interface
  • DocumentFragment interface
  • DocumentType interface
  • DOMException interface
  • Element interface
  • Node interface
  • NodeList interface
  • ProcessingInstruction interface
  • Text interface
  • createDocument() method
  • getElementById() method
  • insertBefore() method
  • ownerDocument attribute
  • childNodes attribute
  • localName attribute
  • parentNode attribute
  • namespaceURI attribute
  • tagName attribute
  • textContent attribute

The following features are defined in the DOM Events specification: [DOMEVENTS]

  • Event interface
  • EventTarget interface
  • UIEvent interface
  • click event
  • target attribute

The following features are defined in the DOM Range specification: [DOMRANGE]

  • Range interface
  • deleteContents() method
  • selectNodeContents() method
  • setEnd() method
  • setStart() method
  • collapsed attribute
  • endContainer attribute
  • endOffset attribute
  • startContainer attribute
  • startOffset attribute

Web IDL

The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]

The terms supported property indices and supported property names are used as defined in the WebIDL specification.

Except where otherwise specified, if an IDL attribute that is a floating point number type (double) is assigned an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised.

Except where otherwise specified, if a method with an argument that is a floating point number type (double) is passed an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised.
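The two rules above amount to rejecting non-finite values wherever a double is accepted. The Python sketch below shows one way a binding layer might apply that check to an attribute setter; NotSupportedError, check_finite_double, and MediaLike with its volume attribute are all hypothetical names invented for the example.

```python
import math


class NotSupportedError(Exception):
    """Stand-in for a DOM exception with code NOT_SUPPORTED_ERR."""


def check_finite_double(value):
    """Reject Infinity and NaN; return the value as a float otherwise."""
    if not math.isfinite(value):
        raise NotSupportedError("NOT_SUPPORTED_ERR")
    return float(value)


class MediaLike:
    """Hypothetical object exposing one double-typed attribute."""

    def __init__(self):
        self._volume = 1.0

    @property
    def volume(self):
        return self._volume

    @volume.setter
    def volume(self, value):
        # The check runs before the stored value is touched.
        self._volume = check_finite_double(value)
```

Because the check runs before assignment, a rejected value leaves the previous attribute value in place. The same helper could be applied to double-typed method arguments.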

JavaScript

Some parts of the language described by this specification only support JavaScript as the underlying scripting language. [ECMA262]

The term "JavaScript" is used to refer to ECMA262, rather than the official term ECMAScript, since the term JavaScript is more widely known. Similarly, the MIME type used to refer to JavaScript in this specification is text/javascript, since that is the most commonly used type, despite it being an officially obsoleted type according to RFC 4329. [RFC4329]

Media Queries

Implementations must support some version of the Media Queries language. [MQ]

URIs, IRIs, IDNA

Implementations must support the semantics of URLs defined in the URI and IRI specifications, as well as the semantics of IDNA domain names defined in the Internationalizing Domain Names in Applications (IDNA) specification. [RFC3986] [RFC3987] [RFC3490]

CSS modules

While support for CSS as a whole is not required of implementations of this specification (though it is encouraged, at least for Web browsers), some features are defined in terms of specific CSS requirements.

In particular, some features require that a string be parsed as a CSS <color> value. When parsing a CSS value, user agents are required by the CSS specifications to apply some error handling rules. These apply to this specification also. [CSSCOLOR] [CSS]

For example, user agents are required to close all open constructs upon finding the end of a style sheet unexpectedly. Thus, when parsing the string "rgb(0,0,0" (with a missing close-parenthesis) for a color value, the close parenthesis is implied by this error handling rule, and a value is obtained (the color 'black'). However, the similar construct "rgb(0,0," (with both a missing parenthesis and a missing "blue" value) cannot be parsed, as closing the open construct does not result in a viable value.
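
The behavior in this example can be sketched as follows. This is a deliberately simplified, non-normative illustration that only understands integer `rgb()` values; it is not a CSS parser, and the helper names `close_open_constructs` and `parse_rgb` are hypothetical.

```python
import re

# Accepts only the simple integer form rgb(r, g, b); an assumption of this sketch.
_RGB = re.compile(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)")


def close_open_constructs(text: str) -> str:
    """Append a ')' for every '(' that is still open at the end of the input."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
    return text + ")" * depth


def parse_rgb(text: str):
    """Return an (r, g, b) tuple, or None when no viable value results."""
    match = _RGB.fullmatch(close_open_constructs(text.strip()))
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```

With these helpers, `parse_rgb("rgb(0,0,0")` yields `(0, 0, 0)`, while `parse_rgb("rgb(0,0,")` yields `None`, because closing the parenthesis still leaves the third component missing.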

This specification does not require support of any particular network protocol, style sheet language, scripting language, or any of the DOM specifications beyond those described above. However, the language described by this specification is biased towards CSS as the styling language, JavaScript as the scripting language, and HTTP as the network protocol, and several features assume that those languages and protocols are in use.

This specification might have certain additional requirements on character encodings, image formats, audio formats, and video formats in the respective sections.

2.2.2 Extensibility

ISSUE-41 (Decentralized-extensibility) blocks progress to Last Call

HTML has a wide number of extensibility mechanisms that can be used for adding semantics in a safe manner:


Vendor-specific proprietary user agent extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.

If such extensions are nonetheless needed, e.g. for experimental purposes, then vendors are strongly urged to use one of the following extension mechanisms:

For markup-level features that can be limited to the XML serialization and need not be supported in the HTML serialization, vendors should use the namespace mechanism to define custom namespaces in which the non-standard elements and attributes are supported.

For markup-level features that are intended for use with the HTML syntax, extensions should be limited to new attributes of the form "x-vendor-feature", where vendor is a short string that identifies the vendor responsible for the extension, and feature is the name of the feature. New element names should not be created. Using attributes for such extensions exclusively allows extensions from multiple vendors to co-exist on the same element, which would not be possible with elements. Using the "x-vendor-feature" form allows extensions to be made without risk of conflicting with future additions to the specification.

For instance, a browser named "FerretBrowser" could use "ferret" as a vendor prefix, while a browser named "Mellblom Browser" could use "mb". If both of these browsers invented extensions that turned elements into scratch-and-sniff areas, an author experimenting with these features could write:

<p>This smells of lemons!
<span x-ferret-smellovision x-ferret-smellcode="LEM01"
      x-mb-outputsmell x-mb-smell="lemon juice"></span></p>

Attribute names beginning with the two characters "x-" are reserved for user agent use and are guaranteed to never be formally added to the HTML language. For flexibility, attribute names containing underscores (the U+005F LOW LINE character) are also reserved for experimental purposes and are guaranteed to never be formally added to the HTML language.

Pages that use such attributes are by definition non-conforming.
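
A conformance checker could flag such attributes with a test along these lines. This is a non-normative sketch: the function name `is_reserved_extension_attribute` is hypothetical, and it assumes the attribute name is passed exactly as it appears in the DOM.

```python
def is_reserved_extension_attribute(name: str) -> bool:
    """True for names starting with "x-" or containing U+005F LOW LINE."""
    return name.startswith("x-") or "_" in name
```

For the markup shown earlier, `x-ferret-smellcode` and `x-mb-smell` are both reported as reserved, whereas a standard attribute name such as `class` is not.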

For DOM extensions, e.g. new methods and IDL attributes, the new members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification.

For events, experimental event names should be prefixed with vendor-specific strings.

For example, if a user agent called "Pleasold" were to add an event to indicate when the user is going up in an elevator, it could use the prefix "pleasold" and thus name the event "pleasoldgoingup", possibly with an event handler attribute named "onpleasoldgoingup".

All extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.

For example, while strongly discouraged from doing so, an implementation "Foo Browser" could add a new IDL attribute "fooTypeTime" to a control's DOM interface that returned, say, the time it took the user to select the current value of a control. On the other hand, defining a new control that appears in a form's elements array would be in violation of the above requirement, as it would violate the definition of elements given in this specification.

When adding new reflecting IDL attributes corresponding to content attributes of the form "x-vendor-feature", the IDL attribute should be named "vendorFeature" (i.e. the "x" is dropped from the IDL attribute's name).
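
The naming rule above can be sketched as a small conversion function. This is non-normative; the name `idl_attribute_name` is hypothetical, and the camel-casing of a feature name that itself contains further hyphens is an assumption of this sketch, since the text only gives the "vendorFeature" pattern.

```python
def idl_attribute_name(content_attribute: str) -> str:
    """Map "x-vendor-feature" to "vendorFeature" by dropping the leading "x"."""
    parts = content_attribute.split("-")
    if len(parts) < 3 or parts[0] != "x" or not all(parts[1:]):
        raise ValueError(f"not of the form x-vendor-feature: {content_attribute!r}")
    vendor, *feature = parts[1:]
    # Assumption: each remaining hyphen-separated word is capitalized in turn.
    return vendor + "".join(word.capitalize() for word in feature)
```

Under these assumptions, `x-ferret-smellcode` maps to `ferretSmellcode` and `x-mb-outputsmell` maps to `mbOutputsmell`.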


When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. When someone applying this specification to their activities decides that they will recognize the requirements of such an extension specification, it becomes an applicable specification for the purposes of conformance requirements in this specification.

Someone could write a specification that defines any arbitrary byte stream as conforming, and then claim that their random junk is conforming. However, that does not mean that their random junk actually is conforming for everyone's purposes: if someone else decides that that specification does not apply to their work, then they can quite legitimately say that the aforementioned random junk is just that, junk, and not conforming at all. As far as conformance goes, what matters in a particular community is what that community agrees is applicable.


User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them.

When support for a feature is disabled (e.g. as an emergency measure to mitigate a security problem, or to aid in development, or for performance reasons), user agents must act as if they had no support for the feature whatsoever, and as if the feature was not mentioned in this specification. For example, if a particular feature is accessed via an attribute in a Web IDL interface, the attribute itself would be omitted from the objects that implement that interface — leaving the attribute on the object but making it return null or throw an exception is insufficient.

2.3 Case-sensitivity and string comparison

Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.

Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.

Comparing two strings in a compatibility caseless manner means using the Unicode compatibility caseless match operation to compare the two strings. [UNICODE]
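
As a non-normative sketch, the compatibility caseless match can be approximated in Python with `str.casefold()` and `unicodedata.normalize()`. It assumes the Unicode Standard's formulation NFKD(toCasefold(NFKD(toCasefold(NFD(X))))) and depends on the Unicode version bundled with the Python runtime; the function names are hypothetical.

```python
import unicodedata


def _compatibility_fold(text: str) -> str:
    """Apply NFD, then two rounds of case folding followed by NFKD."""
    decomposed = unicodedata.normalize("NFD", text)
    once = unicodedata.normalize("NFKD", decomposed.casefold())
    return unicodedata.normalize("NFKD", once.casefold())


def compatibility_caseless_match(a: str, b: str) -> bool:
    """True when both strings fold to the same sequence of code points."""
    return _compatibility_fold(a) == _compatibility_fold(b)
```

Under this sketch the ligature U+FB01 matches "FI", and U+212B ANGSTROM SIGN matches U+00E5, neither of which would match in an ASCII case-insensitive comparison.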

Except where otherwise stated, string comparisons must be performed in a case-sensitive manner.

Converting a string to ASCII uppercase means replacing all characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) with the corresponding characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).

Converting a string to ASCII lowercase means replacing all characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) with the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z).
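
The ASCII-only operations defined above differ from general-purpose case mapping because they leave every character outside A-Z and a-z untouched. A minimal, non-normative sketch follows; the function names are hypothetical.

```python
def ascii_lowercase(text: str) -> str:
    """Replace U+0041..U+005A with U+0061..U+007A; leave everything else alone."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in text)


def ascii_uppercase(text: str) -> str:
    """Replace U+0061..U+007A with U+0041..U+005A; leave everything else alone."""
    return "".join(chr(ord(c) - 0x20) if "a" <= c <= "z" else c for c in text)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Compare code point for code point, treating A-Z and a-z as equal."""
    return ascii_lowercase(a) == ascii_lowercase(b)
```

Python's built-in `str.lower()` and `str.upper()` are not substitutes here: for example, `str.lower()` maps U+212A KELVIN SIGN to "k", whereas `ascii_lowercase` leaves it unchanged, so "K" and U+212A do not match ASCII case-insensitively.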

A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.
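
The prefix match definition can be sketched as follows. This is non-normative; the function name `is_prefix_match` and its `matches` parameter are hypothetical, with the parameter standing in for whichever comparison mode (case-sensitive by default) applies in context.

```python
import operator
from typing import Callable


def is_prefix_match(
    pattern: str,
    s: str,
    matches: Callable[[str, str], bool] = operator.eq,
) -> bool:
    """True when pattern is no longer than s and matches s truncated to its length."""
    if len(pattern) > len(s):
        return False
    return matches(pattern, s[: len(pattern)])
```

For instance, `is_prefix_match("text/", "text/javascript")` is true, while a pattern longer than the string it is tested against never matches.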

2.4 UTF-8

When a user agent is required to decode a byte string as UTF-8, with error handling, it means that the byte stream must be converted to a Unicode string by interpreting it as UTF-8, except that any errors must be handled as described in the following list. Bytes in the following list are represented in hexadecimal. [RFC3629]

  • One byte in the range FE to FF
  • Overlong forms (e.g. F0 80 80 A0)
  • One byte in the range C0 to C1, followed by one byte in the range 80 to BF
  • One byte in the range F0 to F4, followed by three bytes in the range 80 to BF that represent a code point above U+10FFFF
  • One byte in the range F5 to F7, followed by three bytes in the range 80 to BF
  • One byte in the range F8 to FB, followed by four bytes in the range 80 to BF
  • One byte in the range FC to FD, followed by five bytes in the range 80 to BF
  • One byte in the range E0 to FD, followed by a byte in the range 80 to BF, not followed by a byte in the range 80 to BF
  • One byte in the range F0 to FD, followed by two bytes in the range 80 to BF, not followed by a byte in the range 80 to BF
  • One byte in the range F5 to FD, followed by three bytes in the range 80 to BF, not followed by a byte in the range 80 to BF
  • One byte in the range FC to FD, followed by four bytes in the range 80 to BF, not followed by a byte in the range 80 to BF

The whole sequence must be replaced by a single U+FFFD REPLACEMENT CHARACTER.

  • One byte in the range 80 to BF not preceded by a byte in the range 80 to FD
  • A sequence of bytes in the range 80 to BF that does not follow a byte in the range C0 to FD
  • One byte in the range C0 to FD not followed by a byte in the range 80 to BF

Each byte must be replaced with a U+FFFD REPLACEMENT CHARACTER.

For the purposes of the above requirements, an overlong form in UTF-8 is a sequence that encodes a codepoint using more bytes than the minimum needed to encode that codepoint in UTF-8.

For example, the byte string "41 98 BA 42 E2 98 43 E2 98 BA E2 98" would be converted to the string "A��B�C☺�".
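
The example above can be reproduced with Python's built-in decoder using the `replace` error handler. This is only a non-normative illustration of that one byte string: Python's error handling is not guaranteed to agree with the list above in every case (overlong forms, for instance, may produce a different number of replacement characters).

```python
# The byte string from the example, written in hexadecimal.
data = bytes.fromhex("41 98 BA 42 E2 98 43 E2 98 BA E2 98")

# errors="replace" substitutes U+FFFD REPLACEMENT CHARACTER for malformed input.
decoded = data.decode("utf-8", errors="replace")

# Code points of the result, e.g. for inspection in a debugger or log.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
```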

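Translator's note: Throughout the translation of this specification, the Korean term "속성" renders attribute. The term property could have been rendered as "구성원" or "구성요소", but because those renderings were judged more likely to cause confusion, it is transliterated as "프로퍼티" instead.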

Translator's note: Surrogates: a pair of 16-bit code units that together represent, in UTF-16, a character that cannot be represented by a single 16-bit code unit.