Contents
    1. 2.6 URL
      1. 2.6.1 Terminology
      2. 2.6.2 Dynamic changes to base URLs
      3. 2.6.3 Interfaces for URL manipulation

2.6 URL

ISSUE-56 (urls-webarch) blocks progress to Last Call

2.6.1 Terminology

A URL is a string used to identify a resource.

A URL is a valid URL if at least one of the following conditions holds:

A string is a valid non-empty URL if it is a valid URL but it is not the empty string.

A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid URL.

A string is a valid non-empty URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid non-empty URL.
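The four string classes defined above build on one another, which can be sketched as small predicates. This is an illustrative sketch only: `toy_is_valid_url` is a hypothetical stand-in for the real validity check (which depends on the URI and IRI grammars), and the set of characters stripped is assumed to be HTML's space characters, which the text above refers to only as "whitespace".

```python
from typing import Callable

# Assumption: "whitespace" means the HTML space characters
# (space, tab, line feed, form feed, carriage return).
SPACE_CHARACTERS = " \t\n\f\r"

UrlCheck = Callable[[str], bool]


def toy_is_valid_url(s: str) -> bool:
    """Hypothetical placeholder: accepts any string without space characters."""
    return not any(ch in SPACE_CHARACTERS for ch in s)


def is_valid_non_empty_url(s: str, is_valid_url: UrlCheck) -> bool:
    return s != "" and is_valid_url(s)


def is_valid_url_potentially_surrounded_by_spaces(s: str, is_valid_url: UrlCheck) -> bool:
    return is_valid_url(s.strip(SPACE_CHARACTERS))


def is_valid_non_empty_url_potentially_surrounded_by_spaces(
    s: str, is_valid_url: UrlCheck
) -> bool:
    return is_valid_non_empty_url(s.strip(SPACE_CHARACTERS), is_valid_url)
```

Note that a string consisting only of spaces passes the "potentially surrounded by spaces" check (the stripped result is the empty string, which this sketch treats as valid) but fails the non-empty variant.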

To parse a URL url into its component parts, the user agent must use the parse an address algorithm defined by the IRI specification. [RFC3987]

Parsing a URL can fail. If it does not, then it results in the following components, again as defined by the IRI specification:


To resolve a URL to an absolute URL relative to either another absolute URL or an element, the user agent must use the following steps. Resolving a URL can result in an error, in which case the URL is not resolvable.

  1. Let url be the URL being resolved.

  2. Let encoding be determined as follows:

    If the URL had a character encoding defined when the URL was created or defined
        The URL character encoding is as defined.
    If the URL came from a script (e.g. as an argument to a method)
        The URL character encoding is the script's URL character encoding.
    If the URL came from a DOM node (e.g. from an element)
        The node has a Document, and the URL character encoding is the document's character encoding.

  3. If encoding is a UTF-16 encoding, then change the value of encoding to UTF-8.

  4. If the algorithm was invoked with an absolute URL to use as the base URL, let base be that absolute URL.

    Otherwise, let base be the base URI of the element, as defined by the XML Base specification, with the base URI of the document entity being defined as the document base URL of the Document that owns the element. [XMLBASE]

    For the purposes of the XML Base specification, user agents must act as if all Document objects represented XML documents.

    It is possible for xml:base attributes to be present even in HTML fragments, as such attributes can be added dynamically using script. (Such scripts would not be conforming, however, as xml:base attributes are not allowed in HTML documents.)

    The document base URL of a Document object is the absolute URL obtained by running these substeps:

    1. Let fallback base url be the document's address.

    2. If fallback base url is about:blank, and the Document's browsing context has a creator browsing context, then let fallback base url be the document base URL of the creator Document instead.

    3. If the Document is an iframe srcdoc document, then let fallback base url be the document base URL of the Document's browsing context's browsing context container's Document instead.

    4. If there is no base element that has an href attribute, then the document base URL is fallback base url. Otherwise, let url be the value of the href attribute of the first such element.

    5. Resolve url relative to fallback base url (thus, the base href attribute isn't affected by xml:base attributes).

    6. The document base URL is the result of the previous step if it was successful; otherwise it is fallback base url.

  5. Return the result of applying the resolve an address algorithm defined by the IRI specification to resolve url relative to base using encoding encoding. [RFC3987]
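The encoding adjustment and the document base URL substeps above can be approximated in a few lines. This is a rough sketch under several assumptions: Python's `urllib.parse.urljoin` (which follows RFC 3986 style reference resolution) stands in for the IRI specification's resolve an address algorithm, it never reports failure, and it ignores the encoding. The function and parameter names are hypothetical, and xml:base handling is omitted.

```python
from typing import Optional
from urllib.parse import urljoin


def effective_encoding(encoding: str) -> str:
    """Step 3: a UTF-16 encoding is replaced by UTF-8."""
    if encoding.lower() in {"utf-16", "utf-16le", "utf-16be"}:
        return "utf-8"
    return encoding


def document_base_url(
    address: str,
    base_href: Optional[str] = None,
    creator_base_url: Optional[str] = None,
    srcdoc_container_base_url: Optional[str] = None,
) -> str:
    # Substep 1: start from the document's address.
    fallback = address
    # Substep 2: about:blank inherits from the creator document, if any.
    if fallback == "about:blank" and creator_base_url is not None:
        fallback = creator_base_url
    # Substep 3: an iframe srcdoc document uses its container's base URL.
    if srcdoc_container_base_url is not None:
        fallback = srcdoc_container_base_url
    # Substep 4: without a <base href>, the fallback is the answer.
    if base_href is None:
        return fallback
    # Substeps 5 and 6: resolve the href against the fallback base URL.
    # (urljoin cannot fail, so the error branch is not modelled here.)
    return urljoin(fallback, base_href)


def resolve_url(url: str, base: str) -> str:
    """Step 5, with urljoin as an assumed stand-in for the IRI algorithm."""
    return urljoin(base, url)
```

For example, a document at `http://example.com/a/b.html` with a base element whose href is `/docs/` would resolve `img/x.png` to `http://example.com/docs/img/x.png` in this sketch.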

A URL is an absolute URL if resolving it results in the same output regardless of what it is resolved relative to, and that output is not a failure.

An absolute URL is a hierarchical URL if, when resolved and then parsed, there is a character immediately after the <scheme> component and it is a U+002F SOLIDUS character (/).

An absolute URL is an authority-based URL if, when resolved and then parsed, there are two characters immediately after the <scheme> component and they are both U+002F SOLIDUS characters (//).
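The two classifications above depend only on the characters that follow the scheme. A minimal sketch, assuming the input string is already a resolved absolute URL, that "immediately after the <scheme> component" means immediately after the colon that terminates the scheme, and using hypothetical function names:

```python
def _after_scheme(absolute_url: str) -> str:
    """Return everything after the first colon (the scheme delimiter)."""
    _scheme, separator, rest = absolute_url.partition(":")
    return rest if separator else ""


def is_hierarchical_url(absolute_url: str) -> bool:
    # One U+002F SOLIDUS directly after the scheme.
    return _after_scheme(absolute_url).startswith("/")


def is_authority_based_url(absolute_url: str) -> bool:
    # Two U+002F SOLIDUS characters directly after the scheme.
    return _after_scheme(absolute_url).startswith("//")
```

Under these definitions every authority-based URL is also hierarchical, while a URL such as `mailto:user@example.com` is neither.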


This specification defines the URL about:legacy-compat as a reserved, though unresolvable, about: URI, for use in DOCTYPEs in HTML documents when needed for compatibility with XML tools. [ABOUT]

This specification defines the URL about:srcdoc as a reserved, though unresolvable, about: URI, that is used as the document's address of iframe srcdoc documents. [ABOUT]

The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term "URL" as used herein is really called something else altogether. This is a willful violation of RFC 3986. [RFC3986]

2.6.2 Dynamic changes to base URLs

When an xml:base attribute changes, the attribute's element, and all descendant elements, are affected by a base URL change.

When a document's document base URL changes, all elements in that document are affected by a base URL change.

When an element is moved from one document to another, if the two documents have different base URLs, then that element and all its descendants are affected by a base URL change.

When an element is affected by a base URL change, it must act as described in the following list:

If the element creates a hyperlink

If the absolute URL identified by the hyperlink is being shown to the user, or if any data derived from that URL is affecting the display, then the href attribute should be re-resolved relative to the element and the UI updated appropriately.

For example, the CSS :link/:visited pseudo-classes might have been affected.

If the element is a q, blockquote, section, article, ins, or del element with a cite attribute

If the absolute URL identified by the cite attribute is being shown to the user, or if any data derived from that URL is affecting the display, then the URL should be re-resolved relative to the element and the UI updated appropriately.

Otherwise

The element is not directly affected.

Changing the base URL doesn't affect the image displayed by img elements, although subsequent accesses of the src IDL attribute from script will return a new absolute URL that might no longer correspond to the image being shown.
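The img behaviour described above can be illustrated with a toy model. This is only a sketch: `ImageElement` is a hypothetical stand-in rather than a DOM interface, and `urllib.parse.urljoin` is assumed as an approximation of URL resolution.

```python
from urllib.parse import urljoin


class ImageElement:
    """Hypothetical model of an img element; not a real DOM API."""

    def __init__(self, src_attribute: str, base_url: str) -> None:
        self.src_attribute = src_attribute
        self.base_url = base_url
        # The image that is displayed is fixed when the element is created.
        self.displayed_url = urljoin(base_url, src_attribute)

    @property
    def src(self) -> str:
        # The IDL attribute is re-resolved against the current base URL.
        return urljoin(self.base_url, self.src_attribute)


img = ImageElement("logo.png", "http://example.com/a/")
img.base_url = "http://example.com/b/"  # simulate a base URL change
```

After the change, `img.src` reports a URL under `/b/` while `img.displayed_url` still points under `/a/`, which is the mismatch the paragraph above warns about.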

2.6.3 Interfaces for URL manipulation

An interface that has a complement of URL decomposition IDL attributes will have seven attributes with the following definitions:

   attribute DOMString protocol;
   attribute DOMString host;
   attribute DOMString hostname;
   attribute DOMString port;
   attribute DOMString pathname;
   attribute DOMString search;
   attribute DOMString hash;
o . protocol [ = value ]

Returns the current scheme of the underlying URL.

Can be set, to change the underlying URL's scheme.

o . host [ = value ]

Returns the current host and port (if it's not the default port) in the underlying URL.

Can be set, to change the underlying URL's host and port.

The host and the port are separated by a colon. The port part, if omitted, will be assumed to be the current scheme's default port.

o . hostname [ = value ]

Returns the current host in the underlying URL.

Can be set, to change the underlying URL's host.

o . port [ = value ]

Returns the current port in the underlying URL.

Can be set, to change the underlying URL's port.

o . pathname [ = value ]

Returns the current path in the underlying URL.

Can be set, to change the underlying URL's path.

o . search [ = value ]

Returns the current query component in the underlying URL.

Can be set, to change the underlying URL's query component.

o . hash [ = value ]

Returns the current fragment identifier in the underlying URL.

Can be set, to change the underlying URL's fragment identifier.


The attributes defined to be URL decomposition IDL attributes must act as described for the attributes with the same corresponding names in this section.

In addition, an interface with a complement of URL decomposition IDL attributes will define an input, which is a URL that the attributes act on, and a common setter action, which is a set of steps invoked when any of the attributes' setters are invoked.

The seven URL decomposition IDL attributes have similar requirements.

On getting, if the input is an absolute URL that fulfills the condition given in the "getter condition" column corresponding to the attribute in the table below, the user agent must return the part of the input URL given in the "component" column, with any prefixes specified in the "prefix" column appropriately added to the start of the string and any suffixes specified in the "suffix" column appropriately added to the end of the string. Otherwise, the attribute must return the empty string.

On setting, the new value must first be mutated as described by the "setter preprocessor" column, then mutated by %-escaping any characters in the new value that are not valid in the relevant component as given by the "component" column. Then, if the input is an absolute URL and the resulting new value fulfills the condition given in the "setter condition" column, the user agent must make a new string output by replacing the component of the URL given by the "component" column in the input URL with the new value; otherwise, the user agent must let output be equal to the input. Finally, the user agent must invoke the common setter action with the value of output.

When replacing a component in the URL, if the component is part of an optional group in the URL syntax consisting of a character followed by the component, the component (including its prefix character) must be included even if the new value is the empty string.

The previous paragraph applies in particular to the ":" before a <port> component, the "?" before a <query> component, and the "#" before a <fragment> component.
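As an illustration of the rule that the prefix character survives an empty new value, here is a minimal sketch of setting the query component. It assumes the input is already known to be a hierarchical absolute URL, omits the %-escaping step, and uses a hypothetical function name.

```python
def set_search(input_url: str, new_value: str) -> str:
    """Replace the <query> component, always keeping the "?" prefix."""
    # Setter preprocessor: remove one leading "?", if any.
    if new_value.startswith("?"):
        new_value = new_value[1:]
    # Separate the fragment (if any) so that it is preserved unchanged.
    before_fragment, hash_separator, fragment = input_url.partition("#")
    # Drop the old query (if any); everything before "?" is kept.
    before_query = before_fragment.partition("?")[0]
    return before_query + "?" + new_value + hash_separator + fragment
```

In this sketch, setting the value to the empty string on `http://example.com/?test#x` leaves the "?" in place rather than removing the query component entirely.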

For the purposes of the above definitions, URLs must be parsed using the URL parsing rules defined in this specification.

| Attribute | Component | Getter Condition | Prefix | Suffix | Setter Preprocessor | Setter Condition |
|---|---|---|---|---|---|---|
| protocol | <scheme> | - | - | U+003A COLON (:) | Remove all trailing U+003A COLON characters (:) | The new value is not the empty string |
| host | <hostport> | input is an authority-based URL | - | - | - | The new value is not the empty string and input is an authority-based URL |
| hostname | <host> | input is an authority-based URL | - | - | Remove all leading U+002F SOLIDUS characters (/) | The new value is not the empty string and input is an authority-based URL |
| port | <port> | input is an authority-based URL, and contained a <port> component (possibly an empty one) | - | - | Remove all characters in the new value from the first that is not in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), if any. Remove any leading U+0030 DIGIT ZERO characters (0) in the new value. If the resulting string is empty, set it to a single U+0030 DIGIT ZERO character (0). | input is an authority-based URL, and the new value, when interpreted as a base-ten integer, is less than or equal to 65535 |
| pathname | <path> | input is a hierarchical URL | - | - | If it has no leading U+002F SOLIDUS character (/), prepend a U+002F SOLIDUS character (/) to the new value | input is hierarchical |
| search | <query> | input is a hierarchical URL, and contained a <query> component (possibly an empty one) | U+003F QUESTION MARK (?) | - | Remove one leading U+003F QUESTION MARK character (?), if any | input is a hierarchical URL |
| hash | <fragment> | input contained a non-empty <fragment> component | U+0023 NUMBER SIGN (#) | - | Remove one leading U+0023 NUMBER SIGN character (#), if any | - |
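The "setter preprocessor" entries in the table above are simple string transformations. The following sketch collects them in one hypothetical helper and adds the port setter condition; it does not perform the %-escaping or the component replacement, and the function names are illustrative only.

```python
import re


def preprocess(attribute: str, new_value: str) -> str:
    """Apply the setter preprocessor for the given attribute name."""
    if attribute == "protocol":
        return new_value.rstrip(":")          # remove all trailing colons
    if attribute == "hostname":
        return new_value.lstrip("/")          # remove all leading solidi
    if attribute == "port":
        # Keep the leading run of ASCII digits, drop leading zeros,
        # and fall back to "0" if nothing is left.
        digits = re.match(r"[0-9]*", new_value).group(0)
        return digits.lstrip("0") or "0"
    if attribute == "pathname":
        return new_value if new_value.startswith("/") else "/" + new_value
    if attribute == "search":
        return new_value[1:] if new_value.startswith("?") else new_value
    if attribute == "hash":
        return new_value[1:] if new_value.startswith("#") else new_value
    return new_value                          # host: no preprocessor


def port_setter_condition(value: str, input_is_authority_based: bool) -> bool:
    """The preprocessed port must fit in the range 0 to 65535."""
    return input_is_authority_based and int(value) <= 65535
```

For instance, a script assigning `"0080xyz"` to port would end up with the value `80` in this sketch, and a value of `65536` would fail the setter condition, leaving the URL unchanged.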

The table below demonstrates how the getter condition for search results in different results depending on the exact original syntax of the URL:

| Input URL | search value | Explanation |
|---|---|---|
| http://example.com/ | empty string | No <query> component in input URL. |
| http://example.com/? | ? | There is a <query> component, but it is empty. The question mark in the resulting value is the prefix. |
| http://example.com/?test | ?test | The <query> component has the value "test". |
| http://example.com/?test# | ?test | The (empty) <fragment> component is not part of the <query> component. |

The following table is similar; it provides a list of what each of the URL decomposition IDL attributes returns for a given input URL.

| Input | protocol | host | hostname | port | pathname | search | hash |
|---|---|---|---|---|---|---|---|
| http://example.com/carrot#question%3f | http: | example.com | example.com | empty string | /carrot | empty string | #question%3f |
| https://www.example.com:4443? | https: | www.example.com:4443 | www.example.com | 4443 | / | ? | empty string |
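The getter values listed above can be reproduced with a small sketch. It makes several assumptions that are not part of this specification: the generic URI-splitting regular expression from RFC 3986, Appendix B stands in for the parsing rules; any string with a scheme is treated as absolute; an empty path in an authority-based URL is reported as "/" (mirroring the second example); user information is discarded; and default ports are not elided from host. The name `decompose` is hypothetical.

```python
import re
from typing import Dict

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
_URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

ATTRIBUTES = ("protocol", "host", "hostname", "port", "pathname", "search", "hash")


def decompose(url: str) -> Dict[str, str]:
    """Return the seven getter values for url (empty string when not applicable)."""
    result = {name: "" for name in ATTRIBUTES}
    match = _URI_RE.match(url)
    if match is None or match.group(2) is None:
        return result  # no scheme: treated as not absolute
    scheme, authority, path = match.group(2), match.group(4), match.group(5)
    query, fragment = match.group(7), match.group(9)

    authority_based = authority is not None
    hierarchical = authority_based or path.startswith("/")

    result["protocol"] = scheme + ":"           # suffix ":"
    if authority_based:
        hostport = authority.rpartition("@")[2]  # drop any userinfo
        if ":" in hostport and not hostport.endswith("]"):
            host, _, port = hostport.rpartition(":")
            result["port"] = port                # may be empty
        else:
            host = hostport
        result["host"] = hostport
        result["hostname"] = host
    if hierarchical:
        result["pathname"] = path or "/"         # assumption: empty path -> "/"
        if query is not None:
            result["search"] = "?" + query       # prefix "?" even if empty
    if fragment:
        result["hash"] = "#" + fragment          # only for non-empty fragments
    return result
```

Calling `decompose("https://www.example.com:4443?")` yields the values in the second row of the table, including the lone "?" for search and the empty string for hash.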