Getting Started
The Components of a URL
A web URL follows a standard form that can be broken down into a few key parts, diagrammed in Figure
1-3. Each segment of the URL communicates specific information to both the client and the server.
http://www.example.com/examples/example.html
Protocol: http
Hostname: www.example.com (prefix: www, domain: example.com)
Path: /examples/
File: example.html (name: example, extension: html)
Figure 1-3. The components of a URL
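The breakdown in Figure 1-3 can be reproduced in a few lines of code. The following is a minimal sketch in Python, which is an assumption of this example and not a language this chapter otherwise uses. It relies only on the standard library's urlsplit function and PurePosixPath class; the components dictionary and its labels are illustrative names chosen to match the figure.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# The example URL from Figure 1-3.
url = "http://www.example.com/examples/example.html"

parts = urlsplit(url)                   # splits off the protocol and hostname
file_path = PurePosixPath(parts.path)   # treats the rest as a slash-separated path

# Labels follow the figure; "components" is an illustrative name.
components = {
    "protocol": parts.scheme,           # text before "://"
    "hostname": parts.hostname,         # prefix plus domain
    "path": str(file_path.parent),      # directory holding the file
    "file name": file_path.stem,        # file name without its extension
    "extension": file_path.suffix,      # extension, including the leading dot
}

for label, value in components.items():
    print(f"{label}: {value}")
```

Note that the sketch reports the extension with its leading dot and the path without a trailing slash, which is simply how these library calls present them.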
The protocol indicates one of a few different sets of rules that dictate the movement of data over the
Internet. The Web uses HyperText Transfer Protocol (HTTP), the standard protocol used for transmitting
hypertext-encoded data from one computer to another. The protocol is separated from the rest of the URL
by a colon and two forward slashes (://).
A hostname is the name of the site from which the browser will retrieve the file. The web server’s true
address is a unique numeric Internet Protocol (IP) address, and every computer connected to the Internet
has one. IP addresses look something like “66.211.109.45,” which isn’t very easy on the eyes and is
certainly a challenge to remember. A domain name is a more memorable alias that directs Internet traffic
to an IP address. Many web hostnames feature a domain prefix, further naming the particular server being
accessed (especially when there are multiple servers within a single domain), though that prefix is
frequently optional. A prefix can be almost any short text label, but “www” is traditional. It’s possible for
another entire website to exist separately within a domain under a different prefix, known as a subdomain.
A hostname also ends with a domain suffix (sometimes called an extension) that indicates the domain’s
category, such as “.com” for a commercial domain, “.edu” for a U.S. educational institution, or “.co.uk”
for a commercial website in the United Kingdom. Every country also has its own domain extension, and
you’ll often see URLs that indicate a country of origin but not any particular category.
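To make the prefix, domain name, and suffix concrete, here is a rough sketch in Python (an assumed language for illustration). Because a suffix such as “.co.uk” contains a dot of its own, a hostname cannot be divided correctly by counting dots alone, so the sketch checks against a list of suffixes. KNOWN_SUFFIXES and split_hostname are hypothetical names, and the three-entry list is deliberately tiny; a real tool would need a far more complete list.

```python
# Hypothetical, deliberately tiny suffix list, for illustration only.
KNOWN_SUFFIXES = ("co.uk", "com", "edu")


def split_hostname(hostname):
    """Return (prefix, name, suffix) for a hostname ending in a known suffix."""
    # Try longer suffixes first so multi-part suffixes such as "co.uk" win.
    for suffix in sorted(KNOWN_SUFFIXES, key=len, reverse=True):
        if hostname.endswith("." + suffix):
            rest = hostname[: -len(suffix) - 1]   # drop the dot and the suffix
            prefix, _, name = rest.rpartition(".")
            return prefix, name, suffix
    raise ValueError(f"unrecognized suffix in {hostname!r}")


print(split_hostname("www.example.com"))   # ('www', 'example', 'com')
print(split_hostname("example.co.uk"))     # ('', 'example', 'co.uk')
```

The empty prefix in the second result reflects the point made above: the prefix is frequently optional.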
The path specifies the directory on the web server that holds the requested document, just as you save
files in different virtual folders on your own computer. Files on a web server may be stored in
subdirectories—folders within folders—and each directory in the path is separated by a forward slash (/).
This path is the route a client will follow to reach the ultimate destination file. The top-level directory of a
website (the one that contains all other files and directories) is called the site root directory. It isn’t named
in the URL; the first forward slash after the hostname stands for it.
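The way a path lists directories in order can be sketched as follows, again in Python as an assumed illustration language. The split_path function and the deeper path in the second call are hypothetical, and the sketch assumes the path ends in a file name instead of a trailing slash.

```python
def split_path(url_path):
    """Split a URL path into (directories, file_name).

    Assumes the path ends in a file name, not a trailing slash.
    """
    # Remove the outer slashes, then break the path at each remaining slash.
    *directories, file_name = url_path.strip("/").split("/")
    return directories, file_name


# The path from Figure 1-3: one directory below the site root.
print(split_path("/examples/example.html"))
# (['examples'], 'example.html')

# A hypothetical deeper path: a folder within a folder.
print(split_path("/examples/chapter1/example.html"))
# (['examples', 'chapter1'], 'example.html')
```

A file stored directly in the site root, such as /example.html, yields an empty list of directories.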
The specific file to retrieve is identified by its file name and extension. You can give your files just about
any name you want, and a file extension indicates what type of file it is. An HTML (or XHTML) document
will have an extension of .html or .htm (the shorter version is used on some servers that support only
three-letter file extensions). CSS files use the .css extension, JavaScript files use .js, and so forth. Web