The basics of HTML

March 24th, 2007 by David Hammond

HTML (HyperText Markup Language) is the fundamental computer language used to make webpages. It organizes and defines all of the elements of the document structure — such as headings, paragraphs, lists, tables, and links — to give computer-readable context to everything on the page.

Here is what a simple HTML document looks like:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head>
    <title>My First Webpage</title>
  </head>
  <body>
    <p>This is a paragraph.</p>
  </body>
</html>

You’ll notice that a lot of things are surrounded by < and > characters. These are called “tags”. Tags are interpreted by the web browser as the start or end of an element. A “start tag” looks like <tagname> and an “end tag” looks like </tagname>. Everything in between is considered to be inside the element.

Not all elements have end tags. “img”, “input”, “link”, and “meta” are examples of elements that consist only of a start tag because they never have anything inside them. All of the elements I’ll discuss below have end tags.
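As a rough sketch of what an element without an end tag looks like, here is a paragraph containing an “img” element. The file name and the description text are made up for illustration:

```html
<p>Meet our mascot:
    <img src="mascot.png" alt="A smiling cartoon owl"></p>
```

Notice that there is no </img> anywhere. The image element is complete as soon as its start tag ends, while the surrounding “p” element still gets its usual end tag.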

Start tags can also have things called “attributes”, like the “lang” attribute you see above in the “html” tag. Attributes tell special information about the element. In this case, it says that the HTML document is written in the English language.

At the top of the example HTML code, you’ll notice a weird sort of tag. Something that begins with a <! is called a “declaration”. Declarations don’t actually define elements, and they never have end tags. The “DOCTYPE” declaration tells the web browser what document language you’re using. In this case, we’re using the “strict” version of HTML 4.01, which is probably the one you should be using for best practices. Another common type of declaration is a comment declaration, which looks like <!-- This is a comment --> and is used to add notes that won’t be displayed on the page.

Let’s go back to the elements. After the DOCTYPE declaration, every HTML document must start with an “html” element. Inside that are always two sections: the head, and then the body. The body is where you’ll find all of the content you see on the page. The head is for additional information to describe the page, such as its title, references to “style sheets” that add the fancy colors and layout the page will use, and so on. But the only thing required in the head is the “title” element. The title text is what you’ll see at the very top of your browser window, the name of your site on a Google search result, etc.
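To make the head section a little more concrete, here is a minimal sketch of a document whose head contains a title and a reference to a style sheet. The file name “style.css” and the page text are placeholders, not taken from a real site:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head>
    <!-- Required: the title of the page -->
    <title>City Zoo: Opening Hours</title>
    <!-- Optional: a reference to a style sheet file -->
    <link rel="stylesheet" type="text/css" href="style.css">
  </head>
  <body>
    <p>The zoo is open every day from 9 a.m. to 5 p.m.</p>
  </body>
</html>
```

If you removed the “link” line, the document would still be complete, because the title is the only thing the head must contain.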

In the body, you have a lot more types of elements available to you, but there are a few that you will be using frequently:

p:

This is a basic paragraph of text. Here is an example:

<p>The HTML language was invented by Tim Berners-Lee and
    was first published as an Internet draft in 1993. The
    grammar of the HTML language was defined in a
    language called SGML, the Standard Generalized Markup
    Language.</p>

<p>XML is another common language whose grammar is defined
    in SGML. Unlike HTML, XML doesn't have its own
    standard set of elements and attributes. Instead,
    other languages are built on top of the XML
    grammar.</p>

The “p” element is used incorrectly on many websites. You should never simply put a <p> at the end of a paragraph, and you shouldn’t have extra <p> tags just for spacing purposes. Extra spacing can be done using style sheets, which you can learn about later. “p” tags should only be used to surround actual paragraphs.
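To illustrate the difference, here is a sketch of the pattern to avoid, followed by the preferred form. The sentences are placeholders:

```html
<!-- Avoid: a lone p tag used as a separator, plus an empty one for spacing -->
Our shop opens at nine.<p>
<p></p>
We close at five.<p>

<!-- Prefer: each paragraph surrounded by its own start and end tags -->
<p>Our shop opens at nine.</p>
<p>We close at five.</p>
```

In the first version, the sentences are not actually inside any paragraph element, and the page contains paragraphs with nothing in them. In the second version, each sentence is clearly marked as its own paragraph.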

h1, h2, h3, h4, h5, and h6:

These are headings for sections of the page. You can think of the heading structure like a document outline: the main points of the document should use h1, the first sub-level should use h2, and so on. For example:

<h1>City Zoo</h1>

<p>Welcome to the City Zoo! Here you'll find information
    about what to expect in the zoo.</p>

<h2>Animals</h2>

<p>Be on the lookout for the following animals.</p>

<h3>Elephant</h3>

<p>The world's African elephant population is estimated
    to be between 400,000 and 660,000.</p>

<h3>Ostrich</h3>

<p>Because ostriches don't have teeth, they will swallow
    stones that help grind the food in their stomachs.</p>

<h2>Plants</h2>

<p>Here are some exotic plants you may see.</p>

<h3>Banana tree</h3>

<p>Supermarket bananas are called Cavendish bananas and
    don't exist naturally. Their existence is currently being
    threatened by a variant of the Panama Disease which drove
    the previous dominant species of banana to extinction in
    the 1950s...</p>

You should never pick a heading element purely based on its font size. The font size of any element may be adjusted in a style sheet.

ul, ol, and li:

These are used for lists. “ul” tags surround an “unordered list”, and “ol” tags surround an “ordered list”. Ordered lists are numbered, while unordered lists simply use bullet points. “li” tags surround the items in the list. Here is an example (recipe from Wikibooks, licensed under GNU FDL):

<h1>Recipe for Oatmeal Stout Brownies</h1>

<h2>Ingredients</h2>

<ul>
  <li>1 cup all-purpose flour</li>
  <li>3/4 cup unsweetened cocoa powder (Dutch-
      processed preferred)</li>
  <li>1/4 teaspoon salt</li>
  <li>6 tablespoons unsalted room temperature
      butter, cut into cubes</li>
  <li>8 ounces dark bittersweet chocolate,
      chopped</li>
  <li>3/4 cup white chocolate chips</li>
  <li>4 large eggs, at room temperature</li>
  <li>1 cup granulated sugar</li>
  <li>1 1/4 cups (10 ounces) Oatmeal Stout beer
      (room temperature)</li>
  <li>1 cup semi-sweet chocolate chips</li>
  <li>1 package (8 ounces) cream cheese,
      softened</li>
  <li>1/3 cup granulated sugar</li>
  <li>1/2 teaspoon vanilla</li>
  <li>1 large egg</li>
</ul>

<h2>Procedure</h2>

<ol>
  <li>Preheat oven to 375°F (190°C).</li>
  <li>Line a 9 x 13-inch baking pan with non-stick foil
      (or grease and flour - or parchment paper; your
      call).</li>
  <li>In a medium bowl, whisk together flour, cocoa
      powder, and salt until evenly combined. Set
      aside.</li>
  <li>Melt butter, bittersweet chocolate, and white
      chocolate chips in a double boiler over very low
      heat, stirring constantly until melted. Remove from
      heat.</li>
  <li>In a large mixing bowl, beat eggs and sugar on high
      speed until light and fluffy, about 3 minutes. Add
      melted chocolate mixture, beating until
      combined.</li>
  <li>Beat reserved flour mixture into melted chocolate
      mixture. Whisk in stout beer. The batter will seem a
      bit thin. Drop semisweet chocolate chips evenly on
      top of batter (some will sink in).</li>
  <li>Pour into prepared baking pan.</li>
  <li>Beat cream cheese in medium bowl with electric
      mixer on medium speed until smooth. Gradually beat
      in sugar. Beat in vanilla and 1 egg just until
      blended.</li>
  <li>Pour cream cheese mixture over brownie batter in
      pan; cut through mixture with knife several times
      for marbled design.</li>
  <li>Bake 25 to 35 minutes on center rack in the oven,
      until a toothpick inserted in the center comes out
      almost clean.</li>
  <li>Let brownies cool, uncovered, to room temperature.
      Dust with confectioners' sugar before serving if
      desired.</li>
</ol>

As with other HTML elements, don’t worry too much about what these list elements look like. Looks can always be changed by style sheets. If you want to add something to your page like a navigation bar — something that is, in concept, a list even if you don’t want it displayed vertically with bullets — you should use a “ul” element and style it later. This is because lots of programs like search engines will actually better understand the page if you use elements for what they mean rather than how they look. In the web design biz, this is called “semantic markup”.
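As a minimal sketch of that idea, a navigation bar could be marked up as a list of links like this. The page names are hypothetical, and the “a” element used for the links is covered in the next section:

```html
<ul>
  <li><a href="index.html">Home</a></li>
  <li><a href="animals.html">Animals</a></li>
  <li><a href="plants.html">Plants</a></li>
  <li><a href="visit.html">Plan your visit</a></li>
</ul>
```

Without a style sheet, this will display as an ordinary bulleted list. The markup still says what the content is, a list of places you can go, and the appearance can be changed later without touching the HTML.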

a:

The “a” element (short for “anchor”, for historical reasons) is used to create a link to somewhere else. It should always be used with an “href” (hypertext reference) attribute to tell where the link will go. Here is an example:

<p>For more information about web standards, visit the
    <a href="http://www.w3.org/">W3C site</a>.</p>

In the above example, the user will see the text “W3C site” (typically underlined and blue) which, when clicked, will take the person to http://www.w3.org/.

em and strong:

These give emphasis to a word or phrase. They have the same meaning, except “strong” is meant to indicate stronger emphasis. A typical web browser will display “em” text in italics and “strong” text in bold. If the web browser is configured to read the webpage aloud, “em” text will be spoken with emphasis and “strong” text will be spoken even louder. Here is an example:

<p>The em and strong elements provide <em>meaning</em> and
    <em>context</em>, unlike the i and b elements which
    only change <strong>visual appearance</strong>.</p>

For a complete list of available elements in HTML, see the HTML element index. Elements listed with “L” or “F” in the “DTD” column are not allowed in the “strict” version of HTML we’re using. It’s considered bad practice to use them in any version.

If you want to try out some HTML code of your own, head on over to my webpage test service. Just type the HTML and click “Display” to see the result below.

HTML is not a free-for-all language. While some popular browsers will try to let you get away with errors, you should always be careful that you’re putting your tags in the right places and following the standard rule sets. Luckily, there is a simple free tool to help check that your documents don’t contain HTML errors: the HTML Validator. If you ever come to an HTML support forum for help, this tool is usually the first thing they’ll use. Making sure that your HTML is always valid helps reduce your risk of headaches in the future.
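As one illustration of the kind of mistake a validator should flag, here is a sketch of overlapping tags next to a corrected version. The sentence is made up for the example:

```html
<!-- Invalid: the em element is closed after the p that contains it -->
<p>Please read this <em>carefully.</p></em>

<!-- Valid: elements are closed in the reverse order they were opened -->
<p>Please read this <em>carefully</em>.</p>
```

Many browsers will display both versions the same way, which is exactly why this sort of error is easy to overlook without a tool to check for it.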

For more information about HTML, see the Wikipedia article on HTML and the Wikibook on learning HTML.

To look at any webpage’s HTML code, right-click somewhere on the page and click “View Source” or “View Page Source”. Keep in mind that there tends to be more poorly-written HTML code on the Web than well-written HTML code, so don’t rely entirely on example.

There is another language called XHTML, based on HTML, which has been used more and more lately. You should avoid XHTML. At this point in time, you must have truly expert knowledge in web standards in order to use XHTML correctly, and then there are major compatibility problems if you do. As a result, almost every XHTML page on the Web has serious overlooked problems that have all but destroyed XHTML’s original goal on the Web. For technical information about the problems with XHTML, see Beware of XHTML.
