guiStuff:
::Stuff for the multi-spec coder;
Coding, formats, standards, and other practical things.
The DOM: Part 1 - Intro and API
To properly understand the Document Object Model before implementing any of it in practice, there are a few prerequisites. First, it's highly recommended that you read about the Document Tree, which is the relational model used in HTML/XHTML documents. That article deals with the information required to implement CSS Selectors and understand CSS property relations, but it also covers the document tree's various relationships, which you will have to be aware of in order to understand the material covered here. Specifically, I'm referring to relationships between elements: parent-child, ancestor-descendant, preceding/following sibling, etc.
It's also necessary to understand the difference between HTML and XHTML. While one could apply the Document Tree's relations to both, you'll see in a moment why it's very difficult, to say the least, to apply them to HTML, as opposed to applying them to XHTML.
HTML is based on SGML: Standard Generalized Markup Language. It defines data using tags, similar to how XML does. It has opening and closing tags, and allows tags to have attributes. The problem is that SGML is quite loose in terms of what it considers valid syntax. When it was first created, no one dreamed to what extremes it would be taken, so this wasn't initially a problem. However, when you try to apply models like the Document Tree and make them dynamic, you run into severe implementation issues. Here are some of them:
- While most tags require both a start and an end tag, some of them disallow end tags, such as <img> and <br> tags.
- Some tags have assumed, implied, or just optional end tags, like the <li> and <p> tags.
- Diagonal overlap of tags is tolerated by browsers, even though it isn't strictly valid. For example, browsers will happily render this: <b> one <i> two </b> three </i>. How can you apply a relational structure, like the Document Tree, to code like that?
- Some attributes don't require values: <td nowrap>.
- Attribute values may be defined without quotation marks: <img src=example.png>.
There are other aspects of HTML that hinder parsing and force the browser into massive amounts of guesswork, but they're either more subtle or too technical to get into here.
As opposed to HTML, XHTML is based on XML. XML was created with relational models and dynamic access and modification in mind, and so implements the following rules:
- Every tag must have a start tag and an end tag.
- Shorthand syntax is defined: <tag /> is the same as <tag></tag>.
- Tags must be embedded wholly within one another: an inner tag must end before the tag that contains it ends, and the same rule applies recursively at every level of nesting. The above example of <b> one <i> two </b> three </i> would not follow this rule. It would have to change to <b> one <i> two </i> three </b>.
- All attributes require values.
- All attribute values must be surrounded by quotes.
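The nesting rule is easy to check mechanically, which is a large part of why XML-style markup maps so cleanly onto a tree. Here is a minimal sketch of that check using a stack of open tag names. The function name isProperlyNested is my own invention, and the tag matching is deliberately naive: it ignores comments, CDATA sections, and attribute values that contain a ">" character, so treat it as an illustration rather than a parser.

```javascript
// Returns true when every closing tag matches the most recently opened tag.
function isProperlyNested(markup) {
  // Captures: optional leading slash, tag name, optional trailing slash.
  var tagPattern = /<(\/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(\/?)>/g;
  var open = []; // stack of currently open tag names
  var match;
  while ((match = tagPattern.exec(markup)) !== null) {
    var isClosing = match[1] === "/";
    var name = match[2].toLowerCase();
    var isSelfClosing = match[3] === "/";
    if (isSelfClosing) {
      continue; // <tag /> opens and closes in one step
    }
    if (!isClosing) {
      open.push(name);
    } else if (open.pop() !== name) {
      return false; // closed a tag other than the innermost open one
    }
  }
  return open.length === 0; // anything left open is also an error
}

console.log(isProperlyNested("<b> one <i> two </b> three </i>")); // false
console.log(isProperlyNested("<b> one <i> two </i> three </b>")); // true
```

The overlapping example fails at the </b> tag, because the innermost open tag at that point is <i>.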
When talking about the Document Tree, I went only as far as was relevant to understanding the relational model used in HTML/XHTML, and how CSS uses it for inheritance and the various selectors. There, the Element was the "smallest relevant body", when in fact the DOM treats the document in a higher-resolution manner. First, the DOM is made up of what are called nodes, which may be of various types. The Element Node is just one type of node. Another type of node is the Text Node, which is any span of contiguous text within the document.
Consider: <b>Hello World!</b>. That's text that resides within an element. If you've been using HTML/XHTML and CSS, you'd simply think of the entire thing as being one element, with 'element content' (as described in the CSS Box Model). The DOM 'sees deeper', or rather, in a higher resolution:
[Figure: the B element node and its child text node]
The B element is an Element Node, which is the parent of a Text Node. The DOM makes this distinction because, when you address nodes within the DOM, you could, say, replace the text within the B element, which means you'll be replacing the Text Node that is nested within the Element Node, while in fact not affecting the Element Node at all.
Let's take a more involved example. Consider the following markup:
<p>The <b>DOM</b> is <i>extremely</i> powerful!</p>
Here's how this appears as a DOM Tree:
[Figure: the DOM tree for the P element]
This may look like a bit much at first glance, but there's really nothing new here. The parent of all of the rest of the nodes in this 'branch' is the P element node. It has 5 children -- that is to say -- 5 direct descendants. They are:
- A text node that contains the text 'The'
- A B element node
- Another text node that contains the text 'is'
- An I element node
- The last text node that contains the text 'powerful!'
All five are children of the P element node, which makes them siblings. This may seem odd at first, but when traversing the DOM, you'll get used to text nodes appearing as preceding and following nodes to element nodes. The DOM also makes the distinction of which nodes appear in which order in the document. The text node that contains the text 'The' is the First Child of the P element node. You can extrapolate, then, that the text node that contains the text 'powerful!' is the Last Child of the P element node.
These are specifically named properties which you'll encounter later, but overall, the DOM treats every group of siblings as an Array. Just like an Array in most programming languages, it begins at index 0, and increments by 1 as it progresses. In other words, at index 0 of this group of siblings, you'll find the First Child of the P element node, which is the text node that contains the text 'The'. At index 1 you'll find the B element node, and so on.
The element nodes in this example have child text nodes of their own. The B element node has a child text node which contains the text 'DOM', while the I element node has a child text node which contains the text 'extremely'. Element nodes don't have to contain text nodes, or in fact any nodes at all. An element node may be empty.
There are additional types of nodes, which will be introduced gradually throughout the document. This is much easier to understand when you're working on the DOM hands-on, so I'll end the abstract-only speeches here and move on to more tangible material.
It's important to note that the DOM is language-independent. The end goal of this series of documents is to manipulate the DOM within the browser, which is why we're going to be using JavaScript, but the DOM can be handled by a variety of different languages when it's "not in the browser".
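To make the tree from the paragraph example a little more tangible before touching a real browser, here is a rough sketch that models it with plain JavaScript objects and prints it as an indented outline. Note that makeText, makeElement, and outline are hypothetical helpers of my own, not DOM functions; they only borrow the DOM's property names (nodeType, nodeName, nodeValue, childNodes) so that the shape of the data matches what is described above.

```javascript
// Plain-object stand-ins for DOM nodes (hypothetical helpers, not the real DOM API).
function makeText(value) {
  return { nodeType: 3, nodeName: "#text", nodeValue: value, childNodes: [] };
}
function makeElement(name, children) {
  return { nodeType: 1, nodeName: name, nodeValue: null, childNodes: children || [] };
}

// <p>The <b>DOM</b> is <i>extremely</i> powerful!</p>
var para = makeElement("P", [
  makeText("The "),
  makeElement("B", [makeText("DOM")]),
  makeText(" is "),
  makeElement("I", [makeText("extremely")]),
  makeText(" powerful!")
]);

// Build an indented outline, one line per node, two spaces per level.
function outline(node, depth) {
  depth = depth || 0;
  var indent = new Array(depth + 1).join("  ");
  var label = node.nodeType === 3
    ? 'Text "' + node.nodeValue + '"'
    : "Element " + node.nodeName;
  var lines = [indent + label];
  for (var i = 0; i < node.childNodes.length; i++) {
    lines = lines.concat(outline(node.childNodes[i], depth + 1));
  }
  return lines;
}

console.log(outline(para).join("\n"));
// Element P
//   Text "The "
//   Element B
//     Text "DOM"
//   Text " is "
//   Element I
//     Text "extremely"
//   Text " powerful!"
```

The P element has exactly five children, and the spaces around the inline elements belong to the text nodes, which is why they show up inside the quotes.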
Hands-on DOM
In the Document Tree article, you encountered the concept of the root element. The DOM refers to the top-most node of a DOM tree as the Document Node. This is the root of the tree, and the only node which has no parent. There can only be one document node in every document, and it is the ancestor of every other node in the document.
Consider the following markup:
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>Page Heading</h1>
<p>Greetings World!</p>
</body>
</html>
It's easy to see that the top-most element in this instance, and therefore in every HTML/XHTML document, is the HTML element. The DOM refers to it as the document element; the Document Node itself is its parent, which JavaScript exposes as the document object.
Now for the first bit of refreshing news in this article: JavaScript is extremely straightforward in handling the DOM. For example, to access the document element in an HTML/XHTML document, you'd write:
var objHtml = document.documentElement;
The objHtml variable now contains the HTML element (actually, a pointer to it, but those specifics won't matter until later). Earlier I mentioned the First Child and Last Child as named properties specified by the DOM. In JavaScript, they're available under those very names. If you look at the markup again, what would you expect to be the first child of the HTML element, and what would be the last child? It only has two direct descendants: the HEAD element and the BODY element, hence:
var objHtml = document.documentElement;
var objHead = objHtml.firstChild;
var objBody = objHtml.lastChild;
We also referred to the DOM as a set of Arrays, where every group of siblings represented an array. We can find this very functionality in JavaScript:
var objHtml = document.documentElement;
var objHead = objHtml.childNodes[0];
var objBody = objHtml.childNodes[1];
This time, we're using childNodes to access the children of the HTML element. The first child is located at index 0 of the array, whereas the second (and in this case, last) child is located at index 1. Note that this isn't actually a JavaScript Array -- it's a pseudo-array, which means it doesn't have all of the properties and methods that the JavaScript Array Object has, though it does have some. Strictly speaking, the square bracket notation isn't part of the DOM, but rather was something that was added specifically to JavaScript. The proper method is the item() method of the childNodes property:
var objHtml = document.documentElement;
var objHead = objHtml.childNodes.item(0);
var objBody = objHtml.childNodes.item(1);
When implementing JavaScript within a browser, you can use the bracket notation, and you'll get the same results. If, however, you're using JavaScript in an external environment, it's possible (though unlikely) that it won't recognize the bracket notation, and that you'll need to switch to using the item() method.
A word about childNodes: Technically, childNodes is a property of the Node Object (anything that can use childNodes to access nodes within itself is a Node Object). However, childNodes is also an object itself. Generally, childNodes is referred to as "the childNodes array", which is how I'll refer to it from this point on, for the most part.
One of the properties that the childNodes array has in common with a JavaScript Array Object is the length property:
var objHtml = document.documentElement;
alert(objHtml.childNodes.length); // alerts 2
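Since childNodes is only a pseudo-array, one thing you may find handy is a way to copy it into a real Array, which needs nothing more than length and item(). The following is a small sketch of that idea; toArray is a name I made up for illustration, and the browser-only part is guarded so the snippet can also be loaded outside a browser.

```javascript
// Copy any DOM-style list (anything with a length plus item() or numeric indexes)
// into a real JavaScript Array.
function toArray(list) {
  var result = [];
  for (var i = 0; i < list.length; i++) {
    // Prefer the DOM's item() method; fall back to bracket notation.
    result.push(typeof list.item === "function" ? list.item(i) : list[i]);
  }
  return result;
}

// Browser-only usage, skipped when there is no document object.
if (typeof document !== "undefined") {
  var children = toArray(document.documentElement.childNodes);
  children.reverse(); // a real Array method, not available on childNodes itself
}
```

Keep in mind that the copy is a snapshot: it won't follow later changes to the document the way childNodes does.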
At this point, you'll want to start testing the scripts as you go along, and there's a specific condition to running DOM-related scripts: The document must be fully loaded. This is because the DOM tree structure is only completely built when the root element has been opened and closed, which means you're basically waiting for the ending </HTML> tag. The solution is to run the scripts at the BODY element's onload event.
Here's a good page template to use for these scripts:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<head>
<title>Page Title</title>
<script type="text/javascript">
function loadTrigget(){
//...DOM Scripts go here...
}
</script>
</head>
<body onload="loadTrigget()">
</body>
</html>
By placing your script in the loadTrigget() function, you make sure it's being run after the page has been fully loaded. Mind you, there are ways to add an onload event handler to the BODY element using JavaScript, so that you don't have to modify your HTML, but I'll leave that for later.
So far we've been walking in the dark in terms of knowing what nodes we were placing into our variables. There's actually an astonishingly simple way to find out -- the nodeName property:
var objHtml = document.documentElement;
var objHead = objHtml.childNodes.item(0);
var objBody = objHtml.childNodes.item(1);
alert(objHead.nodeName); // alerts 'HEAD'
alert(objBody.nodeName); // alerts 'BODY'
The nodeName property only returns a tag name when applied to an Element Node; for other node types it returns a fixed string such as '#text' or '#document'. The next logical step is therefore to find a way to check what type of node we're dealing with. So far we've encountered three types: The Document Node (which is the root of the document), the Element node, and the Text node. The property we'll use to find out what type of a node we're dealing with is the aptly named nodeType property:
var objHtml = document.documentElement;
var objHead = objHtml.firstChild;
alert(document.nodeType); // alerts 9
alert(objHead.nodeType); // alerts 1
These are obviously enumerations of something, but they won't do much good if we have nothing to compare them to. Lucky for us, the Node object defines 12 constants to which we can compare them:
- Node.ELEMENT_NODE = 1
- Node.ATTRIBUTE_NODE = 2
- Node.TEXT_NODE = 3
- Node.CDATA_SECTION_NODE = 4
- Node.ENTITY_REFERENCE_NODE = 5
- Node.ENTITY_NODE = 6
- Node.PROCESSING_INSTRUCTION_NODE = 7
- Node.COMMENT_NODE = 8
- Node.DOCUMENT_NODE = 9
- Node.DOCUMENT_TYPE_NODE = 10
- Node.DOCUMENT_FRAGMENT_NODE = 11
- Node.NOTATION_NODE = 12
var objHtml = document.documentElement;
var objHead = objHtml.firstChild;
alert(document.nodeType == Node.DOCUMENT_NODE); // alerts 'true'
alert(objHead.nodeType == Node.ELEMENT_NODE); // alerts 'true'
The script above will work in most browsers, but not in older versions of Internet Explorer, which don't define these constants. Of course, since we know what they are, we can define them ourselves in an Array:
var node_types = new Array("??",
"ELEMENT_NODE",
"ATTRIBUTE_NODE",
"TEXT_NODE",
"CDATA_SECTION_NODE",
"ENTITY_REFERENCE_NODE",
"ENTITY_NODE",
"PROCESSING_INSTRUCTION_NODE",
"COMMENT_NODE",
"DOCUMENT_NODE",
"DOCUMENT_TYPE_NODE",
"DOCUMENT_FRAGMENT_NODE",
"NOTATION_NODE");
Now, using this Array in future scripts, we can identify the types of nodes like so:
var objHtml = document.documentElement;
var objHead = objHtml.firstChild;
alert(node_types[document.nodeType]); // alerts 'DOCUMENT_NODE'
alert(node_types[objHead.nodeType]); // alerts 'ELEMENT_NODE'
This is more convenient since we don't have to guess what type of node the variable contains; we just combine the nodeType property with the node_types Array and have it 'tell' us.
Locating Elements
We already know that we can use childNodes to access the nodes like an array, and we know of firstChild and lastChild. This is good for 'short-range travel' around the document, and there are several additional properties that are defined by the standard. Here's a quick roundup:
- firstChild - Returns the first node in the childNodes array.
- lastChild - Returns the last node in the childNodes array.
- parentNode - Returns the parent node. The Document node returns null, and so do a few other exceptions.
- nextSibling - Returns the following node in the childNodes array of the parentNode. Will return null if it is the last node in that array.
- previousSibling - Returns the preceding node in the childNodes array of the parentNode. Will return null if it is the first node in that array.
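These properties are enough to walk a whole group of siblings without ever touching childNodes. Below is a sketch of that walk. Because it has to run outside a browser too, it uses plain objects wired together by a hypothetical linkChildren helper of mine; only siblingNames is meant to be pointed at real DOM nodes, and it relies on nothing but firstChild, nextSibling, and nodeName.

```javascript
// Hypothetical helper: wires plain objects together with the same
// relationship properties the DOM defines, so the sketch runs anywhere.
function linkChildren(parent, children) {
  parent.childNodes = children;
  parent.firstChild = children.length ? children[0] : null;
  parent.lastChild = children.length ? children[children.length - 1] : null;
  for (var i = 0; i < children.length; i++) {
    children[i].parentNode = parent;
    children[i].previousSibling = i > 0 ? children[i - 1] : null;
    children[i].nextSibling = i < children.length - 1 ? children[i + 1] : null;
  }
  return parent;
}

// Walk a group of siblings from first to last using only firstChild and nextSibling.
function siblingNames(parent) {
  var names = [];
  for (var node = parent.firstChild; node !== null; node = node.nextSibling) {
    names.push(node.nodeName);
  }
  return names;
}

var html = linkChildren({ nodeName: "HTML" }, [{ nodeName: "HEAD" }, { nodeName: "BODY" }]);
console.log(siblingNames(html).join(", "));           // HEAD, BODY
console.log(html.lastChild.previousSibling.nodeName); // HEAD
console.log(html.firstChild.previousSibling);         // null
```

The loop stops on its own because the last sibling's nextSibling is null, which is exactly the behaviour listed in the roundup.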
Let's throw some more content into the page. If you're working with the template of the page mentioned above, then the BODY element is now empty. Add the following markup to the BODY of the page:
<h1>This is the Page Heading</h1>
<p align="left" id="para1">This is some text placed
on the page that's contained <i>within</i> a paragraph.</p>
<p>This is a second paragraph.</p>
The Problem with Whitespaces
Now that we have some content on the page, let's go 'fetch' it using the DOM functions we already know:
var objHtml = document.documentElement;
var objBody = objHtml.childNodes[1];
alert(objBody.childNodes.length);
alert(node_types[objBody.childNodes[0].nodeType]);
alert(node_types[objBody.childNodes[1].nodeType]);
alert(node_types[objBody.childNodes[2].nodeType]);
Note that I didn't mention what each alert message would output. This is because each browser will interpret these differently. Internet Explorer may output what you're expecting: The length of the childNodes array would be 3, and each of the three nodeTypes would return an element (H1, P, P). Firefox version 2 will alert that the length of the childNodes is 7, while Opera version 9 will tell you that it's actually 8. If you look at what each of these browsers tells you the nodeTypes are, you'll understand why -- both will alert: 'TEXT_NODE', 'ELEMENT_NODE', 'TEXT_NODE'. Firefox and Opera see the newlines and spaces between the tags as nodes, which is something that's not completely clear in the XHTML implementation. In XML, they would be nodes, but we know that HTML treats any amount of successive whitespace as a single space, and newlines are ignored in many cases. Not to mention, Firefox and Opera still don't agree on the total number of nodes (in fact, it's more than probable that if you copied and pasted the content into your page with additional/fewer newlines, you'd get different results).
So what do we do about this? There are quite a few ways around it. The most common is to use long-range element locators, which we'll get to in a bit, in which you're being much more specific about what elements you're targeting. That's the practical and most often used way to navigate to your specific destination within the page, and then move to short-range travel when it's required. Second, this is JavaScript, and we have plenty of ways to condition our scripts to get exactly what we want, even if the browsers don't agree with each other.
Here's just one:
var objHtml = document.documentElement;
var objBody = objHtml.childNodes[1];
alert(objBody.childNodes.length);
for (var i=0; i<objBody.childNodes.length; i++){
  if(node_types[objBody.childNodes[i].nodeType] != "TEXT_NODE"){
    alert(node_types[objBody.childNodes[i].nodeType]);
    alert(objBody.childNodes[i].nodeName);
  }
}
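If you find yourself repeating that kind of filtering, it can be wrapped in a small function. What follows is one possible sketch, not a standard DOM method: elementChildren is a name I chose, and fakeBody is a hand-built stand-in for the BODY as a whitespace-preserving browser might report it, so that the snippet runs without a page.

```javascript
// Return only the element children of a node, skipping text nodes
// (including whitespace-only ones) and every other node type.
function elementChildren(parent) {
  var ELEMENT_NODE = 1; // numeric value of Node.ELEMENT_NODE
  var result = [];
  for (var i = 0; i < parent.childNodes.length; i++) {
    if (parent.childNodes[i].nodeType === ELEMENT_NODE) {
      result.push(parent.childNodes[i]);
    }
  }
  return result;
}

// Hand-built stand-in: seven child nodes, three of which are elements.
var fakeBody = {
  childNodes: [
    { nodeType: 3, nodeName: "#text" },
    { nodeType: 1, nodeName: "H1" },
    { nodeType: 3, nodeName: "#text" },
    { nodeType: 1, nodeName: "P" },
    { nodeType: 3, nodeName: "#text" },
    { nodeType: 1, nodeName: "P" },
    { nodeType: 3, nodeName: "#text" }
  ]
};

console.log(elementChildren(fakeBody).length); // 3
```

However many whitespace text nodes a browser decides to report, the filtered result stays at the three elements you actually care about.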
The moral of the story is: While the different browsers may not agree on the implementation of the DOM, they will each provide you with information on how "it sees things". You will then be able to use this information to make sure that your script translates into the same effect across the various implementations. There aren't that many differences between implementations, though, so you won't have to worry about conditioning your code that much (I'll make note when incompatibilities between browsers occur), though just like any JavaScript code that manipulates the document, you'll want to test your code across the different browsers.
Also, notice that the script above doesn't use 'browser sniffing' to condition the code. That's the way to go about it if at all possible: You shouldn't care about what the browser name is or what version it is -- instead, probe its abilities and the way it handles things, so that you can cover not just particular browsers, but various behaviours as well.
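The same probing approach can be applied to the Node constants from earlier: instead of asking which browser is running, ask whether the constants exist. This is a sketch under my own naming (NODE_TYPES, isElement, and isText are not part of the DOM), and the fallback only lists the few values it needs.

```javascript
// Use the built-in constants when the environment provides them,
// otherwise fall back to the numeric values defined by the DOM.
var NODE_TYPES = (typeof Node !== "undefined" && Node.ELEMENT_NODE === 1)
  ? Node
  : {
      ELEMENT_NODE: 1,
      TEXT_NODE: 3,
      COMMENT_NODE: 8,
      DOCUMENT_NODE: 9
    };

// Small readability helpers built on top of the probed constants.
function isElement(node) {
  return node.nodeType === NODE_TYPES.ELEMENT_NODE;
}
function isText(node) {
  return node.nodeType === NODE_TYPES.TEXT_NODE;
}

console.log(isElement({ nodeType: 1 })); // true
console.log(isText({ nodeType: 1 }));    // false
```

Nothing in the script mentions a browser name or version, so it keeps working in any environment that follows the same numbering.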
Long-Range Travel
We know how to move around the page one step at a time, but it's hardly practical to do so in large documents. We'll need more ways of locating elements we're looking for in a 'long-range' manner. Assume we're still using the markup from above (the heading and two paragraphs).
getElementById()
The easiest way to pinpoint a specific element within your page is the getElementById() method. This is an HTML DOM method, not an XML method, but it really doesn't matter if you're coding JavaScript with HTML/XHTML. In HTML/XHTML, the id attribute is unique, which means that there can only be one element with a specific id value on each page, which in turn means that getElementById() will conveniently return one element, and one element only.
Here's how it's used:
var objP1 = document.getElementById("para1");
alert(node_types[objP1.nodeType]); // alerts 'ELEMENT_NODE'
alert(objP1.nodeName); // alerts 'P'
As you can see, this really cuts down the amount of work you have to do in terms of getting to a specific element within the page. Now let's see how to mix this functionality with the short-range travel we've already seen:
var objP1 = document.getElementById("para1");
alert(objP1.childNodes.length); // alerts '3'
alert(node_types[objP1.childNodes[0].nodeType]); // alerts 'TEXT_NODE'
alert(node_types[objP1.childNodes[1].nodeType]); // alerts 'ELEMENT_NODE'
alert(objP1.childNodes[1].nodeName); // alerts 'I'
alert(node_types[objP1.childNodes[2].nodeType]); // alerts 'TEXT_NODE'
The first paragraph within the page contains three nodes. We know that it contains text, and that there's an I element within that text. So, from what we've learned about how the DOM sees the document, it makes sense that the first P element on the page, with the id of 'para1', contains a text node, followed by an element node (which it indicates to be an I element), followed by another text node.
This is ok on an abstract level, but it'd be nice to actually see what those text nodes contain. Worry not -- the DOM provides: nodeValue. The nodeValue property won't give you much when you try it on an element node (in fact, it will return null), but it's perfect for text nodes: it will simply return a string with the text that the node contains:
var objP1 = document.getElementById("para1");
alert(objP1.childNodes.length); // alerts '3'
alert(node_types[objP1.childNodes[0].nodeType]); // alerts 'TEXT_NODE'
alert(objP1.childNodes[0].nodeValue);
// alerts 'This is some text placed on the page that's contained '
alert(node_types[objP1.childNodes[1].nodeType]); // alerts 'ELEMENT_NODE'
alert(objP1.childNodes[1].nodeName); // alerts 'I'
alert(node_types[objP1.childNodes[2].nodeType]); // alerts 'TEXT_NODE'
alert(objP1.childNodes[2].nodeValue); // alerts ' a paragraph.'
Of course, once again, this is JavaScript, so we can do better than that:
var objP1 = document.getElementById("para1");
for (var i=0; i<objP1.childNodes.length; i++){
  alert(node_types[objP1.childNodes[i].nodeType]);
  if(node_types[objP1.childNodes[i].nodeType] == "TEXT_NODE"){
    alert(objP1.childNodes[i].nodeValue);
  }
  if(node_types[objP1.childNodes[i].nodeType] == "ELEMENT_NODE"){
    alert(objP1.childNodes[i].nodeName);
  }
}
The above script will loop through the childNodes of objP1. If it encounters a text node, it will alert 'TEXT_NODE', and then the text contained within the node using nodeValue. If it's an element node, it will alert 'ELEMENT_NODE' and alert the element's name using nodeName. This is a small component of a "DOM-Scanner" script which you can build to scan either parts of, or the entire contents of a document. This only covers element and text nodes, and only one group of siblings, but the idea is the same.
getElementsByTagName()
As can be deduced from its name, the getElementsByTagName() method will locate tags that match the name that you specify. The getElementsByTagName() method, unlike getElementById(), is defined by the XML DOM, which means that in case you were to use JavaScript to handle XML in a non-browser environment, you'd be able to use this method to access XML and XHTML tags. Also unlike the getElementById() method, getElementsByTagName() may return more than one element, and in fact is likely to. If you were to look for P elements in the document we're currently dealing with, you would find two of them.
The getElementsByTagName() method returns what is called a NodeList, which is almost the same object as the childNodes object: It's a pseudo-array, it's read-only, and you can access elements (nodes) within it using either the item() method or bracket notation. Also, just like childNodes is generally referred to as the childNodes array, the NodeList object is also commonly referred to as the NodeList array. To make things complete, NodeList even provides the same length property.
Here's how you would use it:
var paras = document.getElementsByTagName("p");
alert(paras.length); // alerts '2'
alert(node_types[paras[0].nodeType]); // alerts 'ELEMENT_NODE'
alert(paras[0].nodeName); // alerts 'P'
alert(node_types[paras[1].firstChild.nodeType]); // alerts 'TEXT_NODE'
alert(paras[1].firstChild.nodeValue);
// alerts 'This is a second paragraph.'
Initially, all of the P elements are placed within the paras variable. The length of the paras (NodeList) array is then alerted to be 2. We alert the node type of paras[0], followed by its node name. At this point, we know that paras[1] must be the second paragraph in the page, and that it must contain a text node, and it does (firstChild was used, but lastChild would have done the same, since there's only one child node). The last alert call in the script returns the contents of the text node contained within the second paragraph.
The getElementsByTagName() method is often used with various conditions in order to match just the right elements. You'll see this when we get to the getAttribute() method. You may also use the wildcard string (*) to match all elements across the page, and get an array of all elements in the page in one variable, like so:
var allElems = document.getElementsByTagName("*");
This may not work in older versions of Internet Explorer, where you can fall back on the proprietary document.all collection, which already holds every element in the page:
// Older versions of Internet Explorer only:
var allElems = document.all;
getElementsByName()
One of the methods defined by the HTML DOM is the getElementsByName() method. This method goes through all of the elements within the page and returns a NodeList array with all of the elements whose name attribute matches its argument. This particular method isn't often used, because its implementation isn't consistent across browsers, and because you can achieve the same effect using the getElementsByTagName() method combined with the getAttribute() method to get any matching attribute, not just the name.
getAttribute()
The getAttribute() method is used on an element node to retrieve a certain named attribute. This is the DOM method that takes the place of the browser-specific JavaScript element.attribute property access, which means it will work with XML documents in external environments as well.
Within the page we've been working with so far, it can be used like so:
var paras = document.getElementsByTagName("p");
alert(paras[0].getAttribute("align")); // alerts 'left'
Commonly, however, you'd use it within a loop to find a certain element, or a group of elements:
var paras = document.getElementsByTagName("p");
// loop through all P elements
for (var i=0; i<paras.length; i++){
  var para = paras[i];
  if (para.getAttribute("class") == "class_name"){
    // take certain action
  }
}
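The loop above generalizes nicely into a reusable function that works for any attribute, which is also how you could stand in for getElementsByName(). Here is a minimal sketch; filterByAttribute is a hypothetical name of mine, and the browser-only usage is guarded so the snippet also loads where there is no document object.

```javascript
// Collect the entries of a NodeList-style list whose named attribute equals value.
function filterByAttribute(elements, name, value) {
  var matches = [];
  for (var i = 0; i < elements.length; i++) {
    if (elements[i].getAttribute(name) === value) {
      matches.push(elements[i]);
    }
  }
  return matches;
}

// Browser-only usage with the page from this article:
// every P element whose align attribute is 'left'.
if (typeof document !== "undefined") {
  var leftAligned = filterByAttribute(document.getElementsByTagName("p"), "align", "left");
}
```

The function returns a real Array rather than a NodeList, so the result doesn't change if the document is modified afterwards.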
So far we've learned how to collect information from the Document Tree using the DOM. In Part 2 we'll get into how to make modifications and manipulate the contents of the document tree using the DOM to get some really dynamic pages.