Welcome to Edukum.com

Structuring Documents for the Web

A Web of Structured Documents
To start, let us think of the web as a sea of documents (pages). In its short lifetime, the web has grown to contain billions of pages. Many of these pages bear a strong resemblance to documents in real life, and almost all of them have a certain structure. For example, the newspaper we read in the morning is structured into articles, pictures, advertisements, health tips and so on. Almost every article has a main heading, which might be followed by subheadings and pictures. Documents on the web are structured in the same way.
The web, however, does not understand documents the way we see them in real life. A web browser is required for that purpose. The web browser renders the paragraphs, text and pictures and displays them as required. The languages we need to learn in order to tell a web browser the structure of a document are HTML and XHTML.
Introducing HTML and XHTML
HTML is a markup language used to describe the documents found on the web, i.e. web pages. HTML stands for HyperText Markup Language and consists of a set of markup tags. HTML tags describe HTML documents, and each tag describes a different kind of content in the document. The first version of HTML was written by Tim Berners-Lee in 1993. Since then, there have been many different versions of HTML. The most widely used version throughout the 2000s was HTML 4.01, which became an official standard in December 1999.
Along with CSS and JavaScript, HTML is a cornerstone technology used to create web pages, as well as to create user interfaces for mobile and web applications. Web browsers can read HTML files and render them into visible or audible web pages. HTML describes the structure of a website semantically and, before the advent of Cascading Style Sheets (CSS), included cues for the presentation or appearance of the document (web page), making it a markup language rather than a programming language.
HTML elements form the building blocks of HTML pages. HTML allows images and other objects to be embedded, and it can be used to create interactive forms. It provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items. HTML elements are delineated by tags, written using angle brackets. Tags such as <img /> and <input /> introduce content into the page directly. Others such as <p>...</p> surround and provide information about document text and may include other tags as sub-elements. Browsers do not display the HTML tags, but use them to interpret the content of the page.
HTML can embed scripts written in languages such as JavaScript, which affect the behavior of HTML web pages. HTML markup can also refer the browser to Cascading Style Sheets (CSS) to define the look and layout of text and other material. The World Wide Web Consortium (W3C), maintainer of both the HTML and the CSS standards, has encouraged the use of CSS over explicit presentational HTML since 1997. A simple example of an HTML document can be seen below:

    <html>
      <body>
        <h1>This is my first heading</h1>
        <p>This is my first paragraph.</p>
      </body>
    </html>
In a browser, this example is displayed as a large heading followed by a paragraph of normal text.
XHTML is short for Extensible HyperText Markup Language. XHTML is almost identical to HTML 4, but cleaner and stricter. It was developed by the World Wide Web Consortium (W3C) to help web developers make the transition from HTML to XML (Extensible Markup Language). XHTML 1.0 is “a reformulation of the three HTML 4 document types as applications of XML 1.0”.[1] The W3C also continues to maintain the HTML 4.01 Recommendation, and the specifications for HTML5 and XHTML5 are being actively developed. In the current XHTML 1.0 Recommendation document, as published and revised in August 2002, the W3C commented that, "The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility."[1]
XHTML, then, is a rewrite of HTML as an XML language. XML is a standard markup language that is used to create other markup languages. Hundreds of XML languages are in use today, including GML (Geography Markup Language), MathML, MusicML, and RSS (Really Simple Syndication). Since each of these languages is written in a common language (XML), their content can easily be shared across applications. This makes XML potentially very powerful, and it is no surprise that the W3C would create an XML version of HTML (again, called XHTML). XHTML became an official standard in 2000, and was updated in 2002. XHTML is very similar to HTML, but has stricter rules. Strict rules are necessary for all XML languages, because without them, interoperability between applications would be impossible.

XHTML documents must be declared. Every XHTML document begins with a document type (DOCTYPE) declaration that names the XHTML DTD the document follows.
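As an illustration, the declaration for XHTML 1.0 Transitional is sketched here. The choice of the Transitional variant is an assumption, based on the DTD URL that appears in the XHTML example further on; the Strict and Frameset variants name their own DTDs.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
```

The declaration is placed at the top of the document, before the opening <html> tag.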

There are a few rules when creating an XHTML document.

  1. All tags and attributes must be written in lower case.
  2. Every element must be closed, either with a closing tag or, for empty elements, with a trailing slash.
  3. All attribute values must be quoted.
  4. Attribute minimization is not allowed.
  5. The id attribute replaces the name attribute.
  6. Elements must be properly nested.
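The rules above can be sketched one by one. Each comment shows a form that plain HTML tolerates, followed by the XHTML form; the element choices and values (such as the id "top") are placeholders picked for illustration, not taken from the text.

```html
<!-- 1. Lower case: <P>Text</P> becomes -->
<p>Text</p>

<!-- 2. Every element closed, including empty ones: <br> becomes -->
<br />

<!-- 3. Quoted attribute values: <input type=text> becomes -->
<input type="text" />

<!-- 4. No attribute minimization: <input type="checkbox" checked> becomes -->
<input type="checkbox" checked="checked" />

<!-- 5. id instead of name: <a name="top"></a> becomes -->
<a id="top"></a>

<!-- 6. Proper nesting: <b><i>text</b></i> becomes -->
<b><i>text</i></b>
```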

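A simple example of an XHTML document is shown below:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>Insert your mandatory title here</title>
      </head>
      <body>
        ...your content goes here...
      </body>
    </html>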

Many documents on the Web today were written with either HTML 4.01 or XHTML 1.0. In recent years, however, the W3C (in association with another organization, the WHATWG) has been working on a brand new version of HTML, HTML5. HTML5 is already widely supported by browsers and other web-enabled devices, and is the way of the future.
Tags and Elements
Let us look at the example of an HTML document given above. There we can find pairs of angle brackets containing the letters ‘html’, as in <html>. The two brackets and all the characters between them are called a tag. As we can see, there are many tags in the example, and they all come in pairs, i.e. each has both an opening and a closing tag. A closing tag differs slightly from an opening tag: it contains a forward slash (‘/’) before the characters.


  • Opening tag: <html>
  • Closing tag: </html>

An element contains the pair of tags (opening and closing) and the content inside the tags.

    <p>This is a paragraph.</p>
All of the content from the first ‘less than’ angle bracket to the last ‘greater than’ angle bracket is an element.
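A short sketch of the same idea. The second element is an added illustration (the em element is not part of the example above) showing that the content of an element can itself contain another element.

```html
<!-- One element: opening tag, content, closing tag -->
<p>This is a paragraph.</p>

<!-- An element whose content includes another, complete element -->
<p>This is a paragraph with <em>emphasized</em> text.</p>
```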
Separating Heads from Bodies
An HTML document consists of two main parts: the head and the body. Every HTML document has a <head> and a <body> element, each inside the <html> tags.

  1. The head element: It consists of an opening <head> tag and a closing </head> tag. These tags enclose basic information about the document, such as the title of the page, a description of the page, or keywords that search engines can use to index the page.
  2. The body element: It consists of an opening <body> tag and a closing </body> tag. The body tags enclose all the other content of the page, such as paragraphs and pictures. The browser shows only the content inside the pair of body tags in the main window.
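The two parts described above can be sketched as a complete skeleton. The title, description and keywords are placeholder values chosen for illustration.

```html
<html>
  <head>
    <!-- Information about the page, not part of the displayed content -->
    <title>My first page</title>
    <meta name="description" content="A short description of the page" />
    <meta name="keywords" content="html, xhtml, tutorial" />
  </head>
  <body>
    <!-- Everything the browser displays in the main window goes here -->
    <h1>This is my first heading</h1>
    <p>This is my first paragraph.</p>
  </body>
</html>
```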

Attributes tell us about elements
The attributes provide necessary information about an element. They are present on the opening tag of the element that carries them. Attributes are made up of two parts: name and value.
The name is the property we want to set, whereas the value is what we want that property to be.
The value should be placed in double quotes, and an equals sign is used to separate it from the name.
For example, in an attribute written as color="red", the name of the property is ‘color’ and its value is ‘red’.
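A minimal sketch of such an attribute on an opening tag. The font element that carries it here is an assumption, since the text names only the attribute; font is a presentational element from HTML 4 whose job has since been taken over by CSS.

```html
<!-- name: color, value: red, joined by an equals sign, value in double quotes -->
<font color="red">This text is shown in red.</font>
```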
Learning from others by viewing their source code
Viewing source code means looking at the HTML behind an already rendered web page. In browsers like Internet Explorer and Mozilla Firefox, you can open the View menu and select View Source. This shows you the HTML code for the web page.

However, there are two things you need to remember when looking at other people's source code.

  1. Other people have written the code for the page, and they hold the copyright to it. Using it for learning purposes does not harm anybody, though.
  2. Many people still write loose HTML and are not used to the strict XHTML rules, so you are likely to see missing brackets and many other blunders. Do not imitate them; learn never to make those mistakes yourself.

Elements for marking up text
In HTML, there are various ways to mark up the content of a document. Common ways include general structural elements such as headings and paragraphs, and elements for embedding quotes and code.

For example, a heading is written in HTML as:

    <h1>Heading number one</h1>

This tends to present the text in the largest heading format.
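A brief sketch of the other general-purpose elements mentioned above, using placeholder text. The particular elements shown (h2, p, blockquote and code) are a selection made for illustration.

```html
<h1>Main heading</h1>
<h2>Subheading</h2>
<p>A paragraph of ordinary text.</p>
<blockquote>
  <p>A longer quotation, set apart from the surrounding text.</p>
</blockquote>
<p>Use the <code>p</code> element to mark up a paragraph.</p>
```

Headings run from h1 (the most important) down to h6 (the least important).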

The <html> element contains all the content of the document. It has both an opening and a closing tag: the opening tag is placed at the beginning of the document, and the closing tag at the end.


    <html>
      <head> <!-- your head here --> </head>
      <body> <!-- your body here --> </body>
    </html>

The <head> element works as a container for all other header elements. In an HTML document, the opening <head> tag should be the first to appear after the opening <html> tag. The <head> element must contain the <title> element.


#Things To Remember