Objective: Learn the concepts of HTML, including how it is used, why it is important, and some of the key terms and concepts.
Learn:
Book: Project 1
Web Readings:
To simulate in-class lectures
Name | MB | Description |
VuSource | 2 | View Html in browser |
Xml | 2 | Xml example |
This week is different from the rest of the course in that there is no coding or hands-on work. Instead, it is mostly reading about concepts and terms. Many of these concepts and terms are covered in detail during the rest of the semester, so you may not understand all of them fully just by reading; they will make more sense once we cover the hands-on part.
A web site contains multiple related web pages and supporting files such as images. Most pages are HyperText Markup Language (HTML) files, which are ordinary text files containing many markup <tags>. The HTML (or source code) is interpreted by a browser. In the Internet Explorer (IE) browser, the source code can be viewed using the menu View | Source. A simple outline of how the web works and where HTML fits in is:
In fact, web sites are just collections of files that can be accessed over the Internet. The files can be of any type, but what happens on the user's end depends on the file type, which can be categorized as
Html (client) type pages are the most popular because:
However, HTML is limited, which is why other programs are used in some cases for things like
Other file types, like Word, Excel, etc., are rarely used on the web since they require programs that are not readily available for free.
In summary, markup languages (HTML, XHTML, XML) are the most popular files on the web because browsers understand them directly and they are easy, free, and do not require any special software or server platform. HTML is especially prevalent for informational, static web pages that do not need to get information from a database. Server programs are widely used when the page is data driven (i.e., its content comes from a database, such as which items are on backorder) and are often used by larger companies or anyone who is selling something. In either case (HTML client page or server page) one must be able to generate HTML code, since that is what the browser understands. Other files, mainly Flash and PDF, are used for special cases and usually not for complete web sites. The point is that almost every web site uses HTML, even if some of the site uses other files such as server pages, Flash, etc.
The book and Prof. Yoxheimer's notes at the end of this page give a brief overview of HTML structure, but I prefer to go into more detail next week. For now I will just mention a few things:
You can view the HTML code for a web page in a browser by using the menu View | Source (in IE). Sometimes the code will be very hard to follow, especially if JavaScript is mixed in.
Many people use web editors to generate html. So why learn html? The reasons vary:
Html and Xhtml are basically the same structure and purpose in that
XML, on the other hand, is quite different. XML has the same structure as HTML in that it uses elements and content, but its purpose is not the same. XML elements are not predefined in the browser, so the browser has no idea by default how to display them. Instead, you make up your own elements, and XML is often used for data exchange rather than for formatting web pages.
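To make the data-exchange idea concrete, here is a minimal sketch in Python (Python is used only for illustration; it is not part of this course). The element names backorder, item, name, and quantity are made up for this example, which is exactly the point: no browser knows how to display them, but a program can read the data easily.

```python
import xml.etree.ElementTree as ET

# Made-up elements: XML lets you invent names that describe your own data.
XML_DATA = """
<backorder>
  <item><name>Widget</name><quantity>12</quantity></item>
  <item><name>Gadget</name><quantity>3</quantity></item>
</backorder>
""".strip()

def read_backorders(xml_text):
    """Return a list of (name, quantity) pairs found in the XML text."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("name"), int(item.findtext("quantity")))
        for item in root.findall("item")
    ]

print(read_backorders(XML_DATA))
```

The program never says how the data should look on screen; it only pulls the values out, which is how XML is typically used between programs.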
Important Definitions and Terms:
The Internet is a collection of computer networks. A computer network is a
collection of computers of various types and other devices that can share data
across a communications channel. In one respect a computer network is itself a
communications channel over which devices share data and programs (an exe file
is really just data, in binary, that a computer processor can use as
instructions on what to do). A network in general is a collection of people and
resources devoted to accomplishing a simple or complex task or set of tasks. A
computer network simply carries this concept into the electronic realm.
The Internet is vast. At present there are roughly 1,500,000,000 computers and
other devices connected to the Internet at any one time, with devices going on
and off line constantly. The content on the Internet and the functions performed
by the various devices on it are always changing. One truly major aspect of the
Internet is the push towards wireless technology: you can now be connected to
the Internet virtually anywhere at any time. The sheer amount of data that can
be shared keeps increasing.
When I was a kid the holy grail of technology was the video phone. When we had
the video phone we would really be an advanced society. Never did we realize you
would be able to carry a phone with you wherever you wanted. Now not only can we
share voice and pictures, but also data in just about any other format you want,
instantly or very nearly so. And never did we imagine being able to do this with
any country in the world as easily as we do now. Computers were also not common,
and a good computer had roughly 4,000 bytes of RAM, a tape unit, and maybe a
punched card reader. Now computers are very common (in fact we don't know how to
get rid of them when they're no longer of use), and a good PC today has
4,000,000,000 bytes of RAM. RAM, by the way, is working memory, or a scratchpad,
for a CPU. While I'm on my high horse: another extremely important aspect of
computers is the rate at which data is communicated. Data communication speed
today enjoys the same rate of improvement as data storage, RAM, and cost.
Another factor in the strength of the Internet, as with any technology, is cost.
For a technology to be truly powerful it must be cheap enough for everyone to
afford. In 1976 a meg of RAM for an IBM mainframe was about $1,000,000.00. Now a
meg of RAM is about 75 cents retail. At this point no one but the manufacturers
can complain about the cost of technology.
It is also important to remember, not just for this class but for all time,
that the Internet has so impacted the way we socialize and do business that the
effects are incalculable. For example, you can go to college and never see a
classroom. The information overload is great. We are never out of touch with
each other. True manual labor is almost a thing of the past, at least in this
country. It's getting to the point that we interact more online than in
person.
The WWW is just one service provided over the Internet. HTML is the language of
the WWW, not of the Internet. I point this out because the Internet is a
platform for many types of services, not just the World Wide Web. On the flip
side, we see the WWW more than anything else on the Internet.
HTML stands, of course, for HyperText Markup Language, which is really just a
markup language. So what's a markup language? Well, think of MS Word: you type
in text, insert images, and so forth. These are the data elements of a Word
document. You can apply display characteristics, such as bolding, to the various
data in a Word document. You can select any sentence, click the bold button on
the formatting toolbar, and your sentence appears stronger than the other
sentences so that it stands out. In the actual doc file you are creating, MS
Word puts markers, in binary form, around the sentence that tell it (the Word
program) to bold or emphasize that sentence. In other words, it puts formatting
marks inside the doc file. HTML is a set of plain text markers (also called tags
and elements) that tell a web browser how to display or render the various parts
or data in an HTML document. In other words, you mark up your data with special
tags or elements to tell the web browser how to display that data, or to tell
the web browser to include an image.
Just a note on HTML tags and/or elements. HTML tags, which can also be thought
of as HTML commands, can include attributes, as you will see as you learn HTML.
HTML attributes are usually written as pairs of the form
attributename = some_value
Note the format. Attributes supply additional information that the HTML
tag/command needs to do its job. An HTML tag can have any number of attributes.
I should also mention that not all browsers work the same. For example, IE and
Firefox may render the same web page differently. A browser may also support
tags/elements unique to that particular browser. The rule is: just because a
page renders the way you want it to in one browser does not mean it will render
that way in a different browser.
I will also mention that HTML documents contain not only markup (in the form of
HTML tags) and data, but can also contain program code (usually in the form of
JavaScript) to initiate behavior and respond to the viewer's actions on the
document. For example, a user may click on an image to start a news feed or get
further information.
The development of the World Wide Web:
A worldwide network of computer networks has been around in one form or another
since the 1960s.
By the mid-1970s, many government agencies, research facilities, and
universities were on this network of networks (called the ARPAnet), but each ran
its own internal network, built by a different vendor and using different
protocols altogether. For example, the Army's system was built by DEC,
the Air Force's by IBM, and the Navy's by Unisys. All were capable networks, but
all spoke different languages. What was clearly needed to make things work
smoothly was a set of networking protocols that would tie together disparate
networks and enable them to communicate with each other.
The Department of Defense decided the TCP/IP suite of networking protocols would
be the standard for all military computer networking. TCP/IP has been ported to
most computer systems, including personal computers, and has become the new
standard in internetworking. It is the TCP/IP protocol set that provides the
infrastructure for the Internet today.
TCP/IP comprises over 100 different protocols. It includes services for remote
logon, file transfers, and data indexing and retrieval, among others. The most
common protocol in the TCP/IP suite is HTTP, or Hypertext Transfer Protocol,
which is the protocol used to transfer HTML pages, or simply web pages.
Web Servers and Web Browsers (also called Web Clients):
To understand what a server and a client are: a server is a computer or other
device on a network that provides services to consumers, or clients. Such
services include:
When you enter a URL into the address bar of your web browser, the host name in
that URL must be sent to a DNS server to be translated into an IP address. This
IP address is used to address a service request to the computer or device on the
Internet that will provide the service (or web page). Once the browser has the
IP address, it creates a request for the resource required. A resource may be a
web page, a file, a piece of music, an email, whatever. The request is received
by the server machine, which either provides the service or sends back a
denial message.
I recommend that you understand this process.
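The first steps of this process can be sketched in a few lines of Python (used here only for illustration; it is not part of this course). The function name build_request and the address www.example.com are assumptions made up for the example. The sketch pulls the host name out of a URL and writes the text of a simple HTTP request; it deliberately stops short of contacting a DNS server or a web server so that it runs without a network connection.

```python
from urllib.parse import urlsplit

def build_request(url):
    """Split a URL into the host to look up and the HTTP request text to send."""
    parts = urlsplit(url)
    host = parts.hostname      # the name a DNS server translates into an IP address
    path = parts.path or "/"   # the resource being requested on that server
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, request

host, request = build_request("http://www.example.com/index.html")
print(host)
print(request)
```

In a real program the host name would next be translated to an IP address (for example with Python's socket.gethostbyname) and the request text sent to that address; the server would then answer with the page or with an error message.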
Another term I'll introduce at this time is the port. A port is simply a number
ranging from 0 to 65535. A computer connected to the Internet is identified by
its IP address or, to be more specific, the network card connecting the computer
to the Internet is identified by an IP address. A computer can of course have
many applications installed, and each networked application can have an address
or ID number as well. This way, when a message arrives at the destination
computer (identified by IP address), the data can be handed to the application
that can make sense of it, addressed by port number. So the port number
identifies an application on a computer.
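As a rough illustration of how the IP address and the port number work together, here is a toy sketch in Python (used only for illustration). Nothing here touches a real network: the APPLICATIONS table, the deliver function, and the address 192.0.2.10 are all made up for the example. Ports 80 and 25 are the conventional ports for web (HTTP) and mail (SMTP) servers.

```python
# Hypothetical applications on one computer, keyed by the port each listens on.
APPLICATIONS = {
    80: "web server",   # conventional HTTP port
    25: "mail server",  # conventional SMTP port
}

def deliver(ip_address, port, data):
    """Pretend to hand data that arrived at ip_address to the right application."""
    app = APPLICATIONS.get(port)
    if app is None:
        return f"{ip_address}:{port} -> no application is listening; message refused"
    return f"{ip_address}:{port} -> {app} receives {data!r}"

print(deliver("192.0.2.10", 80, "GET /"))
print(deliver("192.0.2.10", 25, "HELO"))
print(deliver("192.0.2.10", 9999, "hello?"))
```

The IP address gets the message to the right computer; the port number then picks the application on that computer.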
HTML The Language of the Web:
HTML is simply a markup language, or formatting language, that directs the web
browser on how to format or render the data in an HTML document, or web page as
such documents are called. Markup is simply commands included in the file or
document along with the data. These commands are used by the application reading
and processing the data to give it direction on how to process the data. An
example of this would be an MS Word document. In a Word document is the data you
type in or insert. Along with the data are bits that tell Word how to display
that data: the formatting commands you specify from the toolbars of MS Word.
HTML is simply a human readable formatting language to direct a browser on how
to render or display the data/content of a web page on the browser window. These
commands are called HTML tags or HTML elements.
There are also HTML tags or elements that direct the web browser to include
certain data such as images or sounds.
Tools for Creating HTML documents:
There are any number of tools for creating HTML documents. Some are very
sophisticated; others are not. There is a rule that says: the best tool is the
one you know and are comfortable with. I strongly believe in this rule.
For this course I want you to start out with MS Notepad. I believe this is the
best tool for understanding HTML, because it does not provide handy services for
you. If you are using Notepad you must understand the HTML to get the document
formatted, working, and looking right. HTML is very easy to learn, but you can't
learn it if you don't write it yourself.
It is also important that you learn to write it yourself because, no matter how
good your HTML editor/converter is, you will always need to tweak its output
yourself with a simple text editor.
As the course progresses I will allow you to use more sophisticated tools.
I do, however, recommend at this time that if you haven't looked at other HTML
tools you start looking at tools like:
I also recommend you look at other browsers, such as Firefox and Amaya from the
W3C themselves.
To see some examples of an HTML document, open any web page and from the IE
menu select View | Source to see the raw HTML.
Marking Elements with Tags:
Notice that the tags or elements begin with a less-than symbol and end with a
greater-than symbol. Also note that an opening tag must be closed with an ending
tag, which is the same as the opening tag with a forward slash after the
less-than symbol; for example, <b> is closed by </b>. See the examples in the
book.
HTML is not case sensitive; however, I recommend that when you create an opening
tag, the closing tag be in exactly the same case.
XHTML is case sensitive: element and attribute names must be written in lower
case. See the XHTML standard at
http://www.w3.org/TR/xhtml1
Empty elements are HTML tags that are not meant to contain data. So instead of
having an opening and a closing tag, which is the rule, there is a short-hand
notation. Simply begin the element with the less-than symbol and end it with a
forward slash followed by a greater-than symbol, placed after the element name
or after the attribute list if there is one, as in <br />.
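To see opening tags, closing tags, and the empty-element short-hand side by side, here is a small sketch in Python (used only for illustration; it is not part of this course). It relies on the HTMLParser class from Python's standard library; the class name TagLogger and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Record each tag the parser meets and what kind of tag it is."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for the short-hand form that ends in "/>", such as <br />
        self.events.append(("empty", tag))

logger = TagLogger()
logger.feed("<P>First line<br />second line</p>")
print(logger.events)
```

Notice that the opening tag was typed as <P> but is reported as p: the parser converts tag names to lower case, which reflects HTML itself not being case sensitive.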
White Space and HTML:
White space consists of characters that take up room in a document but do not
display as visible marks. Such characters include the space and the tab. The
new-line produced by the Enter key is also considered white space.
When an HTML document is processed by a web browser, extra white space is
collapsed: any run of spaces, tabs, and new-lines is displayed as a single
space at most. So use white space as you see fit to make your HTML documents
more readable. For example, I put the opening and closing head and body tags on
their own lines to make them easy to spot.
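Here is a rough sketch in Python (used only for illustration) of what a browser does with white space in ordinary text: every run of spaces, tabs, and new-lines ends up as a single space. The function name collapse_whitespace is made up for the example, and the sketch is only an approximation; it ignores special cases such as preformatted text.

```python
def collapse_whitespace(text):
    """Collapse every run of spaces, tabs, and new-lines into a single space."""
    # str.split() with no argument splits on any run of white space
    return " ".join(text.split())

source = "Hello,\n\n     world!\tThis   is   HTML."
print(collapse_whitespace(source))
```

However the source text is spread across lines or indented, the words come out separated by single spaces, which is why you are free to lay out your HTML file for readability.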
Element Attributes:
Many HTML tags have attributes that modify the behavior of an HTML element. For
example, the body tag has an attribute (bgcolor) that lets you set the
background color of your HTML document. Attributes take the form
attribute-name = value.
An HTML tag can contain many attributes.
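The name = value form is easy to see when a program pulls the attributes out of some tags. The following is a minimal sketch in Python (used only for illustration), built on the standard library's HTMLParser class. The class name AttributeCollector, the file name logo.gif, and the attribute values are made up for the example.

```python
from html.parser import HTMLParser

class AttributeCollector(HTMLParser):
    """Collect the attributes of every opening tag as a dictionary."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, one pair per attribute
        self.found[tag] = dict(attrs)

collector = AttributeCollector()
collector.feed(
    '<body bgcolor="yellow" text="black">'
    '<img src="logo.gif" alt="Logo">'
    '</body>'
)
print(collector.found)
```

Each tag here carries two attributes, and each attribute is one name paired with one value.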
The Structure of an HTML document:
A computer is a device for manipulating bits and bytes. All files in a computer
are sequences of bits. A bit is a 1 or a 0; more specifically, a bit indicates
whether a circuit is on or off. If a circuit in the computer is on, the bit it
represents is a 1. If the circuit is off, the bit it represents is a 0. Files
are nothing more than streams of bits. A byte is simply a group of eight bits.
In order for a computer application to work, the application must be able to
translate the bits in a file into data and information of a higher level. The
data or information can be things like numbers, letters, graphical data, music,
what have you. The important point here is that bits are used to encode any type
of information, such as music, Word documents, photographs, or video. These
bits, or binary data, can then be manipulated by a computer.
An HTML document is a plain text document. This means that the bits that make up
an HTML file are decoded using a character table such as the ASCII table.
Remember that a byte is eight bits, and a bit is either 0 or 1, so there are 256
bit patterns in the range 00000000 to 11111111. Standard ASCII assigns the first
128 of these patterns (00000000 to 01111111) to the numbers, letters (upper and
lower case), punctuation marks, and special symbols used in the English
language; extended 8-bit character sets use the remaining 128. So to decode an
ASCII or plain text file, you simply take eight bits at a time, find that
pattern in the ASCII table, and use that pattern's corresponding letter or
symbol.
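The decoding step just described can be tried out with a short sketch in Python (used only for illustration; it is not part of this course). The function names decode_ascii_bits and encode_ascii_bits are made up for the example, and the bit patterns are written as text with spaces between the bytes so they are easy to read.

```python
def decode_ascii_bits(bits):
    """Decode space-separated 8-bit patterns into the text they stand for."""
    # int(pattern, 2) reads the pattern as a binary number; chr() looks up
    # the character with that number in the character table
    return "".join(chr(int(pattern, 2)) for pattern in bits.split())

def encode_ascii_bits(text):
    """Do the reverse: turn each character into its 8-bit pattern."""
    return " ".join(format(ord(ch), "08b") for ch in text)

print(decode_ascii_bits("01001000 01010100 01001101 01001100"))
print(encode_ascii_bits("<p>"))
```

The first line printed is HTML, because the four patterns are the numbers 72, 84, 77, and 76, which the ASCII table assigns to H, T, M, and L. The second line shows that even a tag such as <p> is stored as nothing more than three bytes.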