URL: http://www.columbia.edu/~fdc/sample.html

DO-IT-YOURSELF WEB AUTHORING - A BEGINNER'S HTML TUTORIAL


A random photo... (The Hudson River at 125th Street about 2002)

Frank da Cruz
Updated in 2019 and 2021 for HTML5 and "fluidity".

This page shows how to create Web pages by hand, the original way. Although
today most Web pages are created by "Web authoring systems" that are designed to
shield you from technical details, the fact is that HTML (the "programming"
language of the Web) is not that difficult, as you can see if you follow this
tutorial. To get an idea of what is possible with this technique, see these 100%
hand-made websites:

 * The New Deal in New York City 1933-1943
 * The History of Computing at Columbia University 1890-2005
 * The Washington DC National Mall in World War II
 * Arlington, Virginia, 1956-61: The Hall's Hill Segregation Wall
 * Frankfurt, Germany, 1959-61


CONTENTS

 1.  Creating a Web Page
 2.  HTML Syntax
 3.  Special Characters
 4.  Converting Plain Text to HTML
 5.  Effects
 6.  Lists
 7.  Links
 8.  Tables
 9.  Viewing your Web page
 10. Installing your Web Page on the Internet
 11. Where to go from here
 12. Postscript: Cell Phones

You can create a Web page on your desktop computer but nobody can see it but
you. If you want other people to be able to see your Web pages, you need an
account on a computer that has a Web server. Nowadays most people have their own
computers on their desks, but normally they don't have Web servers and anyway
you don't want the whole world coming into your desktop computer to see your web
page because (a) it's not designed for that, and (b) who knows what else they
might see. And (c) for security reasons, Web servers should be managed by
professionals. Most institutions have big central shared computers for this
purpose, which usually have a Unix-like operating system such as Linux. You need
an account on one of these so you can put your web pages there. If you don't
have access to such a computer, you can get a low-cost account on a service like
Panix.com.

You can still create Web pages on your own computer and look at them with your
computer's Web browser, but for other people to see them, you have to upload
them to the "big" computer that has the Web server. The rest of this document
is about how to create your first Web page.


1. CREATING A WEB PAGE

This page was typed by hand. Anybody can do this; you don't need any special
"web creation" tools or HTML editors, and the pages you make can be viewed from
any browser. To see how this page was made, choose View Source (or View Page
Source, or View Document Source) in your browser's menu (or — in at least Chrome
and Firefox — Ctrl-U on your keyboard). A simple web page like this one is just
plain text with HTML commands (markup) mixed in. HTML commands (properly called
"tags") themselves are plain text.

When you're just learning and want to experiment, you can do everything on your
PC. Create a new directory ("folder") for your website, and then put the
web-page files (HTML plus any pictures) in it. Use NotePad or other plain-text
editor (not word processor) on your PC to create your "home page", a file named
index.html, which you can view locally with your Web browser. (You can also use
a word processor such as Word or WordPad if you save in "plain text", "text",
"text document", or "text document MS-DOS format".) Later I'll explain how you
can install your web site on the Internet.

Once you've made your "home page" (index.html) you can add more pages to your
site, and your home page can link to them.


2. HTML SYNTAX

Web pages are written in Hyper Text Markup Language (HTML). HTML has three
special characters: <, &, >. An HTML command is enclosed in <...>, for example
<p>, which is a paragraph separator, or <b> ("begin bold") and </b> ("end
bold"). So the following HTML text:

> This sentence contains <b>bold</b> text.

produces:



> This sentence contains bold text.

A Web page starts with a series of HTML commands, and ends with a few more. The
contents go in between:



> <!DOCTYPE HTML>
> <html lang="en">
> <head>
> <META charset="UTF-8">
> <META name="viewport"
>  content="width=device-width, initial-scale=1.0">
> <title>Sample Web Page</title>
> </head>
> <body>
> 
> 
> (Contents go here)
> 
> 
> 
> </body>
> </html>

The first line (DOCTYPE) specifies which markup language the page uses (HTML =
Hypertext Markup Language); just copy this line. The next line,
<html lang="en">, starts the page and specifies the (human) language it is
written in (language codes are specified here), and is matched by the line
</html>, which closes the page. <head> starts the heading, which contains a
title to be displayed on the browser's title bar and a declaration of the
character set (nowadays it should always be UTF-8) and the "viewport" line which
is a compulsory adaptation for cell phones, "smart" watches, etc. </head> closes
the heading. The head can also contain other items such as style parameters that
you can learn about later; for example by asking Google ("HTML how do I change
the font size?").

The <body> tag starts the body of the document and is closed by the </body> tag.

As you can see, most HTML commands come in begin-end pairs: <b>...</b>,
<head>...</head>, etc. The closing part of the command has a slash (/) between
the < and the first letter of the command.

Blank lines and line breaks are ignored. The browser automatically "flows" your
text into lines and paragraphs that fit in its window. Paragraphs must be
separated by <p>. Line breaks can be forced by <br>.
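As a small illustration of these rules (the wording is invented for this example), a fragment like the following could take the place of "(Contents go here)" in the skeleton above:

```html
<!-- Line breaks in the source are ignored; the browser reflows the text. -->
This sentence is typed
on three separate lines
but the browser displays it as one flowing line.
<p>
This begins a new paragraph.<br>
The br tag forced this sentence onto a line of its own.
```

Resizing the browser window should change where the first sentence wraps, while the paragraph separator and the forced line break stay where they are.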



Example for Windows: Use the mouse to copy the HTML above into NotePad. Then
save the file (File -> Save As...) in your Web directory as index.html. Suppose
your Windows username is Olga. Then (depending on which version of Windows you
have) this might be:

> C:\Users\Olga\Desktop\Web\index.html

Now to see your new web page, just double-click on the Web folder and then
double-click on index.html.

Now you're ready to start adding "content" to your web page. Go back to NotePad
and replace the title and "(Contents go here)" with whatever you want. Any time
you want to see the result, use File -> Save in NotePad and then click the
Reload button on your browser.

The next sections tell how to achieve different kinds of effects.


3. SPECIAL CHARACTERS

HTML special "character entities" start with ampersand (&) and end with
semicolon (;), like "&euro;" = "€". The ever-popular "no-break space" is &nbsp;.
There are special entity names for accented Latin letters and other West
European special characters such as:



> Entity     Name               Character
> &auml;     a-umlaut           ä
> &Auml;     A-umlaut           Ä
> &aacute;   a-acute            á
> &agrave;   a-grave            à
> &ntilde;   n-tilde            ñ
> &szlig;    German double-s    ß
> &thorn;    Icelandic thorn    þ

(The table above is shown in the basic, default style of HTML. Of course there
are many ways to customize the appearance of tables; more about this below.)



Examples: For Spanish you would need: &Aacute; (Á), &aacute; (á), &Eacute; (É),
&eacute; (é), &Iacute; (Í), &iacute; (í), &Oacute; (Ó), &oacute; (ó), &Uacute;
(Ú), &uacute; (ú), &Uuml; (Ü), &uuml; (ü), &Ntilde; (Ñ), &ntilde; (ñ), &iquest;
(¿), &iexcl; (¡).
Example: Añorarán = A&ntilde;orar&aacute;n.



For German you would need: &Auml; (Ä), &auml; (ä), &Ouml; (Ö), &ouml; (ö),
&Uuml; (Ü), &uuml; (ü), &szlig; (ß).
Example: Grüße aus Köln = Gr&uuml;&szlig;e aus K&ouml;ln.

CLICK HERE for a complete list. When the page encoding is UTF-8, which is
recommended, you can also enter any character at all (Roman, Cyrillic, Arabic,
Hebrew, Greek, Japanese, etc.), either as numeric entities or (if you have a way
to type them) directly from the keyboard.

And remember: if you want to include <, &, or > literally in text to be
displayed, you have to write &lt;, &amp;, &gt;, respectively.
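For instance, a sentence that talks about HTML itself has to escape every special character it wants to show. The sentence below is made up for illustration:

```html
<!-- Displays as: To make text bold, write <b>bold</b> & remember the closing tag. -->
To make text bold, write &lt;b&gt;bold&lt;/b&gt; &amp; remember the closing tag.
```

Without the entities, the browser would treat the b tags as real markup and show the word in bold instead of showing the tags.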


4. CONVERTING PLAIN TEXT TO HTML

If you have a plain text file that you want to convert to HTML, load the file
into a plain-text editor and then follow these steps.



 1. Change all occurrences of "&" to "&amp;".
 2. Change all occurrences of "<" to "&lt;".
 3. Change all occurrences of ">" to "&gt;".
 4. Change any accented letters to HTML entity names (previous section)*.
 5. Put "<p>" between each paragraph.
 6. Insert the standard prolog at the top, substituting an appropriate title.
 7. Add </body> and </html> at the end.
 8. Save the result as xxx.html, where xxx is the part of the original file's
    name before the dot, or whatever-else-you-want-to-call-it.html.
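Here is a sketch of what the steps above might produce for a two-paragraph plain-text file. The file contents and the title are invented for the example; the prolog is the one from Section 2.

```html
<!DOCTYPE HTML>
<html lang="en">
<head>
<META charset="UTF-8">
<META name="viewport"
 content="width=device-width, initial-scale=1.0">
<title>Cafe Notice</title>
</head>
<body>
<!-- Original plain text (two paragraphs):
       Tom & Jerry's café: prices < $5.
       Open > 12 hours a day.
-->
Tom &amp; Jerry's caf&eacute;: prices &lt; $5.
<p>
Open &gt; 12 hours a day.
</body>
</html>
```

The ampersand is converted first (step 1) so that the ampersands introduced by the later steps are not themselves converted a second time.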

If you are a Kermit user, you can find a script to convert plain text to HTML
HERE.

If the text contains lists, tables, or other structures, read on.

If you have a Microsoft Word document you want to convert to HTML, and your copy
of Word does not allow the file to be "Saved As" HTML, then save it as plain
text and follow the same instructions. In this case you lose the "richness"
(bold, italics, font changes, etc) when you save the file, and will have to put
the effects back by hand (next section).



* Not necessary if your text is already encoded as UTF-8. If it's not UTF-8, you
can identify the encoding in the <META charset="..."> directive, but this topic
is a bit advanced for this simple tutorial.


5. EFFECTS

The rest of this document shows some of what you can do with simple HTML
commands, but I don't explain how to do it. To see that, just tell your browser
to View Source and compare the HTML in the source window with the result in the
original window.



> Note: In this and the following sections, I use some "deprecated" features
> from earlier HTML versions because they are easier for beginners to learn (for
> example <big>...</big> versus <span style="font-size:120%">...</span>).

This sentence is bold. This sentence is in italics. This sentence is in bold
italics. This sentence is in typewriter font. This sentence has underlined words
and underlined bold words. This sentence has colored words. This sentence has
big words. This one has very big words. This one has very small words.
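Since the formatting is only visible in a browser, here is a guess at markup that would produce the effects just described. Which words are emphasized and the color used are assumptions, and the author's actual source may differ; tt, big, and font are among the older elements mentioned in the note above.

```html
<b>This sentence is bold.</b>
<i>This sentence is in italics.</i>
<b><i>This sentence is in bold italics.</i></b>
<tt>This sentence is in typewriter font.</tt>
This sentence has <u>underlined words</u>
and <u><b>underlined bold words</b></u>.
This sentence has <font color="red">colored words</font>.
This sentence has <big>big words</big>.
This one has <big><big>very big words</big></big>.
This one has <small><small>very small words</small></small>.
```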



> This is a "blockquote", which is like a regular paragraph, but it has bigger
> margins. Begin a blockquote with <blockquote> and end it with </blockquote>.
> Environments such as blockquotes, lists, etc, that have a beginning and an end
> always use paired commands like <blah>...</blah>.
> 
> 
> 
> > This is a blockquote inside another blockquote, which shows how HTML
> > environments can be "nested".
> 
> Here we are back in the first blockquote again.

And here we are back outside of the first blockquote.
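The nesting shown above could be written roughly as follows (the text is shortened; this is a sketch rather than the page's actual source):

```html
<blockquote>
This is a blockquote.
<blockquote>
This is a blockquote inside another blockquote.
</blockquote>
Here we are back in the first blockquote again.
</blockquote>
And here we are back outside of the first blockquote.
```

Each closing tag ends the most recently opened blockquote, which is what makes nesting work.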


6. LISTS

Here is an Unordered (bullet) List (<ul>..</ul>):
 * This is a List Item (<li>).
 * This is another item.
 * This is yet another item.

Here is an Ordered (numbered) List (<ol>..</ol>):



 1. This is a List Item (<li>).
 2. This is another item.
 3. This is yet another item.

And here is a Description List (<dl>), using Kermit commands as an example:



SET FILE TYPE BINARY
    This command tells Kermit to transfer files in binary mode. In other words,
    don't mess with the file, just send it as-is. The result on the receiving
    computer should be identical to the original.

SET FILE TYPE TEXT
    This command tells Kermit to transfer files in text mode. This should be
    used with plain-text files, especially when transferring them between
    computers with different file formats or operating systems, such as VMS and
    Unix, or Unix and Windows. It converts the file's format and character-set
    (if necessary) so the received file is usable on the destination computer.
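The three kinds of list shown so far could be marked up along these lines. This is a sketch with the descriptions shortened, not the page's actual source; dt marks the term being described and dd its description.

```html
<ul>
<li>This is a List Item.</li>
<li>This is another item.</li>
</ul>

<ol>
<li>This is a List Item.</li>
<li>This is another item.</li>
</ol>

<dl>
<dt>SET FILE TYPE BINARY</dt>
<dd>This command tells Kermit to transfer files in binary mode.</dd>
<dt>SET FILE TYPE TEXT</dt>
<dd>This command tells Kermit to transfer files in text mode.</dd>
</dl>
```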

You can have lists within lists:



 1. A gromet
 2. A widget
 3. A framus, which consists of the following components:
    * A doohickey.
    * A veeblefetzer -- these come in several colors:
      1. Purple
      2. Purple
      3. Purple
    * A whatchamacallit.
 4. A doodad.

And you can have ordered lists that use letters instead of numbers:



 a. Pennies
 b. Nickels
 c. Dimes
 d. Quarters
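A nested list is made by putting a complete list inside a list item, and lettering is selected with the type attribute of the ol tag. The following sketch (indentation added only for readability) is one way to write the two lists above:

```html
<ol>
<li>A gromet</li>
<li>A widget</li>
<li>A framus, which consists of the following components:
  <ul>
  <li>A doohickey.</li>
  <li>A veeblefetzer -- these come in several colors:
    <ol>
    <li>Purple</li>
    <li>Purple</li>
    <li>Purple</li>
    </ol>
  </li>
  <li>A whatchamacallit.</li>
  </ul>
</li>
<li>A doodad.</li>
</ol>

<ol type="a">
<li>Pennies</li>
<li>Nickels</li>
<li>Dimes</li>
<li>Quarters</li>
</ol>
```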


7. LINKS

Links can be internal within a Web page (like to the Table of Contents at the
top), or they can be to external web pages or pictures on the same website, or
they can be to websites, pages, or pictures anywhere else in the world.

Here is a link to the Kermit Project home page. And here is what the HTML looks
like:

> <a href="http://www.kermitproject.org/">
> Kermit Project home page
> </a>

The part inside the quotes is called the URL (Uniform Resource Locator). Here is
a link to Section 6 of the page you are reading, and the HTML:

> <a href="#lists">Section 6</a>

The "#" indicates an internal section ID, in this case:

> <h3 id="lists">6. Lists</h3>

Here is a link to Section 4.0 of another document, at another website: the
C-Kermit for Unix Installation Instructions. And the HTML:

> <a href="http://kermitproject.org/ckuins.html#x4.0">
> Section 4.0
> </a>

Here is a link to a picture: CLICK HERE to see it.

If you want to link to a particular section of somebody else's Web page, visit
the page, "view source", search for the text at that spot and see if there is an
"id=" clause; if so, use the ID as shown just above; if not you're out of luck.

If you want to link to a particular page of a PDF document, just put "#page=123"
(replace by the desired number) at the end of the URL.
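For example, a link to page 123 of a PDF might look like this. The host name and file name are placeholders, not a real document:

```html
<!-- www.example.com and manual.pdf are placeholders for illustration -->
<a href="http://www.example.com/manual.pdf#page=123">
Page 123 of the manual
</a>
```

Whether the browser actually jumps to that page depends on the PDF viewer it uses.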


8. TABLES

Here's a simple table with some headings and a few rows:



> Heading A   Heading B   Heading C
> Cell 1A     Cell 1B     Cell 1C
> Cell 2A     Cell 2B     Cell 2C
> Cell 3A     Cell 3B     Cell 3C

Same table again but with borders:



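> +-----------+-----------+-----------+
> | Heading A | Heading B | Heading C |
> +-----------+-----------+-----------+
> | Cell 1A   | Cell 1B   | Cell 1C   |
> | Cell 2A   | Cell 2B   | Cell 2C   |
> | Cell 3A   | Cell 3B   | Cell 3C   |
> +-----------+-----------+-----------+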

The appearance with double borders is the default (and therefore easiest) table
style. You can use table attributes to change the appearance.

Here's the same table again but with Column C right-adjusted:



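> +-----------+-----------+-----------+
> | Heading A | Heading B | Heading C |
> +-----------+-----------+-----------+
> | Cell 1A   | Cell 1B   |   Cell 1C |
> | Cell 2A   | Cell 2B   |   Cell 2C |
> | Cell 3A   | Cell 3B   |   Cell 3C |
> +-----------+-----------+-----------+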

And finally, here it is again with some "style" parameters applied to get rid of
the ugly double borders, which you can see in the <style> section of the <head>
at the top of this page, if you "view source".



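> +-----------+-----------+-----------+
> | Heading A | Heading B | Heading C |
> +-----------+-----------+-----------+
> | Cell 1A   | Cell 1B   | Cell 1C   |
> | Cell 2A   | Cell 2B   | Cell 2C   |
> | Cell 3A   | Cell 3B   | Cell 3C   |
> +-----------+-----------+-----------+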

So with just three lines added to the <style> section at the top of the page,
you can make all your tables look better.
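Here is a minimal sketch of such a page. The three style lines are illustrative assumptions (including the class name "right"), not a copy of the ones this page actually uses:

```html
<!DOCTYPE HTML>
<html lang="en">
<head>
<META charset="UTF-8">
<title>Table example</title>
<style>
/* Assumed style lines: single thin borders and right-adjusted cells */
table { border-collapse: collapse; }
th, td { border: 1px solid gray; padding: 4px 8px; }
td.right { text-align: right; }
</style>
</head>
<body>
<table>
<tr><th>Heading A</th><th>Heading B</th><th>Heading C</th></tr>
<tr><td>Cell 1A</td><td>Cell 1B</td><td class="right">Cell 1C</td></tr>
<tr><td>Cell 2A</td><td>Cell 2B</td><td class="right">Cell 2C</td></tr>
<tr><td>Cell 3A</td><td>Cell 3B</td><td class="right">Cell 3C</td></tr>
</table>
</body>
</html>
```

Each tr is a row, th is a heading cell, and td is an ordinary cell. Removing the style section and writing the table tag with a border="1" attribute should give the default double-border appearance instead.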


9. VIEWING YOUR WEB PAGE

Anyway, back to basics. If you make a simple index.html in your Web directory
like:

> <!DOCTYPE HTML>
> <html lang="en">
> <head>
> <title>My first web page</title>
> <META charset="utf-8">
> <META name="viewport"
>  content="width=device-width, initial-scale=1.0">
> </head>
> <body>
> <h2>This is a heading</h2>
> And this is some text.
> </body>
> </html>

Then if you double-click on index.html, it will open in your Web browser.

Now you can work on your page's <body>: add more text, add some images, add some
links, add subheadings, some lists, some tables, whatever you want. Each time
you make a change, reload the page in your browser (usually done by clicking on
the ⟳ symbol, or typing Ctrl-R).

By the way, a web page can have any name at all; it doesn't have to be
index.html. Index.html is a special name that is used for the "home page" of a
website. To open a web page that has some other name, right-click on the
filename and then choose "Open with..."; then click on your Web browser's name.


10. INSTALLING YOUR WEB PAGE ON THE INTERNET

How to put your web page on the Internet depends on your Internet Service
Provider (ISP). At Columbia University, each user has a "shell account" on the
central server, which runs a Unix-based operating system, and which you can
access with a terminal emulator such as Kermit. Here's an example that applies
to Columbia University's web server, showing how to upload your files from
Windows:



> There are easier ways to do this than what I describe below, but they require
> add-on software. The following method should work for everybody who has
> Windows and an Internet connection.

If you create a public_html subdirectory of your login directory, give it
"world" read and search permission, and then create an index.html file in that
directory and give it world read permission, you'll have a home page. In this
example "$" is the shell prompt (yours might be different), and what you type is
underlined. CAUTION: the directory name is public_html but the underscore might
be obscured by the underline in the examples below. Whenever typing "public_html"
always include the underscore. CAUTION#2: Some Web hosting sites might use
a different name for the user's Web directory.



> $ cd                      (Change to your login directory)
> $ mkdir public_html       (Create public_html subdirectory)
> $ chmod 755 public_html   (Give it world read/search permission)
> $ cd public_html          (Enter the public_html subdirectory)

You only have to do this part once. Remember, it's public_html with an
underscore, which tends not to show up when a command is underlined.

Let's assume you have created a website in the Web folder on your PC. Here's an
example of how to upload your Web files to your public_html directory on
Columbia University's Cunix server using FTP (File Transfer Protocol). First
start the FTP program:



> Start -> Run

and type "ftp" in the box. An FTP window opens and an "ftp>" prompt appears.
Type the underlined commands at the "ftp>" prompt (substituting your own user
ID, etc):

> ftp> lcd Desktop
> Local directory now C:\Users\olga\Desktop.
> ftp> lcd Web
> Local directory now C:\Users\olga\Desktop\Web.
> ftp> open cunix
> Connected to cunix.cc.columbia.edu.
> 220 Cunix FTP server (Version 5.60) ready.
> User (cunix.cc.columbia.edu:(none)): olga
> 331 Password required for olga.
> Password: (type your password here)
> 230 User olga logged in.
> ftp> cd public_html ("public_html" with underscore)
> ftp> binary
> ftp> put index.html
> 200 PORT command successful.
> 150 Opening BINARY mode data connection for index.html.
> 226 Transfer complete.
> ftp: 285 bytes sent in 0.00Seconds 285000.00Kbytes/sec.
> ftp> site chmod 644 index.html
> 200 CHMOD command successful
> ftp> bye

This sends the index.html file to your public_html directory on the server. You
can send any other file by substituting its name for "index.html". If you want to
send all the files in your Web folder, replace "put index.html" with "put *"
(asterisk, meaning "all files" in this directory). Always use binary mode unless
you know what you're doing.

If the "site chmod" command failed (this service is not supported by some FTP
servers), you have one more step. Before others can see your web files, you have
to give them "world read" permission. Again, log in to the server using a
terminal emulator (Telnet, SSH, Kermit, whatever), and:



> $ cd ~/public_html        (Enter the public_html subdirectory)
> $ chmod 644 *             (Make all files publicly readable)

Now you have a home page. If you were at Columbia and your login ID was "olga",
the address (URL) of your home page would be:



> http://www.columbia.edu/~olga/

If you want to add pictures to your Web page, you can upload those too (also
with Kermit or FTP), and you also have to "chmod 644" all the files to make them
readable by everybody. Every time you add new files to your public_html
directory, you have to "chmod 644" them so they are accessible, either in the
FTP session itself (as shown previously), or by logging in to the host and:



> $ cd ~/public_html ("public_html" with underscore)
> $ chmod 644 *

Pictures should be in JPG or PNG or GIF format. To include a picture ("image")
in your page, include a sequence like this at the desired spot:



> <img src="filename" alt="brief description">

Replace filename by the name of the file (e.g. skyline.jpg). Almost every HTML
tag can be customized by "attributes" in the begin tag. For example if you want
the image to scale itself to the viewer's window (on a computer, cell phone, or
other device), and furthermore you want the text of the page to flow around it,
you can do:

> <img alt="brief description" style="width:50%; max-width:480px; float:left;
> margin:10px;" src="filename">

You can look up the attributes in Google, just search for html width, html
float, etc.

Now you have your own home page on the Web, and your own URL (Uniform Resource
Locator, or Web address). In this example, the URL is:



> http://www.columbia.edu/~olga/

Of course, if you prefer, you can also do all the Web-page editing directly on
the server, using a server-based text editor like EMACS or Vi while logged in
to the Unix shell. In that case you don't need to upload anything (except maybe
photos), but then you also need to be more familiar with the server's Unix
environment and commands and utilities.


11. WHERE TO GO FROM HERE

Most Web pages are created by hideous bloated "Web Authoring" tools, which are
usually designed to hook you (and readers of your web pages) into some corporate
profit-making scheme. If all you want is text with some pictures and links, some
section headings, and maybe some tables, as opposed to spinning blinking popup
holograms with streaming video, sound effects, etc, it's best to keep it simple
and do it yourself. This is how the Web started off in the HTML 1.0 days of the
early 1990s. The ingenious thing was that it was self-propagating. If you saw a
web page with a certain effect and wanted to know how it was done, you could
simply "view source" to get the "source code" and then adapt it to your own
page. You can still do that with pages that look like this one, but since most
Web pages are no longer made by hand, you'll often see tons of incomprehensible
gibberish (the more special effects, the more gibberish), for example at CNN.

Anyway, if you have mastered the simple techniques shown in this page, you know
the basics. Which is more than can be said of many "web designers" who only know
how to use prepackaged software to create web pages by picking things from menus
and moving things around with a mouse. To go further, you can almost always find
out how to do what you want by searching Google ("html how do I ...?"), or
looking at the HTML code of different websites (browser "view source" command)
but, again, only for pages that look like this one.

Of course HTML is a standard, and here are the official references:

 * The HTML 5.2 specification (2017).
 * Cascading Style Sheets (yet another "standard" layered on top of HTML, which,
   like HTML, goes through many versions, each one making the previous one
   obsolete).

You can check the validity of your web page at the W3C Markup Validation
Service. Note that this page itself does not pass the Validator because it uses
a number of "obsolete" elements. That's because (a) they are much easier to
explain, and (b) they still work. For the first 20 years or so HTML was in
constant flux, but with the release of HTML5 in 2014, it seems to be pretty
stable.

If you have made mistakes the Validator will let you know, and if you have used
"old" or "deprecated" HTML features it will let you know that too, and usually
also suggest a modern replacement.


12. POSTSCRIPT: CELL PHONES

The original Web was composed of pages designed to fit on desktop computer
screens, which, over time, became wider and wider. But then suddenly they also
had to fit on minuscule cell-phone and even "smart watch" screens. The main
pitfall is that an image might be too wide for the screen, so the image width
should be specified as a percentage of the viewport width, e.g.:

> <img alt="Brief description" 
>  title="Slightly longer description"
>  src="picture-of-something.jpg"
>  style="width:100%;">

Text, on the other hand, usually just flows to accommodate the viewport. But if
your page includes text that must not be "wrapped" (for example, program source
code, poetry, computer dialog transcripts), you have to enclose such sections
within:



> <div style="overflow-x:auto; white-space:pre">
> material that must not wrap
> </div>

as has been done in several places above, in which case a horizontal scroll bar
will appear automatically if the non-wrapping text is wider than the viewport.
If you are viewing this page on a wide screen, you can see this effect if you
squeeze your browser horizontally to its minimum width and then scroll through
this page.

(End)

Frank da Cruz
Page created: 1992
Last update: 17 September 2021