One of the fundamental differences between the Web and other forms of mass media is that the Web is interactive.
With radio, television, newspapers, magazines and the like, communication is largely one way. The publisher presents information that users consume (or not) as they see fit. While the publisher may try to elicit a response from the user, both a temporal disconnect and a change of medium are often required. For example, a TV commercial may encourage the viewer to “Call now,” but that seldom results in immediate action. When it does eventually result in an action, that communication takes place through another medium such as the phone.
The Web, by comparison, can support two-way communication. A Web page can present the user with info and attempt to elicit an immediate response from the user. A temporal disconnect is not necessary (or even particularly likely), and the response can occur over the same medium that carried the message. For example, a Web site can present info about a product that is for sale and on the very same page offer users an opportunity to order that product. Users don’t have to add it to their “to do” list and deal with it later. Users don’t have to pick up the phone or go to the mailbox. They are free to act immediately upon their impulse and order the product they desire. The same advantages apply to more than just placing orders. The two-way nature of the Web can be used to collect all manner of info from users, including comments, feedback, reviews, questions, testimonials and anything else you can imagine. The primary tool for collecting info from users on the Web is the form.
When you place an order, submit a request, provide your comments or do anything similar on paper, it typically involves filling out and submitting a paper form. When you perform similar actions on the Web, they also typically involve filling out and submitting a form.
Web forms are similar to paper forms in many ways. They generally have clearly marked areas into which you enter info, and those areas are typically labeled to let you know exactly what info you are expected to enter into them. Scattered among the entry areas, there are often instructions or additional details to help you complete the form correctly.
There are, however, also some useful differences. Info is entered into Web forms with a mouse or keyboard, eliminating the problems associated with illegible handwriting. Web forms are much easier for computers to process without human intervention, making them much more efficient. Since the computer can also check the contents before accepting a form, it’s much easier to ensure the completeness and accuracy of Web forms. And Web forms have no physical presence, making it feasible to record the info they contain without warehousing stacks of paper.
Within a Web form, the parts of the form used to provide info are collectively known as controls.
XHTML provides the ability to generate several different types of controls within a form. In general, each type of control is intended for a specific purpose.
There are also controls for more advanced purposes with which we will not concern ourselves in this course.
It’s important to understand what forms do (and don’t do).
Forms are strictly for collecting input from the user. They don’t actually act upon that input. Instead, the browser generally packages the input into a form data set and sends it to a program running on the server host. This program is referred to as the form’s processing agent. Since it’s just a software program, it can do anything you can imagine a software program doing. That means it can send e-mail; generate images, CSS or XHTML; access databases and other files; perform calculations; and/or much, much more.
When the user submits a form, the browser builds a form data set and sends it to the form processing agent by packaging it up in an HTTP GET or POST request message and sending it to the appropriate server. The server extracts the form data set from the HTTP request and passes it to the appropriate form processing agent. The form processing agent runs, using the form data set as its input, and typically produces an XHTML document as its output. In addition to producing an XHTML document, the form processing agent may do any number of other tasks as well. The XHTML document produced as output by the form processing agent gets passed back to the server. The server then sends the resulting document back to the browser in an HTTP response message. And, of course, the browser renders the document it receives for the user to view.
From the user’s perspective, therefore, submitting a form is very similar to activating a hyperlink. It’s just that the page they see as a result was created especially for them. Such pages are often said to be dynamically generated pages.
There are several important concepts involved in understanding form data sets.
When forms are written in XHTML, each control is identified with a control name assigned by the page author. When forms are completed by a user, each control is assigned a current value as a result of the user’s actions. Many controls have an initial value that is set by the page author and utilized by the browser when the form is first displayed or reset by the user.
When the form is submitted by a user, any control for which the author has provided a control name and the user has provided a current value is determined to be a successful control. Only the successful controls are considered valid for the purpose of submission to the form processing agent. The control name and current value of each successful control are paired to form a name-value pair. These pairs are then collected together to create the form data set that the browser sends to the server. And the server eventually passes this form data set to the form processing agent.
By examining the name-value pairs in this form data set, the form processing agent can act in response to the info the user has entered into the form’s controls.
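As a rough illustration of how control names and current values pair up, consider the sketch below. The action URL and the control names (firstname and color) are hypothetical and were chosen only for this example. The closing comment shows the name-value pairs that would result if the user typed Pat into the first control and left the second at its initial value.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    <!-- Control name: firstname; no initial value -->
    <input name="firstname" />
    <!-- Control name: color; initial value: blue -->
    <input name="color" value="blue" />
    <!-- This submit button has no control name, so it adds no pair -->
    <input type="submit" />
  </div>
</form>
<!--
  Form data set if the user types Pat and changes nothing else:
    firstname = Pat
    color     = blue
-->
```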
There are two methods that a browser can use to submit a form data set to the server for the form processing agent.
In the GET method, the form data set is tacked onto the URL and an HTTP GET request is used to send it to the server. This works because the URL of the requested resource is sent as part of an HTTP GET request. When the form data set is appended to the end of that URL, it gets sent to the server along with the rest of the URL. This method is only appropriate when the form data set is limited in size and the actions of the form processing agent have no side-effects.
The alternative is to use the POST method, in which the form data set is sent in the body of an HTTP POST request. This method is significantly more flexible than the GET method. It supports larger form data sets and more types of data. The POST method should be used when the form data set is large, the form data set contains potentially sensitive data (which the GET method would expose in the URL), and/or the actions of the form processing agent generate one or more side-effects. As a Web author, you choose the method to be used when you create a form in your XHTML.
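The difference between the two methods can be sketched with a pair of small forms. Both action URLs and both control names (q and item) are made up for illustration. The comments describe roughly what the browser would send in each case.

```html
<!-- GET: the form data set is appended to the URL after a question mark.
     Typing "web forms" would request something like
     http://www.example.com/cgi-bin/search?q=web+forms -->
<form action="http://www.example.com/cgi-bin/search" method="get">
  <div>
    <input name="q" />
    <input type="submit" />
  </div>
</form>

<!-- POST: the form data set travels in the body of the HTTP request,
     so it does not appear in the URL. -->
<form action="http://www.example.com/cgi-bin/order" method="post">
  <div>
    <input name="item" />
    <input type="submit" />
  </div>
</form>
```

A search fits the GET method because repeating it changes nothing on the server, while placing an order has a side-effect and so calls for POST.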
A form is just one part of an XHTML document.
As such, it is created within the XHTML through the use of the <form> container
element and added to the page much like a table or a list would be. As with tables
and lists, this means that a single Web page can contain multiple forms, though
most commonly only one form per page is used. In the remainder of this document,
I have used several sample forms, each surrounded by a border to make it easier
to see where the form begins and ends.
Attributes of the <form> element
are used to tell the browser how to handle the form. The method attribute
is used to select the desired form submission method. Acceptable values
are “get” (the
default) and “post.”
The action attribute is used to specify the URL of the form
processing agent. At its simplest, this will be the (typically absolute)
URL of the executable file that contains the form processing agent. Since
most servers require all form processing agents to be stored in a special
folder and/or named with specific extensions, the server can tell from
the URL that an incoming request will require the execution of a form processing
agent.
Since we cannot publish our own processing agents on the pubpages server
(and this is not a course in programming), we will use a processing agent
named cgiemail, which has already been published for us on the
pubpages server. Its URL is http://pubpages.unh.edu/cgi-bin/cgiemail.
However, as we will soon see, we cannot use this URL without adding more
to it.
The contents of the <form> container element
represent the form. These contents can contain any XHTML you wish, including
paragraphs, line breaks, horizontal rules, lists, tables, and so on. Of
course, to be useful, they should also contain the XHTML necessary to create
one or more controls.
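A minimal sketch of a form container that mixes ordinary XHTML with controls might look like the following. The action URL, the heading text and the control name (visitor) are assumptions made only for this example.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <!-- Ordinary structural XHTML may appear inside the form -->
  <h2>Contact us</h2>
  <p>Please fill in the field below and press the button.</p>
  <hr />
  <div>
    <!-- The controls themselves -->
    Your name: <input name="visitor" />
    <input type="submit" />
  </div>
</form>
```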
Virtually all forms will contain at least one text input control and a submit button.
The most common type of text input control supports a single
line of input. To create this type of control, use the <input /> element.
To get a simple single line text input control, all you need to use is the name attribute.
The name attribute is used to assign a control name to the text
input field. Since this control name will be used by the form processing agent
to associate a value with this specific control, the value of the name attribute
should be unique within the form.
To control the width of the box representing
the text input control, use the size attribute. The value of
this attribute should be a number without any units. It specifies the width
of the box as a number of “average character widths.”
Note that this is only a rough measure of box size. The size of the box
is purely cosmetic; it does not limit what the user can enter into the
text input control. To specify the maximum number of characters the text
input control will accept, use the maxlength attribute. The
value of this attribute should be a number without any units. The browser
should not allow more than the specified number of characters to be entered
into the text input control. In general, you’ll
want the value of the size attribute to be less than or equal
to the value of the maxlength attribute.
If you want to assign
an initial value to the text input control, use the value attribute.
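Putting these attributes together, a single-line text input control could be written as in the sketch below. The control name (city) and the specific numbers and initial value are hypothetical.

```html
<!-- A box about 30 characters wide that accepts at most 50 characters
     and initially contains the text "Anytown" -->
<input name="city" size="30" maxlength="50" value="Anytown" />
```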
A form needs to have a submit button to ensure users have some way to
submit the form when they’re done
filling it out. To create a submit button, use the <input /> element
with a type attribute set equal to a value of “submit.”
Use the value attribute to set the text label to be displayed
within the button.
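For instance, a submit button with a custom label might be written as follows (the label text is just an example):

```html
<!-- The value attribute supplies the text shown on the button -->
<input type="submit" value="Send Comments" />
```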
Some controls, such as submit buttons, have built-in labels, but most do not. If you just put a bunch of controls on a page without letting the user know what to use them for, you won’t get very useful information submitted. For this reason, controls should always have an associated label of some sort.
Since all controls are created using replaced elements, one way to label them is to simply precede or follow them with text that acts as a label. However, this gives the browser no structural knowledge of the association between the control and its label.
A better approach is
to use the <label> container
element to associate a label with a control. There are two ways to use
the <label> element.
One way is to simply place the label text and the element that creates the control
being labeled inside the <label> container.
The other way is to use the id attribute of the control-creating
element to associate a unique identity with that control, and then use the same
value with the for attribute in the start tag of the <label> element.
In the second method, the <label> container only contains
the label itself. The following three examples are all equivalent, though the
last two give the browser more structural info:
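A minimal sketch of the three approaches, assuming a hypothetical control named email, might look like this:

```html
<!-- 1. Plain text next to the control: no structural association -->
E-mail address: <input name="email" />

<!-- 2. Label text and control both inside the label container -->
<label>E-mail address: <input name="email" /></label>

<!-- 3. Label tied to the control through matching for and id values -->
<label for="email">E-mail address:</label> <input name="email" id="email" />
```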
The advantage to the last line is that it allows the label and the control being labeled to appear in different parts of the XHTML. This can be particularly useful when a table is being used to arrange the controls and their labels.
Consider the following simple example. It includes a single-line text input control and a submit button. The submit button has its own built-in label, but the text input element needs some sort of label so the user knows what they are expected to do with it.
If you examine the source code for this sample form, you will notice that it is about as simple as a valid form can get. The text input control is the default size of 20 average character widths. It has no maximum placed on the number of characters it will accept (though browsers will typically apply some sort of maximum). And the submit button accepts the browser’s default label. Also notice that since the label, the text input control and the submit button are all part of the same division, the browser runs them together into a single line. This is not a problem for this simple form, but it could be a major problem for larger, more complex forms. Let’s examine a somewhat more complex example:
As forms go, it is still very simple. However, notice that the size of the text input control has been substantially increased, a label has been specified for the submit button and a reset button has been added to allow the user to clear the form (although it’s a bit silly to do so for such a simple form). You can’t see it just by looking, but if you try to type more than 50 characters into the text input field, the browser will stop at 50. Try typing the digits 1, 2,… 9, 0 five times in a row and observe what happens when you try to type another 1 at the end. Compare this to the behavior of the previous example.
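A form along these lines might be marked up as in the following sketch. The action URL, control name, label text and size value are assumptions made for illustration; only the 50-character limit comes from the description above.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    <label for="fullname">Your name:</label>
    <!-- Wider than the default box, and limited to 50 characters -->
    <input name="fullname" id="fullname" size="40" maxlength="50" />
    <!-- The value attribute sets the text shown on each button -->
    <input type="submit" value="Send" />
    <!-- A reset button returns every control to its initial value -->
    <input type="reset" value="Clear" />
  </div>
</form>
```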
If you haven’t already done so, fill in and submit at least one of the sample forms above. Notice that doing so is similar to following a link — when you do it your browser displays a new page (and you need to use your browser’s Back button to get back to this page). The new page that you see after submitting a form is generically referred to as a results page.
Nearly all form processing agents produce some sort of results page for the browser to display in response to a form submission. In some cases, such as submitting your query to a search engine like Google, the generation of the results page is all the processing agent needs to do to keep the user satisfied. However, the vast majority of processing agents also have more useful side effects. The range of possible side effects is virtually limitless, but they typically involve using the info submitted by the form for some purpose.
It’s quite common for the form processing agent to be written by a programmer specifically to handle the submission of a specific form in the context of a specific Web site. But we’re not programmers, and even if we were, the pubpages server would not allow us to run our own form processing agents. Since they are programs that are effectively run anonymously on the server host by people all over the Web, form processing agents present a large risk to the security of the server host. And for this reason, the administrator of the pubpages server prohibits us from publishing our own processing agents. Instead, we’re going to work with a form processing agent that’s already installed on our server — one named cgiemail.
Cgiemail takes the form data set that the browser sends it and uses the name-value pairs it contains to compose and send an e-mail message. After sending the e-mail message, cgiemail places the contents of that message into a simple XHTML results page and sends that back to the browser to display.
Cgiemail treats the e-mail that it is being asked to send like a “form letter,” and it uses the contents of the form to fill in the “blanks” in that “form letter” before sending it. The “form letter” is called a template file and the “blanks” it contains are called merge fields. In general, each merge field in the template file will get replaced by one of the values the user has provided in the form, resulting in a customized e-mail message. This complicates the use of cgiemail, because as the author of the form you must also write the template file and tell cgiemail where to find it.
Cgiemail must have a template file in order to function. The job of the template file is to provide a framework for the e-mail message that gets sent when the form is submitted.
Most of the template file will consist of constant text that will appear the same in each e-mail that gets sent.
Some portions, however, will consist of variable text represented by merge fields. The text that eventually appears in place of these merge fields will come from the current values that accompany the control names in the form data set.
A template file is just a simple text file with a valid e-mail structure. A valid e-mail structure requires one or more header lines, followed by a blank line, followed by one or more body lines. Header lines start with a keyword followed by a colon and end with a value for the keyword.
The To: keyword is used before the e-mail
address of the intended recipient. Generally, you would put your own e-mail
address after this keyword so that the e-mails that cgiemail generates
would be sent to you.
The From: keyword is used before the e-mail address
of the sender. If you want to use this keyword, you would need to provide
a text input field in your form into which the user could enter their e-mail
address and then use a merge field to insert the text they enter into the
template after this keyword. Keep in mind, however, that there’s no
guarantee the user will enter their address correctly or truthfully!
The
Subject: keyword is used before the subject for the message.
You can follow this keyword with a merge field if you wish to allow the
user to specify their own subject for the generated message. However, if
you use constant text, all the e-mails that your form generates will have
the same subject line, making them easier to manage as they arrive in your
mailbox.
The first blank line in the template file signals the end of the header. Any other lines following the first blank line are interpreted as part of the body and may contain anything you wish.
Merge fields may appear in both the header and the body. To create a merge field, simply surround a control name with square brackets. The control names used as merge fields must match the control names used in the XHTML for the form in both spelling and case. Remember that only successful controls will appear with a current value in the form data set (and be available for use by cgiemail).
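To make the relationship between a form and its template more concrete, here is a minimal sketch. The comment holds the text of a hypothetical template file, which would be saved as a separate plain-text file (with each line starting at the left margin) rather than inside the page. The form below it supplies the matching control names. The e-mail address, the control names and the action URL are all placeholders.

```html
<!--
  Contents of a hypothetical template file:

  To: you@example.com
  From: [email]
  Subject: Site feedback

  Comment from [visitor]:
  [comment]
-->
<form action="http://www.example.com/cgi-bin/cgiemail/template.txt" method="post">
  <div>
    <!-- Each control name matches a merge field in spelling and case -->
    <label>Name: <input name="visitor" /></label>
    <label>E-mail: <input name="email" /></label>
    <label>Comment: <input name="comment" size="40" /></label>
    <input type="submit" value="Send" />
  </div>
</form>
```

Note that the To: line uses constant text, while the From: line and the body use merge fields.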
Although you typically would not do so on a real site, I have provided a link to the template file for each example form on this page so you can examine it. If you haven’t already done so, go back up the page and examine the template file for the previous example forms (they both use the same template).
Do not
use a merge field to provide the value of the To: keyword in
the header. This could conceivably create an e-mail gateway that could
be abused by spammers and get you in trouble with the server administrator.
If you look closely at the source code for the two sample forms above,
you will notice that the action attribute of both has the following value:
This rather strange looking URL is actually a composite formed by combining two separate URLs into one. The first URL is that of the cgiemail processing agent:
And the second is the URL of the template file:
We drop the scheme and hostname from the template file’s URL (since
they’re the same as those in cgiemail’s URL) and add the remaining
portion to the end of the URL for cgiemail to get the appropriate value
for the form’s action attribute. Cgiemail can then use
this extra information at the end of its URL to find the appropriate template
file. This allows us all to use the same cgiemail program as many times
as we wish, since each time we use it we can specify a different template
file. In fact, several of the sample forms in this page use different template
files, as you will see.
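As a sketch of how the pieces combine, suppose the template file were published at the hypothetical URL http://pubpages.unh.edu/~username/template.txt (the ~username/template.txt portion is invented for this example). Dropping the scheme and hostname and appending the rest to cgiemail’s URL would give an action value like the one below.

```html
<!-- cgiemail URL:            http://pubpages.unh.edu/cgi-bin/cgiemail
     template URL (made up):  http://pubpages.unh.edu/~username/template.txt
     combined action value:   cgiemail URL + /~username/template.txt -->
<form action="http://pubpages.unh.edu/cgi-bin/cgiemail/~username/template.txt" method="post">
  <div>
    <input name="comment" />
    <input type="submit" />
  </div>
</form>
```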
Radio buttons are used to provide the user with sets of mutually exclusive choices. A set of choices is mutually exclusive when one, and only one, must be chosen.
For this reason, radio buttons always come in sets of two
or more, and the set is viewed as a single successful control by the form
processing agent. One and only one radio button within a set must be selected
at all times. Whichever one is selected when the form is submitted determines
the current value of the control. Radio buttons are created using the <input /> element,
setting its type attribute to a value of “radio.”
Each <input /> element generates one radio button.
There must therefore always be at least two <input /> elements
per set. To associate the <input /> elements that form
a set of radio buttons with one another, you must give them all the same
value for their name attributes.
Each radio button will require
its own label. If you wish to associate labels with each radio button using
the <label> element’s for attribute,
you must give each <input /> element a unique value for
its
id attribute.
So that the form processing agent can determine which
radio button was selected at the time of submission, you must give each <input /> element
in a set of radio buttons a different value for its value attribute.
Since there must always be one, and only one, radio button selected within
a given set, it’s important to indicate
which radio button in a set should be selected initially. To indicate a
radio button to be selected initially, set the
checked
attribute of that
<input /> element to a value of “checked.”
Here’s a sample form that utilizes a set of radio buttons:
Since it wouldn’t make sense for a user to rate the site as both poor and excellent at the same time, the rating is an excellent example of a set of mutually exclusive choices. Selecting one rules out all the others.
Be sure to always select one (and only one) radio button in each set initially. Otherwise, your set of radio buttons will initially appear in an invalid state (with none selected).
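A set of radio buttons for rating a site might be marked up roughly as follows. The action URL, the control name (rating), the id values and the middle choice are assumptions made for this sketch.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    How would you rate this site?
    <!-- The shared name groups the buttons; the values tell them apart -->
    <input type="radio" name="rating" id="poor" value="poor" />
    <label for="poor">Poor</label>
    <!-- Exactly one button in the set is checked initially -->
    <input type="radio" name="rating" id="good" value="good" checked="checked" />
    <label for="good">Good</label>
    <input type="radio" name="rating" id="excellent" value="excellent" />
    <label for="excellent">Excellent</label>
    <input type="submit" value="Send" />
  </div>
</form>
```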
If you have a set of choices where it is conceivable that more than one or none will be chosen by the user, then you have a set of possibly inclusive choices. In such cases, radio buttons are not the proper tool; checkboxes are.
Checkboxes are used to present the user with one or more possibly inclusive choices. Generally, a set of choices is considered possibly inclusive when selecting one does not necessarily rule out the others. In a set of possibly inclusive choices, the user is free to choose them in whatever combination makes sense at the time, even if it means selecting all or none.
Although they can appear singly, checkboxes most commonly appear in groups. However, they are grouped only logically and visually. There is no grouping enforced by the browser as there is with radio buttons.
Within a group of checkboxes, a user may check them in any combination necessary. However, only those checked at the time of submission will be considered successful controls and included in the form data set.
Checkboxes are created using the <input /> element,
setting its type attribute to a value of “checkbox.”
Each <input /> element generates one checkbox.
Generally, each checkbox is given a unique value for its name attribute.
As with radio buttons, each checkbox will require its own label.
The form processing agent will receive the value “on” as the
current value for any successful checkbox control. If you wish to have
a different value sent to the form processing agent, specify it as the value
of the <input /> element’s value attribute.
To indicate that a checkbox should be selected initially, set the checked attribute of that <input /> element to a value of “checked.”
Here’s a sample form that utilizes a set of checkboxes:
Since a user may feel that the site is fine, it’s conceivable that they will not feel that any parts need more work. On the other hand, another user may feel that all parts of the site need more work. Hopefully, most users will be somewhere in between and select one or two areas that need more work.
None of the checkboxes in this sample have been checked initially, since it doesn’t really make sense to do so. However, it would be trivial to check some or all of the checkboxes initially, as in the following sample:
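A group of checkboxes with one of them checked initially could be sketched as shown here. The action URL, the control names and the label text are hypothetical.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    Which parts of the site need more work?
    <!-- Checked initially; sends design=yes if still checked on submission -->
    <input type="checkbox" name="design" id="design" value="yes" checked="checked" />
    <label for="design">Design</label>
    <!-- Unchecked initially; included in the form data set only if checked -->
    <input type="checkbox" name="content" id="content" value="yes" />
    <label for="content">Content</label>
    <!-- No value attribute, so a checked box sends the value "on" -->
    <input type="checkbox" name="navigation" id="navigation" />
    <label for="navigation">Navigation</label>
    <input type="submit" value="Send" />
  </div>
</form>
```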
When the user enters text into a text input control, the text they type appears on the screen as they type it. And anyone sitting or standing in sight of the monitor has a chance to read it from the screen. For certain types of sensitive data, such as passwords, this can present problems.
For this reason, there is a variant of the text input control called a password input control, which obscures the text entered by displaying a generic character, such as an asterisk, in place of each character typed by the user. Password input controls are commonly used when the user is being asked to type something that should remain known only to them. Keep in mind, however, that the user cannot visually confirm what they have typed, so typos are a distinct possibility. For this reason, it is common (though not necessary) to use password input controls in pairs, asking the user to enter the same information twice so it can be compared by the processing agent. If the two values match, the processing agent can be reasonably sure the user typed the entry without typos, since the probability of the user repeating exactly the same typos twice in succession is very small.
The password input control is created with
the <input /> element, setting its type attribute
to a value of “password.” The name, size and maxlength attributes
work the same for the password input control as they do for the text input
control. Technically, the value attribute works the same way
as well, but the initial value in a password input control will be obscured,
making the value attribute essentially worthless in this case.
Be careful not to ascribe more security to the password input control than is warranted. It serves only to protect the contents of the field from the view of someone looking at the screen. There is absolutely no protection provided for the entry “behind the scenes.” The contents of the control are stored and transmitted as part of the form data set no differently than the contents of a text input control.
Experiment a bit with the following example to see how the password input controls work:
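In markup, a pair of password input controls might look like the sketch below. The action URL, control names (pw1 and pw2) and the size and maxlength values are assumptions. Comparing the two entries would be the job of the processing agent, not of the XHTML.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    <label for="pw1">Password:</label>
    <!-- Typed characters are obscured on screen only -->
    <input type="password" name="pw1" id="pw1" size="20" maxlength="20" />
    <label for="pw2">Repeat password:</label>
    <input type="password" name="pw2" id="pw2" size="20" maxlength="20" />
    <input type="submit" value="Send" />
  </div>
</form>
```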
The sample forms provided here have relatively minimal internal structures to make them as easy to understand as possible. However, more realistic forms can get much more complex, and they will, in turn, require far more complicated internal structures involving divisions, paragraphs, line breaks, headings, lists, tables and various other structural elements.
However, many forms have an internal structure that associates certain controls with one another in ways that cannot be semantically represented by the standard XHTML structural elements. Conveying the true semantics of this structure to browsers can significantly improve the accessibility of your forms.
The primary tool for accomplishing this is the <fieldset> container
element. This element may be used to group controls together, making
the association among them clearer to the browser, and by extension the user.
The <fieldset> element
may contain any portion of the form necessary, including both controls and
other structural elements. A single form may contain several fieldsets,
and they may be nested within one another.
Let’s add some fieldsets to the previous example to group the password input fields together and to group the portions that involve rating the site:
The border that a browser displays around a fieldset helps to visually group
its contents. However, in some cases it is helpful to add a textual label
to the fieldset as well. A <fieldset> element
may contain a <legend> container
element for this purpose. The contents of the <legend> element
will be used by the browser as a label for the fieldset, typically displayed
in the upper left corner of the fieldset frame. Let’s add some legends
to the fieldsets of the previous example:
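A form with two labeled fieldsets might be structured roughly as follows. The action URL, control names and legend text are placeholders chosen for this sketch.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <fieldset>
    <!-- The legend labels the group as a whole -->
    <legend>Password</legend>
    <label>Password: <input type="password" name="pw1" /></label>
    <label>Repeat password: <input type="password" name="pw2" /></label>
  </fieldset>
  <fieldset>
    <legend>Rate this site</legend>
    <label><input type="radio" name="rating" value="poor" /> Poor</label>
    <label><input type="radio" name="rating" value="excellent" checked="checked" /> Excellent</label>
  </fieldset>
  <div>
    <input type="submit" value="Send" />
  </div>
</form>
```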
Menu controls provide users with several options from which to choose. Menus can be used for both mutually exclusive and possibly inclusive sets of choices, and often require less room on the page than a set of radio buttons or checkboxes.
To create a menu, use a <select> container element.
Use a name attribute to identify the control.
By default, a menu control presents mutually exclusive choices, but the <select> element
can accept a multiple attribute with a value of “multiple” to
instruct the browser to accept multiple selections from the list of choices.
How the user indicates multiple selections is browser specific, but it
generally involves holding down the Shift or other keys while clicking
items in the menu. When this attribute is used, the menu is often displayed
as a (potentially scrolling) list of choices rather than a popup menu.
The <select> element
can also accept a size attribute with a numeric value to inform
the browser how many choices to display at once. When used with the multiple attribute,
the size attribute determines how many choices appear in the
list at one time. When used without the multiple attribute, a size value
greater than 1 generally results in a (potentially scrolling) list of choices
rather than a popup menu.
A <select> element
must contain one or more <option> container elements.
Each <option> element represents one of the choices within
the menu.
There are two ways to specify the option text that the user sees within
the displayed menu. The most common approach is to specify the text that
is to appear for the option as the contents of the <option> container.
It’s also possible to specify the text that the user sees as the value
of the <option> element’s label attribute.
If both approaches are combined, the value of the label attribute
should be used.
By default, the current value of the menu control will be
the contents of the <option> element
that is selected at the time of submission. To specify a different value,
assign it to the <option> element’s value attribute.
By default, the first option in the menu will be selected initially.
To specify a different option as the initially selected choice, set that <option> element’s
selected attribute to a value of “selected.”
The following sample form replaces the radio buttons and checkboxes of the previous form with menu controls:
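The two kinds of menu can be sketched as follows. The action URL, control names (rating and needswork) and option text are assumptions made for illustration.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    <label for="rating">Rating:</label>
    <!-- Mutually exclusive choices; "Good" is selected initially -->
    <select name="rating" id="rating">
      <option value="poor">Poor</option>
      <option value="good" selected="selected">Good</option>
      <option value="excellent">Excellent</option>
    </select>

    <label for="needswork">Needs more work:</label>
    <!-- Possibly inclusive choices, shown three at a time as a list.
         With no value attribute, each option submits its own contents. -->
    <select name="needswork" id="needswork" multiple="multiple" size="3">
      <option>Design</option>
      <option>Content</option>
      <option>Navigation</option>
    </select>

    <input type="submit" value="Send" />
  </div>
</form>
```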
Most text input can be handled using the single line text input controls
created with the <input /> element.
However, it is sometimes necessary to collect multi-line text input using the <textarea> container
element.
This element accepts a name attribute to assign it an identity.
It also requires both a rows and a cols attribute,
which take numeric values. The rows attribute specifies the height
of the text area in lines of text. The cols attribute specifies
the width of the text area in average character widths. These attributes
only determine the size of the text area, not the amount of input it will
accept. Browsers should provide horizontal and/or vertical scrolling capabilities
as necessary.
The contents of the <textarea> element, if
any, will be used as the initial value of the text area. If you do not
want there to be anything in the text area by default, the </textarea> end
tag should immediately follow the <textarea> start tag.
Let’s add a text area control to the previous sample form to allow the user to express their comments:
Experiment a bit with the text area control in the sample form above. Try typing more on a line than will fit. Try typing more lines than will fit. Observe how the browser handles these events.
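A text area of this kind might be marked up as in the sketch below. The action URL, the control name (comments) and the rows and cols values are hypothetical.

```html
<form action="http://www.example.com/cgi-bin/agent" method="post">
  <div>
    <label for="comments">Comments:</label>
    <!-- 5 lines tall, about 40 characters wide, initially empty because
         the end tag immediately follows the start tag -->
    <textarea name="comments" id="comments" rows="5" cols="40"></textarea>
    <input type="submit" value="Send" />
  </div>
</form>
```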