Basic CGI Programming
Written by Valerie Mates, May 18, 1999

CGI programs generate web pages on the fly. When you type text in boxes on a web page and press a button to submit the data, you are running a CGI program. This page describes how to write a CGI program.

The Basics
At its most basic, a CGI program is one that reads an environment variable and writes out ordinary HTML. For example, here is a simple shell script CGI program:
#!/bin/sh
echo "Content-type: text/html"
echo ""
echo "Hello world!"
Ideally, the HTML from that script would include tags like <html> and
<head> and <body>, but browsers will know what to do with it
even if those are missing.
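
The shell script above writes HTML but does not read anything. As a rough illustration of the "reads an environment variable" half of the description, here is a minimal Perl sketch. The name hello_page is made up for this example; REQUEST_METHOD is one of the standard environment variables a web server sets before running a CGI program.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Build the whole CGI response (header, blank line, HTML body) as one string.
# hello_page is a hypothetical name used only in this sketch.
sub hello_page {
    my ($env) = @_;
    # The web server passes request details in environment variables.
    my $method = defined $env->{REQUEST_METHOD} ? $env->{REQUEST_METHOD} : "unknown";
    return "Content-type: text/html\n\n"
         . "<html><head><title>Hello</title></head>\n"
         . "<body>Hello world! Request method: $method</body></html>\n";
}

# %ENV is Perl's view of the environment the web server set up.
print hello_page(\%ENV);
```

Run from a command line instead of a web server, REQUEST_METHOD is normally unset, so the page reports "unknown".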

What Programming Language?
A CGI program can be written in any programming language. This page talks about CGI programming in Perl.

Where do I put the program?
CGI programs need to go in a "cgi-bin" directory. This is a special directory of programs that can be run by the web server. Unfortunately there is no standard location for a cgi-bin directory. To find yours, check your web server configuration or ask your system administrator.

Forms
You can send data to a CGI program from a form, either on a web page or from another CGI program. The HTML for a simple form might look like this:
<form action="/cgi-bin/foo.cgi" method=post>
Greeting: <input type="text" name="greeting" size=10 maxlength=20><br>
Your Name: <input type="text" name="your_name" size=20 maxlength=30><br>
<input type="submit" name="submit" value="Send">
</form>
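In a web browser window, that HTML will produce this:
[Rendered form: a text box labeled "Greeting", a text box labeled "Your Name", and a "Send" button.]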

cgi-lib.pl
For CGI programming, I use a library called cgi-lib.pl. It is available from cgi-lib.stanford.edu/cgi-lib/

To use cgi-lib in a Perl program, put it in the same directory as your program (or in your Perl search path) and include the line:
require("cgi-lib.pl");
cgi-lib has several very useful routines. One is called ReadParse.

Reading Input From A Form
If you include this line in your Perl program:
&ReadParse;
then all the variables on the form are put into a hash named %in. In your
program you can refer to the variables like this: $in{'greeting'} and
$in{'your_name'}. In general, the form is $in{'name_of_variable'}.
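
To give a feel for the work ReadParse saves you, here is a hedged sketch of decoding URL-encoded form data into a hash by hand. This is not cgi-lib's actual source. The name parse_query is hypothetical, and the sketch simply keeps the last value when a field name repeats. A real program would take the string from the QUERY_STRING environment variable (for GET) or from standard input (for POST) rather than from a literal.

```perl
use strict;
use warnings;

# Decode form data such as "a=1&b=two+words" into a hash.
# parse_query is a hypothetical helper, not part of cgi-lib.pl.
sub parse_query {
    my ($query) = @_;
    my %in;
    foreach my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = "" unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                # "+" stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX stands for one byte
        }
        $in{$name} = $value;
    }
    return %in;
}

my %in = parse_query("greeting=Hello+there%21&your_name=Valerie");
print "$in{'your_name'} says: $in{'greeting'}\n";   # prints "Valerie says: Hello there!"
```

In practice it is simpler and safer to let the library do this; the sketch only shows what "putting the variables into %in" involves.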

HTML Headers
The first thing a CGI program must do, before displaying any text, is to tell the browser what kind of content the program will be sending, in this case HTML text. One way to do this is to print the string Content-type: text/html followed by two newlines. The other option is to use a cgi-lib function called PrintHeader, which you do by including this line in your program:
print &PrintHeader;
If you leave out the HTML headers, you will get a web server error.

Using the Variables
Here is a sample Perl program that uses the variables from the form:
#!/usr/local/bin/perl
require("cgi-lib.pl");
&ReadParse;
print &PrintHeader;
print <<EOF;
<html>
<head>
<title>A Greeting From $in{'your_name'}</title>
</head>
<body bgcolor="#FFFFFF">
$in{'your_name'} sends you this greeting:<br>
<blockquote>$in{'greeting'}</blockquote>
</body>
</html>
EOF
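When someone runs that program, its output will look something like this:
[Rendered page: the sender's name followed by "sends you this greeting:", with the greeting shown in an indented block below.]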

Handling Errors
Normally in a Perl program, if an error condition occurs, you would use the Perl command die to display an error message and exit. However, this does not work well in a CGI program: the error message will be hidden away in a web server error log where the user cannot see it. The user will see only an error message that says something like "500 - Server error".

Instead of die, a CGI program should display an intelligent error message and then call exit. I wrote a routine called "crash" that I use. Here is the code for it:
#
# Subroutine to exit gracefully from errors:
#
sub crash{
print $_[0];
print "</td></tr></table></td></tr></table> </body></html>";
exit;
}
The end-of-table code in the crash routine is useful because if you print a table without a </table> tag, some browsers won't show anything in the table. The extra tags make sure that even if the crash occurs while you are in the middle of writing out a table, the error message will still be readable.

Here is an example of a program that calls crash:
#!/usr/local/bin/perl
require("cgi-lib.pl");
&ReadParse;
print &PrintHeader;
# If greeting is blank, display error message and exit:
if ($in{'greeting'} eq "") {
crash("Please enter a greeting. Press your browser's
Back button to enter it.");
}
print <<EOF;
<html>
<head>
<title>A Greeting From $in{'your_name'}</title>
</head>
<body bgcolor="#FFFFFF">
$in{'your_name'} sends you this greeting:<br>
<blockquote>$in{'greeting'}</blockquote>
</body>
</html>
EOF
#
# Subroutine to exit gracefully from errors:
#
sub crash{
print $_[0];
print "</td></tr></table></td></tr></table> </body></html>";
exit;
}

Debugging Tips
Some techniques that are useful for debugging CGI programs:

Security
Question: What is wrong with this line of code?
system("log_to_database $in{'user_data'}");
Answer: This program runs a Unix command with user-supplied
data. That is, it runs the command:
log_to_database something
where something could be anything at all. Suppose the user had
entered this text: ; rm /. Then the Unix command that would be
run is log_to_database ; rm /. That is, by adding a semicolon,
the user terminated the log_to_database command and started a second command
on the same command line. That second command in this case is (a mild
version of) the command to delete all the files on the system.
Since you don't want users running random commands on your system, be very careful what you do with user-supplied data. Avoid passing user-supplied data to system commands. If you must do so, first filter out all possible bad characters from the data, or, better yet, to avoid missing any special characters you haven't thought of, filter out all characters except the ones that are acceptable. For example, the command:
$in{'user_data'} =~ s/[^A-Z0-9]//gio;
will remove all non-alphanumeric characters from the variable
$in{'user_data'}. If the user enters ; rm / and you run that
substitution, the user's entry will be pared down to only its
alphanumeric characters, which in this case are the letters "rm" without
the dangerous semicolon. Now you can safely run
the command as log_to_database rm, which will merely log the
letters "rm" to the database, which is vastly preferable to deleting
all the files on your system!
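
As a sketch of how the filtering step might be packaged, the substitution above can be wrapped in a small helper. The name clean_alnum is hypothetical. The commented-out call also shows the list form of Perl's system: when system is given the program name and its arguments as separate list elements, Perl runs the program directly instead of handing a command line to a shell, so a semicolon in the data is not treated as a command separator. log_to_database is only the example command from the text, so that call is left commented out.

```perl
use strict;
use warnings;

# Keep only letters and digits, like the substitution shown above.
# clean_alnum is a hypothetical helper name.
sub clean_alnum {
    my ($text) = @_;
    $text = "" unless defined $text;
    $text =~ s/[^A-Za-z0-9]//g;
    return $text;
}

my $user_data = clean_alnum("; rm /");
print "Filtered value: $user_data\n";    # prints "Filtered value: rm"

# List form of system: program name and arguments are separate elements,
# so no shell parses the command line.
# system("log_to_database", $user_data) == 0
#     or print "log_to_database failed\n";
```

Filtering and the list form complement each other: the filter limits what the data can contain, and the list form keeps the shell from interpreting whatever remains.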
Be careful too about filenames. If the user enters a filename, beware of allowing a carefully placed .. or other special characters to overwrite a file in a directory other than the one where you intended the data to be stored.

That's it!
That's it! Have fun writing CGI programs!