Tcl Applications In The Unix Environment

Creating a Unix Script

There are two kinds of executable Unix programs: binaries and scripts. Executable programs are recognized by a magic number: if the first few bytes of the file match one of a set of patterns, the file is considered (potentially) executable and if the file has execute permission for a given user, that user may run the program. If a file has execute permission, but doesn't have the right magic number, the kernel will generate an error.

On our Sun Sparc machines, the usual magic number for executables is 0x81 0x03, though there are variations.

The magic number concept is used in Unix to type or identify more than just executable programs. For example, the two byte magic number 0x1f 0x8b identifies a particular species of compressed file (GNU gzip files).

The magic number is a binary bit pattern, but it may happen to correspond to printable ASCII characters. This allows magic numbers to be used in text files. For example, the magic number for PostScript files is 0x25 0x21, which is %!, and the magic number for executable script files is #!.
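
As a rough illustration of the idea, the sketch below reads the first two bytes of a file and compares them with the magic numbers mentioned above. The proc name magicKind and the scratch file name are hypothetical, and only the three patterns named in this section are recognized; the kernel's real tables are larger.

```tcl
# Hypothetical helper: classify a file by its first two bytes.
proc magicKind {path} {
    set f [open $path r]
    fconfigure $f -translation binary   ;# raw bytes, no newline translation
    set bytes [read $f 2]
    close $f
    # Convert the bytes to hex digits, e.g. "#!" becomes 2321.
    binary scan $bytes H* hex
    switch -exact -- $hex {
        2321    { return script }
        2521    { return postscript }
        1f8b    { return gzip }
        default { return unknown }
    }
}

# Demonstration on a scratch file (the file name is an assumption).
set demo [file join [pwd] magic-demo.tcl]
set f [open $demo w]
puts $f "#!/local/bin/megatcl -f"
close $f
puts [magicKind $demo]
file delete $demo
```

Comparing hex digits rather than raw characters keeps the patterns in the same notation used in the text (0x25 0x21, 0x1f 0x8b).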

Number Bang

Unix allows executable script files to be written in any language. But when the kernel attempts to run the program, how is it to know what interpreter to use to execute it?

When the kernel tries to execute a file, it checks the magic number. If it is #!, the kernel reads more bytes, up to a newline character. All these bytes (not including the #! itself) are taken to be the pathname of an interpreter for the rest of the file, optionally including a single argument. If the interpreter itself exists and is executable, the kernel fires it up and gives it the name of the original script file as a command line argument.

For example, a Tcl script (say, /tmp/foo.tcl) can be made executable (for our system) by making the first line be:

#!/local/bin/megatcl -f
causing the kernel to execute:
/local/bin/megatcl -f /tmp/foo.tcl

Note that the first line the interpreter sees is the #! line itself! This is why Tcl (like many interpreted languages developed under Unix) uses # as the comment character: the interpreter simply skips that line as a comment.
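
The kernel's handling of the #! line can be imitated in Tcl itself. This is only a sketch of the rule described above (an interpreter pathname, at most one argument, then the script name); the proc name shebangCommand is hypothetical, and real kernels differ in details such as line length limits.

```tcl
# Hypothetical helper: mimic how the kernel turns a #! line into a command.
# Returns a list: interpreter, the optional single argument, script path.
proc shebangCommand {firstLine scriptPath} {
    # Not a script unless the first two bytes are the magic number "#!".
    if {[string range $firstLine 0 1] ne "#!"} {
        error "no #! magic number"
    }
    # Everything after "#!" is the interpreter path plus at most one argument.
    set rest [string trim [string range $firstLine 2 end]]
    # Split at the first run of whitespace only; the remainder is ONE argument.
    if {[regexp {^(\S+)\s+(.*)$} $rest -> interpreter argument]} {
        return [list $interpreter $argument $scriptPath]
    }
    return [list $rest $scriptPath]
}

puts [shebangCommand "#!/local/bin/megatcl -f" /tmp/foo.tcl]
```

With the example line from this section, the result is the same command line the kernel builds: /local/bin/megatcl -f /tmp/foo.tcl.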

Command Line Arguments

Unix programs typically take command line arguments in addition to being able to read standard input and write standard output. Sometimes the command line arguments are used to read file names and options, as in the cat and ls commands:
cat /tmp/foo.tcl /etc/motd
ls -l /etc/passwd
and sometimes the command line arguments are used for completely arbitrary things.

Command line arguments are parameters to Unix programs, much the same way that Tcl procs take arguments. In fact, Tcl's command syntax was modelled on the typical Unix shell syntax for passing command line arguments to programs.

You can read command line arguments that are passed to your Tcl program. There are two special global variables:

argv0
A string variable, the name under which your Tcl program was invoked. May be an absolute or a relative pathname.
argv
A list, one element for each command line argument passed to your program.
(Note that in a C program, the C char * array argv contains argv0 as the first element of the array, not in a separate variable.)

Exiting and Exit Status

Unix programs return an 8-bit exit status to the OS when they terminate; this status can be read by the program's parent. Conventionally, an exit status of 0 indicates a successful termination, while any other number indicates failure. The most common value used to signal failure is 1, but some programs use a range of small numbers to indicate why or how they failed.
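
A Tcl program acting as the parent can read a child's exit status too. The sketch below assumes the standard exec command, which raises an error when the child exits with a nonzero status and records the status in the global errorCode variable; the proc name exitStatusOf is hypothetical.

```tcl
# Hypothetical helper: run a command and return its exit status.
proc exitStatusOf {args} {
    global errorCode
    if {[catch {eval exec $args} output]} {
        # On abnormal exit, errorCode is the list: CHILDSTATUS pid status
        if {[lindex $errorCode 0] eq "CHILDSTATUS"} {
            return [lindex $errorCode 2]
        }
        # Some other failure (e.g. command not found): pass it on.
        error $output
    }
    return 0
}

puts [exitStatusOf sh -c {exit 0}]
puts [exitStatusOf sh -c {exit 3}]
```

The two calls print 0 and 3 respectively.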

The exit Command

The Tcl exit command terminates execution of your Tcl program. It takes an optional integer argument as the exit status. If no argument is provided, the exit status will be 0.
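
Here is a small sketch of a script that follows the exit status convention: it fails with status 1 if any file named on the command line is missing. The proc name missingFiles and the messages are made up for illustration.

```tcl
# Hypothetical helper: return the subset of paths that do not exist.
proc missingFiles {paths} {
    set missing {}
    foreach path $paths {
        if {![file exists $path]} {
            lappend missing $path
        }
    }
    return $missing
}

# Check the files named on the command line.
set missing [missingFiles $argv]
if {[llength $missing] > 0} {
    puts stderr "missing: [join $missing {, }]"
    exit 1    ;# nonzero status signals failure to the parent
}
puts "all files present"
# No exit call is needed on success here: tclsh reports status 0
# when a script runs to completion.
```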

Environment Variables

Every Unix process has an environment, a set of pairs of strings. These pairs are usually interpreted as variables and values, and are called environment variables. A process inherits its parent's environment, and so environment variables can be used to pass information to a process.

There is no way for a child to modify its parent's environment.
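
The one-way nature of inheritance can be seen from Tcl. In this sketch (which assumes a Bourne-compatible sh is on the search path) the parent sets FOO, and the child shell sees it, changes its own copy, and exits; the parent's value is untouched.

```tcl
# Set a variable in this process's environment; children inherit it.
set env(FOO) Armageddon

# The child prints the inherited value, changes its own copy, prints again.
set childView [exec sh -c {echo "$FOO"; FOO=changed; export FOO; echo "$FOO"}]
puts $childView

# The parent's environment is unaffected by what the child did.
puts $env(FOO)
```

The child prints Armageddon and then changed, while the parent still prints Armageddon.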

Your shell will allow you to set environment variables, with the exact syntax differing from shell to shell. Here's an example of setting the environment variable FOO to the value Armageddon, in several different shells:

sh
FOO=Armageddon; export FOO
bash, zsh, ksh
export FOO=Armageddon (these shells also accept the same syntax as sh)
csh, tcsh
setenv FOO Armageddon
rc, es
FOO=Armageddon

Accessing The Environment From Tcl

Most Unix programming languages provide access to the environment via a function that takes the name of a variable and returns its value. Tcl uses a more natural (for Tcl) approach: the global array env contains an entry for each environment variable. This not only allows access to any named environment variable, but allows the use of foreach and array names env to iterate over the entire environment.
set env(HOME)
=> /home/keith
set env(NETHACKOPTIONS)
=> !pickup,rest_on_space,time,fruit:durian
set env(PRINTER) ps
=> ps
array names env
=> ...
parray env
=> ...
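
Iterating over the whole environment might look like the following sketch. The proc names envLines and envOr are hypothetical; envOr exists because reading an env entry that is not set is an error, just as for any other array.

```tcl
# Collect "NAME=value" lines for every environment variable, sorted by name.
proc envLines {} {
    global env
    set lines {}
    foreach name [lsort [array names env]] {
        lappend lines "$name=$env($name)"
    }
    return $lines
}

# Look up a variable, falling back to a default if it is not set.
proc envOr {name default} {
    global env
    if {[info exists env($name)]} {
        return $env($name)
    }
    return $default
}

puts [join [envLines] \n]
puts "printer: [envOr PRINTER {(not set)}]"
```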

Tcl Interpreter Options

The Tcl interpreter itself takes a number of command line options. These vary depending on which interpreter you're using. Here are some of the options that megatcl accepts:
-f filename
Run the program in filename
-n
No procedure call stack dump; errors are shortened to a single line. Useful in the #! line of debugged scripts to avoid scaring your users...
-c Tcl command
Execute Tcl command and terminate.
--
No more options; useful to allow you to pass command line arguments to your program which start with a hyphen.

The cmdtrace Command

Extended Tcl supports tracing of program execution as a debugging technique. To turn tracing on, give the command:
cmdtrace on
To turn tracing off, give the command:
cmdtrace off

See the Extended Tcl man page for more details (such as how to control the amount of trace output, and how to redirect the output to a file).

The time Command

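time script ?iterations?
Runs script the given number of iterations (once if omitted) and returns the average elapsed time per iteration, in microseconds.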
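
A minimal sketch of using time: the loop being measured is arbitrary, and the numbers you see will depend on your machine. The result is a string of the form "N microseconds per iteration", so the number can be picked out with lindex.

```tcl
# Time 1000 runs of a small loop.
set result [time {
    set sum 0
    for {set i 0} {$i < 100} {incr i} {
        incr sum $i
    }
} 1000]
puts $result

# The first word of the result is the number itself.
puts "average: [lindex $result 0] microseconds"
```

Asking for many iterations smooths out the noise you would get from timing a fast script only once.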

Profiling

Extended Tcl supports profiling of your application, to find hot spots and candidates for optimization. See the Extended Tcl man page for details. Here is a handy interface to the profiling commands that takes a script as an argument and generates a profile; you can actually wrap your entire main program in this command.
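proc profile-script {cmd {sort cpu} {levels 1}} {
    # Start collecting profile data.
    profile on
    # Run the script in the caller's context.
    uplevel $cmd
    # Stop profiling and save the collected data in the local array pro.
    profile off pro
    # Print the report, sorted by $sort, with $levels levels of call stack.
    profrep pro $sort $levels
}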

Here is an example profile report from a run of a real Tcl program (my program check-urls). Note that the proc null is called far more often than any other proc, and may be a good candidate for optimization (or not). Some of the procs with the highest CPU time may also be good candidates.

---------------------------------------------------------
Procedure Call Stack          Calls  Real Time   CPU Time
---------------------------------------------------------
<global>                          2      88982       6904
eachFile                          1      88907       6852
urlSource                         1      88885       6818
htmlFile                          1      88883       6818
checkUrl                         26      87245       5219
checkFtp                          3      72685       2728
verify-ftp                        3      70835       2213
checkWebPage                     22      12438       2063
netgets                          99      78384       1959
checkUrlProtoWise                22      12320       1959
ftp-response                     24      69640       1784
null                            467        856        858
log-url                          47        777        774
timeout                          31        551        480
hostname                          3       1751        446
verbose                          86        504        428
ping                              3        732        257
unwindProtect                    32        231        192
checkNews                         1       1722        120
verify-nntp                       1       1654         86
nntp-response                     3       1500         35
getopt                            3         55         35
vardefault                       14         30         17
errdefault                        5         20         17
envdefault                        5         18         17

Structuring Your Application

Any large Tcl application should be broken up into a number of separate files, comprising a main program in one file and a number of library files, containing various procs used in the program. Each library file should contain a small number of related procs. I recommend naming library files with an extension of .lib, rather than .tcl. You should use .tcl as the extension for your main program. Your main program can easily make use of the routines in your library files via autoloading (see below).

Installation and make

You should always maintain a separation of source code and executable, even for scripting languages like Tcl where the source code is executable. The standard Unix make application is designed to automate the construction of executables from source, and their installation.

Autoloading

The Tcl interpreter can autoload procedures on demand from library files. To do this, you need to put your library files in a directory somewhere, index them, and then add this directory to your application's auto_path variable.

Indexing for Autoloading

auto_mkindex directory glob pattern
To index your library files, use the standard Tcl procedure auto_mkindex. It indexes all the files in directory that match glob pattern and creates an index file named tclIndex in directory. Example:
auto_mkindex . *.lib

Extending Your Autoload Path

The global variable auto_path is a list of directories that contain Tcl library files and indices. After creating your index, you need to add the directory that contains it to the auto_path list; here are two examples:
lappend auto_path .
set auto_path [linsert $auto_path 0 ~/lib/tcl]
Tcl searches for autoload procs in each directory in the auto_path in order, using the first one it finds.
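
Putting the pieces together, here is a sketch of the whole autoloading cycle in one script. The scratch directory autoload-demo, the file greet.lib, and the proc greet are all made up for the demonstration; in a real application the library files and the index would already be installed.

```tcl
# Assumption: a scratch directory under the current directory.
set dir [file join [pwd] autoload-demo]
file mkdir $dir

# A hypothetical library file defining one proc.
set f [open [file join $dir greet.lib] w]
puts $f {proc greet {who} { return "Hello, $who" }}
close $f

# Build the index (written to $dir/tclIndex), then extend the search path.
auto_mkindex $dir *.lib
lappend auto_path $dir

# greet is not defined yet; the first call triggers the autoload.
puts "defined before first call: [llength [info commands greet]]"
puts [greet world]

# Clean up the scratch directory.
file delete -force $dir
```

Note that the main program never sources greet.lib explicitly; the first call to greet finds it through the index.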

Keith Waclena
The University of Chicago Library
This page last updated: Thu Aug 18 14:34:01 CDT 1994
This page was generated from Extended HTML by xhtml.