WinCGI Package 2.0

WinCGI provides access to the submitted CGI data on a web server supporting the
Windows CGI (WinCGI) specification. The available routines are:

CGIRead   filename
  Process the WinCGI ini file, setting up all submitted variables.
  When the TCL script is started, the first argument is the WinCGI ini file.

CGIValue  key
  Return the value of a submitted data item. Key is the submitted data name.

CGIAccept key
  Return yes/no settings for what the client can support. Key is the MIME type.

CGISystem key
  Return system information, such as the client browser identifier. Key is
  the argument name.

CGIDebug  local | remote
  A mechanism to display all of the submitted information.
    local - display on the standard output of TCL
    remote - display back to the client browser. No HTML tags are present in
             this display, so the programmer must ensure correct HTML wrappers
             are included.

CGIWrite  string
  Place the string into the data stream going back to the client browser.

---------------
Author - Evan Rempel erempel@UVic.CA
         I will attempt to provide support for this package, and to extend its
         capabilities when requested and appropriate. Feedback is welcome.

----------------------------------------------------------------------------------
The following sections are defined in the Windows CGI 1.3 specification.
The complete specification is provided for reference at the end of this document.


          [CGI]  - all kinds of system and server settings
       [Accept]  - lists the mime types that the client browser can receive
       [System]  - special values specific to Windows CGI 1.3
[Extra Headers]  - headers that the browser sent that are not part of http 1.1 spec
 [Form Literal]  - variables and values submitted by client (small)
[Form External]  - variables and values submitted by client (large < 64Kbytes)
    [Form File]  - file uploads (MIME) from client
    [Form Huge]  - variables and values submitted by client ( > 64Kbytes)

The WinCGI Package rearranges these a little, as described below, and leaves out
the sections:

[Extra Headers]  - if they aren't part of the spec, they can't really be counted on.
    [Form File]  - this package is not for file uploads
    [Form Huge]  - if >64K, probably a file upload, so not handled by this package.

       [System]  - only needed during WinCGI parsing (it states where the data
                   files are, etc.). This section is processed, but only the
                   GMT time zone info is maintained.

-------------------
WinCGI Package specification

This package parses a Windows CGI file (.ini) and creates three (3) sections.

SYSTEM - all settings for the current server/client/script interaction.
ACCEPT - all of the MIME types that the client will accept.
VALUES - all of the variables and values that were submitted by the client.
         This includes the "Form Literal", "Form External" and "Query String"
         from the Windows CGI Specification.


Section SYSTEM
--------------
Request Protocol
     The name and revision of the information protocol this request came in with.
     Format: protocol/revision.
     Example: "HTTP/1.0".

Request Method
     The method with which the request was made. For HTTP, this is "GET", "HEAD",
     "POST", etc.

Executable Path
     The logical path to the CGI program executable, as needed for self-referencing 
     URLs. This may vary if the server supports multi-homing with separate logical 
     path spaces. The server must provide the physical path equivalent using the 
     logical to physical mapping for the identity on which the current request was 
     received.

Document Root
     The physical path to the logical root "/". This may vary if the server supports 
     multi-homing with separate logical path spaces. The server must provide 
     the physical path to the logical root for the identity on which the 
     current request was received. 

Logical Path
     A request may specify a path to a resource needed to complete that request. 
     This path may be in a logical pathname space. This item contains the pathname 
     exactly as received by the server, without logical-to-physical translation.

Physical Path
     If the request contained logical path information, the server provides 
     the path in physical form, in the native object (e.g., file) access syntax 
     of the operating system. This may vary if the server supports multi-homing
     with separate logical path spaces. The server must provide the physical 
     path equivalent using the logical to physical mapping for the identity 
     on which the current request was received.

Query String
     The information which follows the ? in the URL that generated the request 
     is the "query" information. The server furnishes this to the back end 
     whenever it is present on the request URL, without any decoding or 
     translation. The WinCGI Package then URL-decodes the string, so the 
     value it returns is fully decoded.

Request Range
     Byte-range specification received with request (if any). See the 
     current Internet Draft (or RFC) describing the byte-range extension 
     to HTTP for more information. The server must support CGI program 
     participation in byte-ranging to be compliant with this Specification.

Referer
     The URL of the document that contained the link pointing to this 
     CGI program. Note that in some browsers the implementation of this 
     is broken and cannot be relied on.

From
     The e-mail address of the browser user. Note that this is in the HTTP 
     specification but is not implemented in some browsers due to privacy 
     concerns.

User Agent
     A string description of the client (browser) software. Not generated 
     by all browsers.

Content Type
     For requests which have attached data this is the MIME content type of 
     that data. Format: type/subtype.

Content Length
     For requests which have attached data, this is the length of the content 
     in bytes.

Content File
     For requests which have attached data, the server makes the data 
     available to the CGI program by putting it into this file. The value 
     of this item is the complete pathname of that file.

Server Software
     The name and version of the information server software answering the 
     request (and running the CGI program). Format: name/version.

Server Name
     The network host name or alias of the server, as needed for self-referencing 
     URLs. This (in combination with the ServerPort) could be used to manufacture 
     a full URL to the server, for URL fixups. This may vary if the server 
     supports multi-homing. The value of this item must be the host name on which 
     the current request was received.

Server Port
     The network port number on which the server is listening. This is also needed 
     for self-referencing URLs.

Server Admin
     The e-mail address of the server's administrator. This is used in error 
     messages, and might be used to send MAPI mail to the administrator, or to 
     form "mailto:" URLs in generated documents.

CGI Version
     The revision of the CGI specification to which this server complies. 
     Format: CGI/revision. For this version, "CGI/1.2 (Win)". 

Remote Host
     The network host name of the client (requestor) system, if available. 
     This item is used for logging.

Remote Address
     The network (IP) address of the client (requestor) system. This item 
     is used for logging if the host name is not available. 

Authentication Method
     The protocol-specific authentication method specified in the request. 
     If present, this is normally Basic. The server must provide this 
     whether or not it was used by the server for authentication.

Authentication Realm
     The method-specific authentication realm specified in the request. 
     If present in the request, the server must provide this whether or 
     not it was used by the server for authentication.

Authenticated Username
     The username (in the indicated realm) that the client used to attempt 
     authentication, as specified in the request. If present in the 
     request, the server must provide this whether or not it was used by 
     the server for authentication.

Authenticated Password
     The password that the client used to attempt authentication, as 
     specified in the request. If present in the request, the server 
     must provide this whether or not it was used by the server for 
     authentication.

GMT Offset
     The number of seconds to be added to GMT time to reach local time. 
     For Pacific Standard Time, this number is -28,800. Useful for 
     computing GMT times.

Unique
     A file name without an extension or path that is unique within the 
     scope of all outstanding CGI requests.

Output File
     The tcl channel ID of the open output file. This channel must not 
     be closed until the end of processing; it is otherwise closed automatically
     when the tcl shell terminates. This is the channel that CGIWrite writes to. If the 
     parent script needs to write to the outgoing channel ID, it should write 
     to this channel ID.
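
The GMT Offset entry above can be applied as follows. This is a minimal sketch in
Python, used here only for illustration; the function name local_to_gmt is
hypothetical and is not part of the WinCGI Package. Because the offset is the number
of seconds added to GMT to reach local time, GMT is recovered by subtracting it.

```python
from datetime import datetime, timedelta


def local_to_gmt(local_time, gmt_offset_seconds):
    """Convert a naive local timestamp to GMT using the [System] GMT Offset.

    The offset is defined as: local = GMT + offset, so GMT = local - offset.
    """
    return local_time - timedelta(seconds=gmt_offset_seconds)


# Pacific Standard Time has an offset of -28,800 seconds (8 hours behind GMT).
noon_pacific = datetime(1996, 2, 18, 12, 0, 0)
print(local_to_gmt(noon_pacific, -28800))
```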


Section ACCEPT
--------------
This section contains the client's acceptable data types found in the request 
header as 

Accept: type/subtype {parameters}

If the parameters (e.g., "q=0.100") are present, they are passed as the value 
of the item. If there are no parameters, the value is "Yes".
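
The mapping from an Accept header to key/value items can be sketched like this.
The function name accept_items and the sample header are assumptions made for
illustration; the real mapping is done by the web server before the package sees it.

```python
def accept_items(header_value):
    """Turn the value of an Accept header into {type/subtype: value} items.

    The value is the parameter text when parameters are present, else "Yes".
    """
    items = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        # Everything after the first ';' is treated as the parameter text.
        media_type, _, params = part.partition(";")
        items[media_type.strip()] = params.strip() or "Yes"
    return items


print(accept_items("text/html, image/jpeg; q=0.100"))
```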


Section VALUE
-------------
If the request is an HTTP POST or GET from an HTTP form (with content type of 
application/x-www-form-urlencoded or multipart/form-data), the server will 
decode the form data and put it into the VALUE section.

Both the variable names and the submitted data are URL Decoded before being 
placed into this section. All TCL escape sequences are generated so that 
the TCL operations work with the intended data.

If the form contains any SELECT MULTIPLE elements, there will be multiple 
occurrences of the same key. In this case, the server generates a normal 
"key=value" pair for the first occurrence, and it appends a sequence 
number to subsequent occurrences in the form name_X where X is an 
integer. The WinCGI Package decodes all of these and generates a TCL list 
as the value for the variable. It should be noted that if only one 
selection is made, the submitted field only occurs once, and can not be 
detected as a SELECT MULTIPLE field. The result is that the value of the 
field is NOT a TCL List until the second value gets added.

WARNING: If only one selection is made in a SELECT MULTIPLE field, AND 
         the value of the single selection contains spaces, it is 
         impossible to differentiate between this single multi-word selection 
         and a multiple selection of these single words. It is recommended 
         that all SELECT MULTIPLE values do not contain spaces.
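
The folding of "name", "name_1", "name_2" keys into one list can be sketched as
below. This is an illustration in Python, not the package's TCL implementation;
merge_multiple is a hypothetical name, and the rule that a suffixed key is only
folded when its base key was already seen is an assumption.

```python
import re

# A key such as "multiple_1": a base name followed by "_" and a sequence number.
_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<seq>\d+)$")


def merge_multiple(pairs):
    """Fold numbered repeats of a key into a list stored under the base key."""
    merged = {}
    for key, value in pairs:
        match = _SUFFIX.match(key)
        base = match.group("base") if match else None
        if base is not None and base in merged:
            current = merged[base]
            if isinstance(current, list):
                current.append(value)
            else:
                merged[base] = [current, value]
        else:
            # First occurrence, or a key that merely looks numbered.
            merged[key] = value
    return merged


print(merge_multiple([("multiple", "first selection"),
                      ("multiple_1", "second selection")]))
```

As the text above notes, a field with a single selection stays a plain string,
because nothing in the data marks it as a SELECT MULTIPLE field.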

The "Query String" from the section SYSTEM is URL decoded and any resulting 
values are placed in this section as well.

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
Windows CGI 1.3a Specification - Version of 18-Feb-96
Written by O'Reilly Software and included with WebSite 1.1 Webserver.

Overview

A large class of World Wide Web applications are best implemented using external programs that
are controlled by a web server. Examples include front-ends to business applications which are
themselves subject to frequent changes in business rules. The broad acceptance of
rapid-application development (RAD) tools such as Visual Basic and Delphi have given rise to the
need to use these tools to Web-enable many kinds of business applications. The widely used
Common Gateway Interface (CGI) uses techniques well suited to the Unix environment. A different
sort of interface is needed to support common Windows RAD tools for CGI. It is the purpose of
this specification to define such an interface. 

I/O Spooling

A key feature of Windows CGI is its spooled exchange of data between the server and the CGI
program. It is essential that the server provide efficient transfer of data between the spool files and
the network. This means that the server should use memory-mapped techniques, and minimize the
number of separate network I/O requests used. 

The reasons for using spooled I/O are: 

     - Most RAD packages do not have native network (socket) I/O capabilities. 
     - Socket I/O techniques are relatively exotic, and efficient results require a thorough
       knowledge of the Win32 network interface. All input and output would require complex
       buffering to achieve acceptable network efficiency. 
     - Sockets cannot be inherited by a 16-bit program. 
     - Spooled input (e.g. POST content) can be memory mapped and thus processed far more
       efficiently than is possible using stream-oriented techniques. 
     - A reference set of spool files may be used for regression testing and debugging in the
       RAD development environment. 
     - Spool files may be retained after a CGI program runs, for "post-mortem" analysis, also
       using the RAD environment. 

HTML Form Data Decoding

Windows CGI requires that the web server decode HTML form data if present in a POST request.
It is not required that the server decode form data if it appears in the "query string" portion of a
request URL. 

There are two ways in which form data may be sent by a browser to the server: 

URL-Encoded 
     This is the most common form data format. The contents of form fields are "escaped"
     according to the rules in the HTML 1.0 Specification, then concatenated using unescaped
     ampersand characters. This URL-encoded data is sent as a stream to the server, with a
     content type of application/x-www-form-urlencoded. 

Multipart Form Data 
     This format has been introduced to permit efficient file uploading with forms. It may be
     used without explicitly including a file upload form field, however. The contents of the form
     fields are sent as a MIME multipart message. Each field is contained within a single part.
     The content type indicated by the browser is multipart/form-data. 

Compliant servers must decode both form data types. 

Launching the CGI program

The server uses the CreateProcess() service to launch the CGI program. The server
maintains synchronization with the CGI program so it can detect when the CGI program exits. This
is done using the Win32 WaitForSingleObject() service, waiting for the CGI process
handle to become signalled, indicating program exit. The server must never use a shell to execute
the CGI program. This can create serious security risks. 

NOTE: The CGI program's process handle becomes signalled before the process rundown is
complete. Reliance on rundown to close files, inherited handles, etc., can cause obscure
synchronization problems.

Command Line

The server must execute a CGI program request by doing a CreateProcess() with a command
line in the following form: 

   WinCGI-exe cgi-data-file

WinCGI-exe

     The complete path to the CGI program executable. The server does not depend on the
     "current directory" or the PATH environment variable. Note that the "executable" need not
     be a .EXE file. It may be a document, provided an "association" with a corresponding
     executable has been established. 

cgi-data-file

     The complete path to the CGI data file. 
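
The launch sequence described above (no shell, a two-part command line, then waiting
for the program to exit) might look like this from the server's side. This is a
sketch in Python rather than a direct Win32 call; the names build_command and
launch_cgi are hypothetical, and document associations are not handled.

```python
import subprocess


def build_command(cgi_exe, cgi_data_file):
    """Return the command line as an argument list: WinCGI-exe cgi-data-file."""
    return [cgi_exe, cgi_data_file]


def launch_cgi(cgi_exe, cgi_data_file):
    """Run the CGI program without a shell and wait for it to exit."""
    # shell=False passes the arguments directly to the new process,
    # so no command interpreter is involved.
    completed = subprocess.run(build_command(cgi_exe, cgi_data_file),
                               shell=False, check=False)
    return completed.returncode
```

Both arguments should be complete paths, since the specification says the server
does not depend on the current directory or the PATH environment variable.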

Launch Method

The server issues the CreateProcess() such that the process being launched has its main
window hidden. The launched process itself should not cause the appearance of a window nor a
change in the Z-order of the windows on the desktop. The server supports a CGI program/script
debugging mode. If that mode is enabled, the CGI program is launched such that its window
shows and is made active. This can assist in debugging CGI applications. 

Document Associations

The server must honor document associations. If the target of a Windows CGI request is a
document (not an executable), the server must attempt to find the associated application for the
document and launch the application such that the document is "processed". 


The CGI Data File

The server passes data to the CGI program via a Windows "private profile" file, in key-value
format. The CGI program may then use the standard Windows API services for enumerating and
retrieving the key-value pairs in the data file. 

The CGI data file contains the following sections: 

     [CGI]
     [Accept]
     [System]
     [Extra Headers]
     [Form Literal]
     [Form External]
     [Form File]
     [Form Huge]
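
A CGI program that cannot call the Windows profile API could read the data file with
a generic INI parser. The sketch below uses Python's configparser as an
approximation; it is not guaranteed to match every detail of how Windows parses
private profile files, and the sample values are made up for illustration.

```python
import configparser

SAMPLE = """\
[CGI]
Request Protocol=HTTP/1.0
Request Method=POST

[Accept]
text/html=Yes
image/jpeg=q=0.100

[System]
GMT Offset=-28800

[Form Literal]
smallfield=123 Main St. #122
"""


def read_cgi_data(text):
    """Parse CGI data file text into {section: {key: value}}."""
    parser = configparser.RawConfigParser(
        delimiters=("=",),       # split only at the first '='
        comment_prefixes=(";",),
        strict=False,            # tolerate repeated sections or keys
    )
    parser.optionxform = str     # keep the case of keys such as "Request Method"
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}


data = read_cgi_data(SAMPLE)
print(data["CGI"]["Request Method"], data["Accept"]["image/jpeg"])
```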

The [CGI] Section

This section contains most of the CGI data items (accept types, content, and extra headers are
defined in separate sections). Each item is provided as a string value. If the value is an empty
string, the keyword is omitted. The keywords are listed below:

Request Protocol

     The name and revision of the information protocol this request came in with. Format:
     protocol/revision. Example: "HTTP/1.0".

Request Method

     The method with which the request was made. For HTTP, this is "GET", "HEAD", "POST",
     etc.

Executable Path

     The logical path to the CGI program executable, as needed for self-referencing URLs. This
     may vary if the server supports multi-homing with separate logical path spaces. The server
     must provide the physical path equivalent using the logical to physical mapping for the
     identity on which the current request was received.

Document Root

     The physical path to the logical root "/". This may vary if the server supports multi-homing
     with separate logical path spaces. The server must provide the physical path to the logical
     root for the identity on which the current request was received. 

Logical Path

     A request may specify a path to a resource needed to complete that request. This path may
     be in a logical pathname space. This item contains the pathname exactly as received by the
     server, without logical-to-physical translation.

Physical Path

     If the request contained logical path information, the server provides the path in physical
     form, in the native object (e.g., file) access syntax of the operating system. This may vary if
     the server supports multi-homing with separate logical path spaces. The server must
     provide the physical path equivalent using the logical to physical mapping for the identity
     on which the current request was received.

Query String

     The information which follows the ? in the URL that generated the request is the "query"
     information. The server furnishes this to the back end whenever it is present on the request
     URL, without any decoding or translation.

Request Range

     Byte-range specification received with request (if any). See the current Internet Draft (or
     RFC) describing the byte-range extension to HTTP for more information. The server must
     support CGI program participation in byte-ranging to be compliant with this Specification.

Referer

     The URL of the document that contained the link pointing to this CGI program. Note that in
     some browsers the implementation of this is broken and cannot be relied on.

From

     The e-mail address of the browser user. Note that this is in the HTTP specification but is
     not implemented in some browsers due to privacy concerns.

User Agent

     A string description of the client (browser) software. Not generated by all browsers.

Content Type

     For requests which have attached data this is the MIME content type of that data. Format:
     type/subtype.

Content Length

     For requests which have attached data, this is the length of the content in bytes.

Content File

     For requests which have attached data, the server makes the data available to the CGI
     program by putting it into this file. The value of this item is the complete pathname of that
     file.

Server Software

     The name and version of the information server software answering the request (and
     running the CGI program). Format: name/version.

Server Name

     The network host name or alias of the server, as needed for self-referencing URLs. This (in
     combination with the ServerPort) could be used to manufacture a full URL to the server, for
     URL fixups. This may vary if the server supports multi-homing. The value of this item must
     be the host name on which the current request was received.

Server Port

     The network port number on which the server is listening. This is also needed for
     self-referencing URLs.

Server Admin

     The e-mail address of the server's administrator. This is used in error messages, and might
     be used to send MAPI mail to the administrator, or to form "mailto:" URLs in generated
     documents.

CGI Version

     The revision of the CGI specification to which this server complies. Format: CGI/revision.
     For this version, "CGI/1.2 (Win)". 

Remote Host

     The network host name of the client (requestor) system, if available. This item is used for
     logging.

Remote Address

     The network (IP) address of the client (requestor) system. This item is used for logging if
     the host name is not available. 

Authentication Method

     The protocol-specific authentication method specified in the request. If present, this is
     normally Basic. The server must provide this whether or not it was used by the server for
     authentication.

Authentication Realm

     The method-specific authentication realm specified in the request. If present in the request,
     the server must provide this whether or not it was used by the server for authentication.

Authenticated Username

     The username (in the indicated realm) that the client used to attempt authentication, as
     specified in the request. If present in the request, the server must provide this whether or
     not it was used by the server for authentication.

Authenticated Password

     The password that the client used to attempt authentication, as specified in the request. If
     present in the request, the server must provide this whether or not it was used by the
     server for authentication. 

     NOTE - Current practice on the O'Reilly WebSite servers requires that the CGI program's
     name begin with a dollar sign ($) to have the password supplied through the CGI interface.
     This is not required by this specification. It is recommended, however, as it forces the CGI
     programmer to do something special to have the password info exported from within the
     server's internal environment. 

The [Accept] Section

This section contains the client's acceptable data types found in the request header as 

Accept: type/subtype {parameters}

If the parameters (e.g., "q=0.100") are present, they are passed as the value of the item. If there are
no parameters, the value is "Yes". 

Note: The accept types may easily be enumerated by the CGI program with a call to
GetPrivateProfileString() with NULL for the key name. This returns all of the keys in the section
as a null-delimited string with a double-null terminator.

The [System] Section

This section contains items that are specific to the Windows implementation of CGI. The following
keys are used: 

GMT Offset

     The number of seconds to be added to GMT time to reach local time. For Pacific Standard
     time, this number is -28,800. Useful for computing GMT times. 

Debug Mode

     This is No unless the server's "CGI/script tracing" mode is enabled, then it is Yes. Useful
     for providing conditional tracing within the CGI program. 

Output File

     The full path/name of the file in which the server expects to receive the CGI program's
     results. 

Content File

     The full path/name of the file that contains the content (if any) that came with the request. 

The [Extra Headers] Section

This section contains the "extra" headers that were included with the request, in "key=value"
form. The server must URL-unescape both the key and the value prior to writing them to the CGI
data file. 

Note: The extra headers may easily be enumerated by the CGI program with a call to
GetPrivateProfileString() with NULL for the key name. This returns all of the keys in the section
as a null-delimited string with a double-null terminator.

The [Form Literal] Section

If the request is an HTTP POST from an HTTP form (with content type of
application/x-www-form-urlencoded or multipart/form-data), the server will decode the form data
and put it into the [Form Literal] section.

For URL-encoded form data, raw form input is of the form "key=value&key=value&...", with the
value parts in url-encoded format. The server splits the key=value pairs at the '&', then splits the
key and value at the '=', url-decodes the value string, and puts the result into key=(decoded)value
form in the [Form Literal] section.
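
The URL-encoded decoding steps above can be sketched as follows. The function name
decode_urlencoded is hypothetical. Following the description, only the value is
decoded; treating "+" as a space is the usual form-encoding convention and is
assumed here.

```python
from urllib.parse import unquote_plus


def decode_urlencoded(raw):
    """Split raw form input at '&', then '=', and url-decode each value."""
    pairs = []
    for field in raw.split("&"):
        if not field:
            continue
        # Split at the first '=' only; the value may itself contain '='.
        key, _, value = field.partition("=")
        pairs.append((key, unquote_plus(value)))
    return pairs


print(decode_urlencoded("smallfield=123+Main+St.+%23122&multiple=first+selection"))
```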

For multipart form data, raw form input is in a MIME-style multipart format, with each field in a
separate part. The server extracts the field name and value from each part and puts the result into
key=value form in the [Form Literal] section.

If the form contains any SELECT MULTIPLE elements, there will be multiple occurrences of the
same key. In this case, the server generates a normal "key=value" pair for the first occurrence, and
it appends a sequence number to subsequent occurrences. It is up to the CGI program to know
about this possibility and to properly recognize the tagged keys. 

The [Form External] Section

If the decoded value string is more than 254 characters long, or if the decoded value string
contains any control characters or double-quotes, the server puts the decoded value into an
external tempfile and lists the field into the [Form External] section as: 

  key=pathname length

where pathname is the path and name of the tempfile containing the decoded value string, and
length is the length in bytes of the decoded value string.
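
Reading a [Form External] entry might look like this. The sketch assumes the length
is the last whitespace-separated token of the value, so that a pathname containing
spaces still parses; read_external is a hypothetical name.

```python
def read_external(entry_value):
    """Return the decoded field value referenced by a 'pathname length' entry."""
    # Split from the right so only the trailing length is separated.
    pathname, length_text = entry_value.rsplit(None, 1)
    length = int(length_text)
    # Binary mode avoids newline translation of the stored value.
    with open(pathname, "rb") as handle:
        return handle.read(length)
```

For example, the value part of the sample entry "C:\TEMP\HS19AF6C.000 300" would be
split into the pathname and a length of 300 bytes.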

Note: Be sure to open this file in binary mode unless you are certain that the form data is text!

The [Form Huge] Section

If the raw value string is more than 65,535 bytes long, the server does no decoding, but it does get
the keyword and mark the location and size of the value in the Content File. The server lists the
huge field in the [Form Huge] section as: 

  key=offset length

where offset is the offset from the beginning of the Content File at which the raw value string for
this key is located, and length is the length in bytes of the raw value string. You can use the offset
to perform a "Seek" to the start of the raw value string, and use the length to know when you have
read the entire raw string into your decoder. Note: Be sure to open this file in binary mode unless
you are certain that the form data is text!
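
The seek-and-read procedure for a [Form Huge] entry can be sketched as below. The
name read_huge is hypothetical, and the content file path is assumed to come from
the Content File key. The bytes returned are still raw, so any URL decoding is left
to the caller.

```python
def read_huge(content_file, entry_value):
    """Return the raw value bytes located by an 'offset length' entry."""
    offset_text, length_text = entry_value.split()
    offset, length = int(offset_text), int(length_text)
    with open(content_file, "rb") as handle:
        handle.seek(offset)          # jump to the start of the raw value
        return handle.read(length)   # read exactly the stated number of bytes
```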

The [Form File] Section

If the request is in the multipart/form-data format, it may contain one or more file uploads.
In this case, each file upload is placed into an external tempfile similar to the form external data.
Each such file upload is listed in the [Form File] section as: 

  key=[pathname] length type xfer [filename]

where pathname is the pathname of the external tempfile containing the uploaded file, length
is the length in bytes of the uploaded file, type is the MIME content type of the uploaded file,
xfer is the content-transfer encoding of the uploaded file, and filename is the original name of
the uploaded file. The square brackets must be included. They are used to delimit the file and
pathnames, which may contain spaces. 
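
A [Form File] value can be taken apart with a regular expression, as sketched here.
The name parse_form_file and the sample entry are made up for illustration, and the
pattern assumes that neither pathname nor filename contains a "]" character.

```python
import re

# [pathname] length type xfer [filename]
_FORM_FILE = re.compile(
    r"^\[(?P<pathname>[^\]]*)\]\s+(?P<length>\d+)\s+(?P<type>\S+)\s+"
    r"(?P<xfer>\S+)\s+\[(?P<filename>[^\]]*)\]$"
)


def parse_form_file(entry_value):
    """Split a [Form File] value into its five named fields."""
    match = _FORM_FILE.match(entry_value.strip())
    if match is None:
        raise ValueError("not a [Form File] entry: %r" % entry_value)
    fields = match.groupdict()
    fields["length"] = int(fields["length"])
    return fields


print(parse_form_file(r"[C:\TEMP\HS19AF6C.003] 2048 image/gif binary [my photo.gif]"))
```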

Example of Form Decoding

In the following sample, the form contained a small field, a SELECT MULTIPLE with 2 small
selections, a field with 300 characters in it, one with line breaks (a text area), and a 230KB field. 

    [Form Literal]
    smallfield=123 Main St. #122
    multiple=first selection
    multiple_1=second selection

    [Form External]
    field300chars=C:\TEMP\HS19AF6C.000 300
    fieldwithlinebreaks=C:\TEMP\HS19AF6C.001 43

    [Form Huge]
    field230K=C:\TEMP\HS19AF6C.002 276920



Results Processing

The CGI program returns its results to the server as a data stream representing (directly or
indirectly) the goal of the request. The server is responsible for "packaging" the data stream
according to HTTP, and for using HTTP to transport the data stream to the requesting client. This
means that the server normally adds the needed HTTP headers to the CGI program's results.

The data stream consists of two parts: the header and the body. The header consists of one or
more lines of text, and is separated from the body by a blank line. The body contains
MIME-conforming data whose content type must be reflected in the header.
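
The two-part layout of the results stream can be sketched as follows. The name
build_result is hypothetical, and the use of CR LF line endings is an assumption,
since the text above does not state which line terminator is expected.

```python
def build_result(content_type, body, extra_headers=()):
    """Return header lines, a blank line, then the body, as bytes."""
    lines = ["Content-Type: %s" % content_type]
    lines.extend("%s: %s" % (name, value) for name, value in extra_headers)
    # The empty line after the last header separates header from body.
    header = "\r\n".join(lines) + "\r\n\r\n"
    return header.encode("ascii") + body


print(build_result("text/html", b"<p>hello</p>"))
```

A CGI program would write these bytes to the file named by the Output File key.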

The server does not interpret or modify the body in any way. It is essential that the client receive
exactly the data that was generated by the back end.

Special Header Lines

The server recognizes the following header lines in the results data stream:

Content-Type:

     Indicates that the body contains data of the specified MIME content type. The value must
     be a MIME content type/subtype.

URI: <value> (value enclosed in angle brackets)

     The value is either a full URL or a local file reference, either of which points to an object to
     be returned to the client in lieu of the body (which the server shall ignore in this type of
     result). If the value is a local file, the server sends it as the results of the request, as though
     the client issued a GET for that object. If the value is a full URL, the server returns a "302
     redirect" to the client to retrieve the specified object directly.

Location:

     Same as URI, but this form is now deprecated. The value must not be enclosed in angle
     brackets with this form. 

Other Headers

Any other headers in the result stream are passed (unmodified) by the server to the client. It is the
responsibility of the CGI program to avoid including headers that clash with those used by HTTP. 
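
How a server might separate the special header lines from the pass-through headers
can be sketched like this. The name split_result is hypothetical, and accepting
either CR LF or bare LF line endings is an assumption rather than something the
specification states.

```python
import re

SPECIAL_HEADERS = ("content-type", "uri", "location")


def split_result(stream):
    """Return (special headers, pass-through header lines, body bytes)."""
    # The header ends at the first blank line.
    parts = re.split(rb"\r?\n\r?\n", stream, maxsplit=1)
    head = parts[0]
    body = parts[1] if len(parts) > 1 else b""
    special, passthrough = {}, []
    for line in head.decode("latin-1").splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() in SPECIAL_HEADERS:
            special[name.strip().lower()] = value.strip()
        else:
            passthrough.append(line)   # handed to the client unmodified
    return special, passthrough, body


print(split_result(b"Content-Type: text/plain\r\nX-Note: demo\r\n\r\nhello"))
```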

