Apache Log File Tools

AWStats Logfile Analyzer

I wanted to report on basic activity on my web server so I found and installed AWstats, which I really like.

Importing Apache logs into SQL

Later I wanted to see more details so I started looking for a way to convert Log file entries to database tables.  I found the following on goolge and listed them in the order I think will work best on linux.

  1. Importing Apache (httpd) logs into MySQL
  2. LogToMysql - piped logging from Apache to MySQL
  3. Query Apache logfiles via SQL (windows)
  4. Parsing Apache access_log files using php, mySQL LOAD, importing to mySQL
  5. Convert apache http combined logs into sql (and import it into a mysql database eventually)

My log files from apache with awstats are in the combined format as follows:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Custom Log Formats

The format argument to the LogFormat and CustomLog directives is a string. This string is used to log each request to the log file. It can contain literal characters copied into the log files and the C-style control characters "\n" and "\t" to represent new-lines and tabs. Literal quotes and back-slashes should be escaped with back-slashes.

The characteristics of the request itself are logged by placing "%" directives in the format string, which are replaced in the log entry by the values as follows:

%...a:          Remote IP-address
%...A:          Local IP-address
%...B:          Bytes sent, excluding HTTP headers.
%...b:          Bytes sent, excluding HTTP headers. In CLF format
        i.e. a '-' rather than a 0 when no bytes are sent.
%...c:          Connection status when response was completed.
                'X' = connection aborted before the response completed.
                '+' = connection may be kept alive after the response is sent.
                '-' = connection will be closed after the response is sent.
%...{FOOBAR}e:  The contents of the environment variable FOOBAR
%...f:          Filename
%...h:          Remote host
%...H       The request protocol
%...{Foobar}i:  The contents of Foobar: header line(s) in the request
                sent to the server.
%...l:          Remote logname (from identd, if supplied)
%...m       The request method
%...{Foobar}n:  The contents of note "Foobar" from another module.
%...{Foobar}o:  The contents of Foobar: header line(s) in the reply.
%...p:          The canonical Port of the server serving the request
%...P:          The process ID of the child that serviced the request.
%...q       The query string (prepended with a ? if a query string exists,
        otherwise an empty string)
%...r:          First line of request
%...s:          Status.  For requests that got internally redirected, this is
                the status of the *original* request --- %...>s for the last.
%...t:          Time, in common log format time format (standard english format)
%...{format}t:  The time, in the form given by format, which should
                be in strftime(3) format. (potentially localized)
%...T:          The time taken to serve the request, in seconds.
%...u:          Remote user (from auth; may be bogus if return status (%s) is 401)
%...U:          The URL path requested, not including any query string.
%...v:          The canonical ServerName of the server serving the request.

%...V:          The server name according to the UseCanonicalName setting.