|
The
Writing Apache Modules with Perl and C
book can be purchased online from O'Reilly
and
Amazon.com.
|
|
Your corrections of the technical and grammatical
errors are very welcome. You are encouraged to help me
improve this guide. If you have something to contribute
please send it
directly to me.
|
This module provides the Apache/mod_perl user with a mechanism for storing
persistent user data in a global hash, which is independent of the real
storage mechanism. Currently you can choose from the following storage
mechanisms: Apache::Session::DBI, Apache::Session::Win32,
Apache::Session::File and Apache::Session::IPC. Read the man page of the
mechanism you want to use for a complete reference.
Apache::Session provides persistence to a data structure. The data structure has an ID
number, and you can retrieve it by using the ID number. In the case of
Apache, you would store the ID number in a cookie or the URL to associate
it with one browser, but the method of dealing with the ID is completely up
to you. The flow of things is generally:
Tie a session to Apache::Session.
Get the ID number.
Store the ID number in a cookie.
End of Request 1.
(time passes)
Get the cookie.
Restore your hash using the ID number in the cookie.
Use whatever data you put in the hash.
End of Request 2.
Using Apache::Session is easy: simply tie a hash to the session object, stick any data structure
into the hash, and the data you put in automatically persists until the
next invocation. Here is a quick example which uses cookies to track the
user's session.
# Pull in the required packages
use strict;
use Apache::Session::DBI;
use Apache;
# Database credentials (placeholders, replace with your own)
my ($db_user, $db_pass) = ('username', 'password');
# Read in the cookie if this is an old session
my $r = Apache->request;
my $cookie = $r->header_in('Cookie') || '';
# Extract the session ID; it stays undefined if there is no
# session cookie, so that a new session gets created
my ($session_id) = $cookie =~ /SESSION_ID=(\w+)/;
# Create a session object based on the cookie we got from the
# browser, or a new session if we got no cookie
my %session;
tie %session, 'Apache::Session::DBI', $session_id,
{DataSource => 'dbi:mysql:sessions',
UserName => $db_user,
Password => $db_pass
};
# Might be a new session, so let's give them their cookie back
my $session_cookie = "SESSION_ID=$session{_session_id};";
$r->header_out("Set-Cookie" => $session_cookie);
|
After setting this up, you can stick anything you want into
%session (except file handles), and it will still be there
when the user invokes the next page.
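The two-request flow described above can be tried outside of Apache with a small stand-in. This is only a sketch: the %STORE hash, the counter-based IDs and both helper functions are made up for illustration and replace the real storage backend and the unguessable IDs that Apache::Session generates.

```perl
use strict;
use warnings;

# In-memory stand-in for the real storage backend (hypothetical).
my %STORE;
my $next_id = 0;

# Extract the session ID from a raw Cookie header, if there is one.
sub session_id_from_cookie {
    my ($header) = @_;
    return undef unless defined $header;
    return $header =~ /(?:^|;\s*)SESSION_ID=(\w+)/ ? $1 : undef;
}

# Return the session hash for $id, creating a new session when the
# ID is missing or unknown.
sub open_session {
    my ($id) = @_;
    unless (defined $id && exists $STORE{$id}) {
        $id = sprintf 'sess%04d', ++$next_id;   # toy ID, not secure
        $STORE{$id} = { _session_id => $id };
    }
    return $STORE{$id};
}

# Request 1: no cookie yet, so a new session is created.
my $s1 = open_session(session_id_from_cookie(undef));
$s1->{visits} = 1;
my $set_cookie = "SESSION_ID=$s1->{_session_id}";

# Request 2: the browser sends the cookie back, among other cookies.
my $s2 = open_session(session_id_from_cookie("lang=en; $set_cookie"));
$s2->{visits}++;
print "visits: $s2->{visits}\n";
```

The second request finds the data stored by the first one because both resolve to the same ID.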
It is possible to write an Apache authen handler using
Apache::Session. You can put your authentication token into the session. When a user
invokes a page, you open their session, check to see if they have a valid
token, and approve or deny their authorization based on that.
As for IIS, let's compare:
IIS's sessions are only valid on the same web server as the one that
issued the session. Apache::Session's session objects can be shared
amongst a farm of many machines running different operating systems,
including even Win32.
IIS stores session information in RAM. Apache::Session stores sessions
in databases, file systems, or RAM.
IIS's sessions are only good for storing scalars or arrays.
Apache::Session's sessions allow you to store arbitrarily complex objects.
IIS sets up the session and automatically tracks it for you. With
Apache::Session, you set up and track the session yourself.
IIS is proprietary. Apache::Session is open-source.
Apache::Session::DBI can issue 400+ session requests per second on a light
Celeron 300A running Linux. IIS?
An alternative to Apache::Session is Apache::ASP, which has session tracking abilities. HTML::Embperl hooks
into Apache::Session for you.
[ TOC ]
See mod_perl and relational Databases
[ TOC ]
This module monitors hanging Apache/mod_perl processes. You define the time in seconds after which a process is to be counted as hanging or runaway.
When a process is considered to be hanging, it is killed and the event is logged to a log file.
Generally you should use the amprapmon program that is bundled with this module's distribution package, but you
can write your own code using the module as well. See the amprapmon manpage for more information about it.
Note that it requires the Apache::Scoreboard module to work.
Refer to the Apache::Watchdog::RunAway manpage for the configuration details.
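The decision the watchdog makes can be sketched in a few lines of plain Perl. The snapshot hash and the runaway_pids() helper below are hypothetical; the real module gets its timing data from Apache::Scoreboard and does the killing and logging itself.

```perl
use strict;
use warnings;

# Hypothetical snapshot: pid => seconds spent on the current request.
my %busy_for = (1201 => 3, 1202 => 245, 1203 => 0, 1204 => 61);
my $TIMEOUT  = 60;   # seconds after which a process counts as hanging

# Return the pids whose current request has run longer than $timeout.
sub runaway_pids {
    my ($timeout, $busy) = @_;
    my @pids = grep { $busy->{$_} > $timeout } keys %$busy;
    return sort { $a <=> $b } @pids;
}

for my $pid (runaway_pids($TIMEOUT, \%busy_for)) {
    # A real monitor would call kill() here and write to its log file.
    print "would kill $pid (busy for $busy_for{$pid}s)\n";
}
```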
[ TOC ]
Apache::VMonitor is the next generation of
mod_status. It provides all the information mod_status provides and much more.
This module emulates the reporting functions of the top(),
mount(), df() and ifconfig()
utilities. There is a special mode for mod_perl processes. It has visual
alert capabilities and a configurable
automatic refresh mode. It provides a Web interface, which can be used to show or hide all
the sections dynamically.
There are two main modes:
Multi processes mode -- All system processes and information are shown.
Single process mode -- In-depth information about a single process is shown.
The main advantage of this module is that it reduces the need to telnet to the machine in order to monitor it. Indeed, it provides information about mod_perl processes that cannot be acquired by telnetting to the machine.
Configuration:
# Configuration in httpd.conf
<Location /sys-monitor>
SetHandler perl-script
PerlHandler Apache::VMonitor
</Location>
|
# startup file or <Perl> section:
use Apache::VMonitor();
$Apache::VMonitor::Config{BLINKING} = 0; # Blinking is evil
$Apache::VMonitor::Config{REFRESH} = 0;
$Apache::VMonitor::Config{VERBOSE} = 0;
$Apache::VMonitor::Config{SYSTEM} = 1;
$Apache::VMonitor::Config{APACHE} = 1;
$Apache::VMonitor::Config{PROCS} = 1;
$Apache::VMonitor::Config{MOUNT} = 1;
$Apache::VMonitor::Config{FS_USAGE} = 1;
$Apache::VMonitor::Config{NETLOAD} = 1;
@Apache::VMonitor::NETDEVS = qw(lo eth0);
$Apache::VMonitor::PROC_REGEX = join "\|", qw(httpd mysql squid);
|
More information is available in the module's extensive manpage.
It requires Apache::Scoreboard and GTop to work. GTop in turn requires the libgtop library, which is not available for all platforms. Visit http://www.home-of-linux.org/gnome/libgtop/
to check whether your platform/flavor is supported.
[ TOC ]
This module allows you to kill off Apache processes if they grow too large or if they share too little of their memory. You can choose to set up the process size limiter to check the process size on every request:
# in your startup.pl:
use Apache::GTopLimit;
# Control the life based on memory size
# in KB, so this is 10MB
$Apache::GTopLimit::MAX_PROCESS_SIZE = 10000;
# Control the life based on Shared memory size
# in KB, so this is 4MB
$Apache::GTopLimit::MIN_PROCESS_SHARED_SIZE = 4000;
# watch what happens
$Apache::GTopLimit::DEBUG = 1;
# in your httpd.conf:
PerlFixupHandler Apache::GTopLimit
# you can set this up as any Perl*Handler that handles
# part of the request, even the LogHandler will do.
|
Or you can just check those requests that are likely to get big or unshared. This way of checking is also easier for those who are mostly just running Apache::Registry scripts:
# in your CGI:
use Apache::GTopLimit;
# Max Process Size in KB
Apache::GTopLimit->set_max_size(10000);
|
and/or:
use Apache::GTopLimit;
# Min Shared process Size in KB
Apache::GTopLimit->set_min_shared_size(4000);
|
Since accessing the process info might add a little overhead, you may want to only check the process size every N times. To do so, put this in your startup.pl or your code:
$Apache::GTopLimit::CHECK_EVERY_N_REQUESTS = 2;
This will only check the process size every other time the process size checker is called.
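How the two limits and the every-N-requests setting interact can be sketched without GTop. The %limit hash and check_limits() are hypothetical names, and the sizes are passed in by hand, whereas the real module reads them from libgtop.

```perl
use strict;
use warnings;

my %limit = (
    max_size      => 10_000,   # KB
    min_shared    => 4_000,    # KB
    check_every_n => 2,        # look at the sizes on every 2nd call only
);
my $calls = 0;

# Return a reason string when the process should exit, undef otherwise.
sub check_limits {
    my ($size_kb, $shared_kb) = @_;
    return undef if ++$calls % $limit{check_every_n};   # skipped call
    return "size $size_kb KB is above $limit{max_size} KB"
        if $size_kb > $limit{max_size};
    return "shared $shared_kb KB is below $limit{min_shared} KB"
        if $shared_kb < $limit{min_shared};
    return undef;
}

for my $sample ([12_000, 5_000], [12_000, 5_000], [8_000, 3_000], [8_000, 3_000]) {
    my $why = check_limits(@$sample);
    print defined $why ? "exit: $why\n" : "ok\n";
}
```

With check_every_n set to 2, the first and third calls are skipped even though the process is already over a limit, which is the price paid for the reduced overhead.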
This module was written in response to questions on the mod_perl mailing list on how to tell the httpd process to exit if:
its memory size goes beyond a specified limit
its shared memory size goes below a specified limit
Note: This module will run only on platforms supported by GTop.pm, a Perl interface to libgtop (which of course needs libgtop; see http://home-of-linux.org/gnome/libgtop/ ).
Refer to the Apache::GTopLimit manpage for more information.
[ TOC ]
This package contains modules for manipulating client request data via the Apache API with Perl and C. Functionality includes:
- parsing of application/x-www-form-urlencoded data
- parsing of multipart/form-data
- parsing of HTTP Cookies
The Perl modules are simply a thin XS layer on top of libapreq, making them
a lighter and faster alternative to CGI.pm and CGI::Cookie. See the Apache::Request and Apache::Cookie documentation for more details and eg/perl/ for examples.
Apache::Request and libapreq are tied tightly to the Apache API, to which there is no
access in a process running under mod_cgi.
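To make clear what kind of work the library takes over, here is a minimal pure-Perl sketch of decoding application/x-www-form-urlencoded data. The parse_urlencoded() function is written for this illustration only; it is not part of libapreq and is much slower than the C implementation.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded string into a hash of
# array refs (a key may appear more than once).
sub parse_urlencoded {
    my ($data) = @_;
    my %param;
    for my $pair (split /[&;]/, $data) {
        next unless length $pair;
        my ($key, $value) = map {
            my $s = defined $_ ? $_ : '';
            $s =~ tr/+/ /;                            # '+' means space
            $s =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;  # %XX escapes
            $s;
        } (split /=/, $pair, 2)[0, 1];
        push @{ $param{$key} }, $value;
    }
    return \%param;
}

my $p = parse_urlencoded('name=Stas+Bekman&lang=perl&lang=c&q=a%26b');
print "$_ = @{ $p->{$_} }\n" for sort keys %$p;
```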
[ TOC ]
Apache::RequestNotes provides a simple interface allowing all phases of the request cycle access
to cookie or form input parameters in a consistent manner. Behind the
scenes, it uses libapreq (Apache::Request) functions to parse request data and puts references to the data in
pnotes.
Once the request is past the PerlInit phase, all other phases can have access to form input and cookie data without parsing it themselves. This relieves some strain, especially when the GET or POST data is required by numerous handlers along the way.
See the Apache::RequestNotes manpage for more information.
[ TOC ]
See Apache::PerlRun - a closer look.
[ TOC ]
Apache::RegistryNG is the same as Apache::Registry, aside from using filename instead of URI for the namespace. It also uses
an Object Oriented interface.
PerlModule Apache::RegistryNG
<Location /perl>
SetHandler perl-script
PerlHandler Apache::RegistryNG->handler
</Location>
|
Apache::RegistryNG inherits from Apache::PerlRun, but the handler() is overridden. Aside from the
handler(), the rest of
Apache::PerlRun contains all the functionality of
Apache::Registry broken down into several subclass-able methods. These methods are used by Apache::RegistryNG to implement exactly the same functionality as Apache::Registry.
There is no compelling reason to use Apache::RegistryNG over
Apache::Registry, unless you want to add to or change the functionality of the existing Registry.pm. For example,
Apache::RegistryBB (Bare-Bones) is another subclass that skips the stat() call
performed by Apache::Registry on each request.
One situation where Apache::RegistryNG may definitely be required is if you are rewriting URIs (using either
mod_rewrite or your own handler) in certain ways.
For instance if you have a rewrite rule of the form:
XYZ123456.html ==> /perl/foo.pl?p1=XYZ&p2=123456
Apache::Registry loses big, as it recompiles foo.pl for each unique URL -- Apache::RegistryNG should be used instead.
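The difference between the two keying schemes can be illustrated with a simplified model. The package_for() function below is only a rough stand-in for the way a script is turned into a package name, and the URIs and the file path are made-up examples.

```perl
use strict;
use warnings;

# Simplified stand-in for the name mangling: every non-word character
# is escaped and the result is used as a package name.
sub package_for {
    my ($key) = @_;
    (my $name = $key) =~ s/(\W)/sprintf '_%02x', ord $1/ge;
    return "Apache::ROOT::$name";
}

my @uris     = qw(/XYZ123456.html /ABC654321.html /DEF000001.html);
my $filename = '/home/httpd/perl/foo.pl';   # hypothetical path

# Count the distinct packages each scheme would compile.
my (%by_uri, %by_file);
$by_uri{ package_for($_) }++         for @uris;
$by_file{ package_for($filename) }++ for @uris;

printf "keyed by URI:      %d compiled copies\n", scalar keys %by_uri;
printf "keyed by filename: %d compiled copy\n",   scalar keys %by_file;
```

Every new URL adds another compiled copy in the first scheme, while the second scheme keeps a single copy no matter how many URLs map to the script.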
[ TOC ]
It works just like Apache::Registry, but does not test the x bit (-x), compiles the file only once (no
stat() call is made per request), skips the OPT_EXECCGI
checks and does not chdir() into the script's parent directory. It uses the Object Oriented interface.
Configuration:
PerlModule Apache::RegistryBB
<Location /perl>
SetHandler perl-script
PerlHandler Apache::RegistryBB->handler
</Location>
|
See Apache::RegistryNG for more info.
[ TOC ]
Apache::OutputChain was written as a way of exploring the possibilities of stacked handlers in mod_perl. It ties STDOUT to an object which catches the output and makes it easy to build a chain of modules that work on the output data stream.
Examples of modules that are built on this idea are
Apache::SSIChain, Apache::GzipChain and Apache::EmbperlChain
-- the first processes the SSIs in the stream, the second compresses the
output on the fly, the last adds Embperl processing.
The syntax goes like this:
<Files *.html>
SetHandler perl-script
PerlHandler Apache::OutputChain Apache::SSIChain Apache::PassHtml
</Files>
|
The modules are listed in the reverse order of their execution -- here the Apache::PassHtml module simply picks a file's content and sends it to STDOUT ... then it's
processed by Apache::SSIChain, which sends its output to STDOUT again ... then it's processed by
Apache::OutputChain, which finally sends the result to the browser.
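The right-to-left execution order can be modelled with plain code references. The three subs below are hypothetical stand-ins named after the handlers in the configuration above; they do not use the real modules.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the three configured handlers.
my %stage = (
    # content producer: ignores its input
    PassHtml    => sub { '<p>now: <!--#echo var="DATE" --></p>' },
    # filter: expands one SSI-like tag with a fixed value
    SSIChain    => sub {
        (my $text = shift) =~ s/<!--#echo var="DATE" -->/2000-06-01/g;
        $text;
    },
    # end of the chain: hands the result to the client
    OutputChain => sub { '[sent to browser] ' . shift },
);

my @configured = qw(OutputChain SSIChain PassHtml);   # PerlHandler order

my $data;
for my $name (reverse @configured) {   # execution runs right to left
    $data = $stage{$name}->($data);
    print "$name ran\n";
}
print "$data\n";
```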
An alternative to this approach is Apache::Filter, which has a more natural ``forward'' configuration order and is easier to
interface with other modules.
It works with Apache::Registry as well, for example:
Alias /foo /home/httpd/perl/foo
<Location /foo>
SetHandler "perl-script"
Options +ExecCGI
PerlHandler Apache::OutputChain Apache::GzipChain Apache::Registry
</Location>
|
It's really a regular Apache::Registry setup, except for the added modules in the PerlHandler line.
(Apache::GzipChain allows you to compress the output on the fly.)
[ TOC ]
Have you ever served a huge HTML file (e.g. a file bloated with JavaScript code) and wondered how you could send it compressed, thus dramatically cutting down the download time? After all, Java applets can be compressed into a jar and benefit from faster download times. Why can't we do the same with plain ASCII (HTML, JS, etc.)? ASCII text can often be compressed by a factor of 10.
Apache::GzipChain comes to help you with this task. If a client (browser) understands gzip encoding, this module compresses the output and sends it downstream. The
client decompresses the data upon receipt and renders the HTML as if it
were fetching plain HTML.
For example, to compress all HTML files on the fly, do this:
<Files *.html>
SetHandler perl-script
PerlHandler Apache::OutputChain Apache::GzipChain Apache::PassFile
</Files>
|
Remember that it will work only if the browser claims to accept compressed
input, by setting the Accept-Encoding header.
Apache::GzipChain keeps a list of user-agents, thus it also looks at the User-Agent header to check for browsers known to accept compressed output.
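The header check can be sketched with the Compress::Zlib module that is bundled with recent Perls. The maybe_gzip() function is made up for this example; it ignores q-values and the User-Agent list, so it is a simplification and not the logic Apache::GzipChain actually uses.

```perl
use strict;
use warnings;
use Compress::Zlib ();

# Return (body, content_encoding). The encoding is undef when the
# body is passed through unchanged.
sub maybe_gzip {
    my ($body, $accept_encoding) = @_;
    my $accepts = defined $accept_encoding
               && $accept_encoding =~ /\bgzip\b/i;
    return ($body, undef) unless $accepts;
    return (Compress::Zlib::memGzip($body), 'gzip');
}

my $html = "<tr><td>row</td></tr>\n" x 500;
my ($out, $enc) = maybe_gzip($html, 'gzip, deflate');
printf "%d bytes -> %d bytes (%s)\n", length($html), length($out), $enc;
```

A response compressed this way also has to carry a Content-Encoding: gzip header, otherwise the browser will not know that it must decompress the body.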
For example, if you want to return compressed files that in addition pass through the Embperl module, you would write:
<Location /test>
SetHandler perl-script
PerlHandler Apache::OutputChain Apache::GzipChain Apache::EmbperlChain Apache::PassFile
</Location>
|
Hint: Watch the access_log file to see how many bytes were actually sent, and compare that with the
bytes sent using a regular configuration.
(See also perldoc Apache::GzipChain).
Notice that the rightmost PerlHandler must be a content producer. Here we
are using Apache::PassFile but you can use any module which creates output.
[ TOC ]
META: to be written (actually summarized the info from Apache::Filter manpage)
[ TOC ]
Similar to Apache::GzipChain
but works with Apache::Filter.
This configuration:
PerlModule Apache::Filter
<Files ~ "\.html$">
SetHandler perl-script
PerlSetVar Filter On
PerlHandler Apache::Gzip
</Files>
|
will send all the *.html files compressed if the client accepts the compressed input.
And this one:
PerlModule Apache::Filter
Alias /perl /home/http/perl
<Location /perl>
SetHandler perl-script
PerlSetVar Filter On
PerlHandler Apache::RegistryFilter Apache::Gzip
</Location>
|
will compress the output of the Apache::Registry scripts. Yes, you should write Apache::RegistryFilter and not Apache::Registry.
You can put as many filters as you want:
PerlModule Apache::Filter
<Files ~ "\.blah$">
SetHandler perl-script
PerlSetVar Filter On
PerlHandler Filter1 Filter2 Apache::Gzip
</Files>
|
You can test that it works either by looking at the size of the response in the access_log file or by using telnet:
telnet localhost 8000
Trying 127.0.0.1
Connected to 127.0.0.1
Escape character is '^]'.
GET /perl/test.pl HTTP/1.1
Host: localhost
Accept-Encoding: gzip
User-Agent: Mozilla
If everything is configured correctly, you will get the data back compressed.
META: what the full mime type for gzip?
[ TOC ]
With this module you can configure @INC and have modules reloaded for a given Location. Suppose two versions of Apache::Status
are being hacked on the same server. This fixup handler will simply
delete $INC{ $filename }, unshift the preferred PerlINC path into
@INC, and reload the file with require():
PerlModule Apache::PerlVINC
<Location /dougm-status>
SetHandler perl-script
PerlHandler Apache::Status
PerlINC /home/dougm/dev/modperl/lib
PerlVersionINC On
PerlFixupHandler Apache::PerlVINC
PerlRequire Apache/Status.pm
</Location>
|
<Location /other-status>
SetHandler perl-script
PerlHandler Apache::Status
PerlINC /home/other/current/modperl/lib
PerlVersionINC On
PerlFixupHandler Apache::PerlVINC
PerlRequire Apache/Status.pm
</Location>
|
It's important to be aware that a changed @INC is effective only inside the <Location> or a similar configuration directive.
Apache::PerlVINC subclasses the PerlRequire directive, marking the file to be reloaded by the fixup handler, using the
value of
PerlINC for @INC. That's local to the fixup handler, so you won't actually see @INC changed in your script.
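The reload mechanics (forget the %INC entry, put the preferred path first in @INC, call require() again) can be tried outside of Apache. In this sketch the Demo::Status module, the temporary directories and the reload_from() helper are all hypothetical and exist only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build two throwaway library trees, each with its own Demo/Status.pm.
my %lib;
for my $who (qw(dougm other)) {
    $lib{$who} = tempdir(CLEANUP => 1);
    mkdir "$lib{$who}/Demo" or die "mkdir: $!";
    open my $fh, '>', "$lib{$who}/Demo/Status.pm" or die "open: $!";
    print {$fh} "package Demo::Status;\n",
                "sub owner { '$who' }\n",
                "1;\n";
    close $fh or die "close: $!";
}

# Forget the loaded copy, prefer $path in @INC and load the file again.
sub reload_from {
    my ($path, $file) = @_;
    delete $INC{$file};
    local @INC = ($path, @INC);   # the change is visible in this call only
    require $file;
}

reload_from($lib{dougm}, 'Demo/Status.pm');
print Demo::Status::owner(), "\n";
reload_from($lib{other}, 'Demo/Status.pm');
print Demo::Status::owner(), "\n";
```

Because @INC is localized, the caller never sees the modified search path, which matches the behaviour described above. The second require() redefines the subs of the first copy in place, which is where the namespace clashes mentioned below can come from.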
To address possible issues of namespace clashes during reload, the handler
could call $r->child_terminate() so the next server to load the different versions will have a fresh
namespace. This is not a good idea in a high load environment, of course.
If you can't find it on CPAN, get it at: http://perl.apache.org/~dougm/Apache-PerlVINC-0.01.tar.gz
[ TOC ]
When Apache's builtin syslog support is used, the stderr stream is
redirected to /dev/null. This means that Perl warnings, any messages from die(), croak(), etc., will also end up in the black hole. The HookStderr directive will hook the stderr stream to a file of your choice, the default
is shown in this example:
PerlModule Apache::LogSTDERR
HookStderr logs/stderr_log
[ TOC ]
Because of the way mod_perl handles redirects, the status code is not
properly logged. The Apache::RedirectLogFix module works around that bug until mod_perl can deal with this. All you
have to do is to enable it in the httpd.conf file.
PerlLogHandler Apache::RedirectLogFix
For example, you will have to use it when you set the status yourself:
$r->status(304);
and send the headers manually, like this:
$r->status(304);
$r->send_http_header();
[ TOC ]
The output of system(), exec(), and open(PIPE,"|program") calls will not be sent to the browser unless your Perl was configured with
sfio.
One workaround is to use backticks:
print `command here`; |
But a cleaner solution is provided by the Apache::SubProcess
module. It overrides the exec() and system()
calls with calls that work correctly under mod_perl.
Let's see a few examples:
use strict;
use Apache::SubProcess qw(system exec);
my $r = shift;
$r->send_http_header('text/plain');
# override built-in system() function
system "/bin/echo hi there";
# send the output of a program
my $efh = $r->spawn_child(\&env);
$r->send_fd($efh);
# pass arguments to a program and send its output
my $fh = $r->spawn_child(\&banner);
$r->send_fd($fh);
# pipe data to a program and send its output
use vars qw($String);
$String = "hello world";
my($out, $in, $err) = $r->spawn_child(\&echo);
print $out $String;
$r->send_fd($in);
# override the built-in exec() function
exec "/usr/bin/cal";
print "NOT REACHED\n";
sub env {
my $r = shift;
#$r->subprocess_env->clear;
$r->subprocess_env(HELLO => 'world');
$r->filename("/bin/env");
$r->call_exec;
}
sub banner {
my $r = shift;
# /usr/games/banner on many Unices
$r->filename("/usr/bin/banner");
$r->args("-w40+Hello%20World");
$r->call_exec;
}
sub echo {
my $r = shift;
$r->subprocess_env(CONTENT_LENGTH => length $String);
$r->filename("/tmp/pecho");
$r->call_exec;
}
# where /tmp/pecho is:
# --------------------
#!/usr/bin/perl
#read STDIN, $buf, $ENV{CONTENT_LENGTH};
#print "STDIN: `$buf' ($ENV{CONTENT_LENGTH})\n";
|
Written by Stas Bekman. Last Modified at 06/01/2000
Use of the Camel for Perl is a trademark of O'Reilly & Associates, and is used by permission.