$Id: README,v 1.10.2.1 2008-01-11 11:46:18 mike Exp $


Introduction
------------

This directory contains the source code for Index Data's open source
link resolver, Keystone Resolver, which is part of the Keystone
Digital Library suite.  It is implemented as a Perl module called
"Keystone::Resolver".

TROUT was our earlier proof-of-concept implementation of a trivial
OpenURL resolver: its name stood for Trout Resolves Open URLs
Trivially.  The code was trivial because it was based on a trivial
standard: OpenURL v0.1, as described in the ten-page document
	http://www.openurl.info/registry/docs/pdf/openurl-01.pdf

The new code does not have this luxury for three reasons:

    1.	It is not limited to resolving OpenURLs, but also intends to
	handle DOIs and, in principle at least, other forms of
	metadata-based link.

    2.	Its OpenURL support is based on the newer and much more
	verbose version of the standard as produced by ANSI/NISO
	Committee AX and described at
		http://library.caltech.edu/openurl/Standard.htm
	This standard abstracts and indirects absolutely everything,
	whether it needs abstracting or not, and the code needs to
	reflect this.

    3.	Unlike TROUT, Keystone Resolver needs to do non-trivial things
	in order to resolve links: in particular, it needs a big,
	complex knowledge-base that tells it what resources are
	available to link to and what they contain.

Accordingly, the new code comes in lots of classes, which are
described in the file "Classes".  If you are about to read the
resolver code, that file is a good place to start.


Directory Structure
-------------------

The Keystone Resolver distribution is laid out in the following
directories:

bin/	Resolver-related scripts to be run from the command-line.

db/	Resource database material, including schemas, sample data and
	database-creation utilities.  At present, this is set up to
	make a tiny "toy" database.  In future releases, it will be
	expanded to make further databases, including one based on
	CUFTS data.

doc/	Embryonic documentation, in plain text format.  Eventually
	this will either be moved into Perl POD format (in the "lib"
	directory with the source code) or formatted using a proper
	system such as DocBook or OpenOffice.

etc/	Various configuration files, including XML DTDs and XSLT
	stylesheets.

lib/	The resolver source-code library.  (The actual resolver
	program is a trivial seven-line script in the
	web/htdocs/mod_perl/ area -- the library does all the work.)

t/	Test scripts, invoked by the distribution's "make test" rule.
	See also the t/regression subdirectory and its README file.

web/	The resolver's web-server files: server configuration files,
	CGI/mod_perl scripts, HTML pages, images, stylesheets ...

The purpose and contents of most of these directories are described in
more detail in their own README files.

If you got this software via CVS rather than as a distribution
tarball, then you will also have an "archive" directory.  The whole
purpose of this is to contain all the stuff that's not interesting to
anyone except the developers, so just delete it :-)


Prerequisites
-------------

-> A web-server.
Any web server that supports the CGI standard should work, but we use
Apache 1.3 with mod_perl.  The rest of these instructions assume
that's what you're using.

Apache 2.0 does not work, because requests are represented by
different Perl classes under it (Apache2::Request rather than
Apache::Request), among a zillion other differences.

On Debian Lenny, no equivalent package exists, so you need to port
the Etch package 'libapache-request-perl' (generic Apache request
library - Perl modules) to Lenny by fetching its sources and
recompiling them.

-> The Perl module CGI
This is not used by the main resolver entry point, but by the
utility method Keystone::Resolver::OpenURL->newFromCGI(), which uses
it to gather the arguments to pass into the Resolver library proper.
So in theory at least we can use the same library to make resolvers
that get their arguments some other way, e.g. link resolution by
email.

-> The Perl module DBI
This is used to access the resource database.  You also need the Perl
module for whatever driver you use, e.g. DBD::mysql.

-> The actual database software, e.g. MySQL
You should be able to use any relational database (MySQL,
PostgreSQL, Oracle, etc.), but the development has been done
using MySQL and it'll be simpler to use that unless you have a
compelling reason to do something different.

If you want to port to Oracle on a Debian system, you might want to
look at Oracle's Debian packages:

	Oracle Database 10g Express Edition (Universal)
	Oracle Database 10g Express Client

See
	http://www.oracle.com/technology/software/products/database/xe/htdocs/102xelinsoft.html

The APT source line is:
	deb http://oss.oracle.com/debian unstable main non-free


-> The Perl module LWP
This is used to resolve the enormous number of network indirections
that a v1.0 OpenURL can have: for example, the OpenURL itself can use
a By-Reference transport, and the ContextObject can specify any or
all of the six entities by reference.

-> The Perl module XML::LibXSLT
This is used to transform the resolver's XML output into pretty,
user-facing HTML.
	-> Gnome libxslt, including development kit
	-> The Perl module XML::LibXML
		-> Gnome libxml2, including development kit
		-> The Perl module XML::SAX
		-> The Perl module XML::NamespaceSupport
		-> The Perl module XML::LibXML::Common

-> The Perl module Text::Iconv
This is used to translate between different character encodings.
	-> The iconv library, but this seems to be included in libc
	   (the standard C library) in Red Hat 9, and therefore
	   probably also in most modern operating systems.

-> The Perl module Digest::MD5
This is needed to calculate the checksums that Elsevier requires in
the customer-specific URLs that access its full-text documents.


-> The Perl module HTML::Mason
This is needed to power the admin pages.
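
Before installing, it can save time to check which of the Perl modules
listed above are already present.  The following is a minimal sketch
rather than part of the distribution: check_modules is a hypothetical
helper name, and it tests only whether each module can be loaded, not
whether its version is recent enough.

```shell
# Hypothetical helper (not part of the distribution): report whether
# each Perl module named on the command line can be loaded.
check_modules() {
    for mod in "$@"; do
        # "perl -MFoo -e 1" exits non-zero if module Foo cannot be loaded.
        if perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "ok      $mod"
        else
            echo "MISSING $mod"
        fi
    done
}

# The top-level modules from the list above.
check_modules CGI DBI LWP XML::LibXSLT Text::Iconv Digest::MD5 HTML::Mason
```

Add the DBD driver for your database to the argument list as
appropriate.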

 


Installation
------------

To install this module, type the following:

	perl Makefile.PL
	make
	make test
	sudo make install

You will also need to build the "toy" resource database (or of course
a proper one if you have the data).  To do this, run "make" in the
"db" subdirectory, providing the root MySQL password when requested to
do so.  This will allow the bin/kr-test and
web/htdocs/mod_perl/resolve scripts to run successfully.

Once the toy database has been built, it's possible to run a simple
sanity-test without installing or even building anything, using the
kr-test script:

	perl -I lib bin/kr-test t/regression/zetoc-suuwassea


Configuration
-------------

To set up Keystone Resolver, take the following steps:

* If you're going to run the resolver as a virtual host (which is what
  I do), create an entry in /etc/hosts for the hostname, for example
  x.resolver.indexdata.com -- or of course set up DNS to serve that
  name's IP address.

* Configure your web server so it can execute the resolver code.  If
  you're using Apache 1.3, you can use a lightly tweaked copy of the
  sample configuration file
	web/conf/apache1.3/xeno.conf
  from this distribution.  Just drop it into the server's
  configuration directory, usually /etc/httpd/conf.d or something
  similar depending on what operating system you're using.  Note that
  you will in general need to change the hostnames in this file.
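
For the first step above, a hosts-file entry is a single line that
maps an IP address to the hostname.  The sketch below appends such a
line only if the name is not already present.  add_host_entry is a
hypothetical helper, not part of the distribution, and the address
127.0.0.1 used in the example is an assumption that suits a resolver
running on the same machine as the browser.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME
# already appears in it as a whole word.  The file defaults to /etc/hosts.
add_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    if grep -qwF -- "$name" "$file" 2>/dev/null; then
        echo "$name is already listed in $file"
    else
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}
```

It could be used like this:

	add_host_entry 127.0.0.1 x.resolver.indexdata.com

Writing to /etc/hosts requires root privileges; pass a third argument
to try it on a scratch file first.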

Non-standard installation directory
-----------------------------------

This software expects to be unpacked into the directory
	/usr/local/src/cvs/resolver/
That path is wired into several places.  If you want to run it from
somewhere else, you'll need to change them all:

* The DocumentRoot, Directory, PerlSetEnv and Alias directives in
  web/conf/apache1.3/xeno.conf (or whatever Apache configuration
  you're using)
* The "xsltdir" setting in lib/Keystone/Resolver.pm

Clearly this is too many places; we should try to find a way to reduce
it, ideally to a single place.
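
Until then, a recursive search is a quick way to confirm that no
reference to the default path has been missed.  This is a sketch only:
find_hardwired is a hypothetical helper name, and the directories to
search are taken from the list above.

```shell
# Hypothetical helper: print the name of every file under the given
# directories that contains the string OLD (matched literally).
find_hardwired() {
    old=$1
    shift
    grep -rlF -- "$old" "$@"
}
```

Run it from the top of the distribution:

	find_hardwired /usr/local/src/cvs/resolver web/conf lib

It prints nothing, and returns a non-zero status, once no file under
those directories mentions the old path.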


Support
-------

Informal support is available on the Keystone Resolver community
mailing list at
	http://www.indexdata.dk/mailman/listinfo/resolver
which any user is free to join.

Commercial support is available from Index Data.
Email <info@indexdata.com> for details.


Copyright and Licence
---------------------

Copyright (C) 2004-2007 Index Data Aps.

This library is free-as-in-freedom software (which means it's also
open source); it is distributed under the GNU General Public Licence,
version 2.0, which allows you every freedom in your use of this
software except those that involve limiting the freedom of others.
A copy of this licence is in the file "GPL-2"; it is described and
discussed in detail at
	http://www.gnu.org/copyleft/gpl.html

The primary author is Mike Taylor <mike@indexdata.com>