NAME

Harvest::Database::Generic - Parent class for Harvest-NG Databases

DESCRIPTION

This database class provides a set of methods for accessing databases of objects which are fetched during the spidering process.

Methods that interact with storage are not implemented in this class. Child classes inheriting from it must override them.
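The arrangement described above can be sketched as follows. This is a minimal illustration only: the package names My::Database::Base and My::Database::Memory are hypothetical stand-ins, objects are plain hashes, and the real parent class may report unimplemented methods differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parent class: storage methods only die.
package My::Database::Base;

sub new { die "My::Database::Base is abstract; instantiate a child class\n" }

# Generate a stub for each storage method that a child has to supply.
for my $method (qw(store manage fetch delete exists)) {
    no strict 'refs';
    *{"My::Database::Base::$method"} = sub {
        my $self = shift;
        die ref($self) . " does not implement $method()\n";
    };
}

# Hypothetical child that keeps objects in a plain hash, keyed by URL.
package My::Database::Memory;
our @ISA = ('My::Database::Base');

sub new { my ($class) = @_; return bless { objects => {} }, $class }

sub store {
    my ($self, $obj) = @_;
    $self->{objects}{ $obj->{url} } = $obj;
    return 1;
}

sub exists {
    my ($self, $url) = @_;
    return defined $self->{objects}{$url} ? 1 : 0;
}

package main;

my $db = My::Database::Memory->new;
$db->store({ url => 'http://example.org/', title => 'Example' });
print "stored\n" if $db->exists('http://example.org/');

# fetch() was not overridden, so the parent's stub reports the gap.
my $ok = eval { $db->fetch('http://example.org/'); 1 };
print "fetch failed: $@" unless $ok;
```

Dying in the parent's stubs makes a missing override fail loudly at the first call rather than silently doing nothing.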

METHODS

$data=new Harvest::Database::Generic($mode,$metaclass,$config);

Create a new database, with the following characteristics:

$mode
The file mode of the database: either ro (read only) or rw (read/write).
$metaclass
The metadata class to use in the database. Currently this should always be Metadata::SOIF.
$config
A Harvest::Config structure, which can contain the following settings:

File: The basename to use for all of the database files (required)
Type: The type of DBM database to use (optional, defaults to DB_File)
Class: The class of the database

NB: Do not create new instances of this class - doing so will result in an error. Children _must_ override the constructor.
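As a rough sketch of what an overriding constructor might check, the example below validates the documented arguments. The class name My::Database::Sketch is hypothetical, and a plain hash reference stands in for the Harvest::Config structure, whose real accessors are not described here.

```perl
use strict;
use warnings;

# Hypothetical child class; a plain hash reference stands in for the
# Harvest::Config structure, whose real interface may differ.
package My::Database::Sketch;

sub new {
    my ($class, $mode, $metaclass, $config) = @_;

    # Only the two documented file modes are accepted.
    die "mode must be 'ro' or 'rw'\n"
        unless defined $mode && ($mode eq 'ro' || $mode eq 'rw');

    # File is the only required setting.
    die "config setting 'File' is required\n"
        unless defined $config->{File};

    my $self = {
        mode      => $mode,
        metaclass => $metaclass,
        file      => $config->{File},
        # Type is optional and falls back to the documented default.
        type      => defined $config->{Type} ? $config->{Type} : 'DB_File',
    };
    return bless $self, $class;
}

package main;

my $db = My::Database::Sketch->new(
    'rw', 'Metadata::SOIF', { File => '/tmp/harvest-sketch' },
);
print "type: $db->{type}\n";
```

No files are opened in this sketch; it only shows how the mode and the required File setting could be checked before any storage is touched.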

$data->store($obj)

Store a Harvest::Object object in the database. An object whose management data contains an MD5 hash will not be stored if an object with the same MD5 hash is already in the database.

Only the management and metadata sections of the object are stored.

NB: Children _must_ override this method.
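One way a child class might implement the duplicate check is sketched below. It is an in-memory illustration under stated assumptions: objects are plain hashes with url, manage and metadata keys, the MD5 hash lives at manage->{md5}, and the function name store_object and its 0/1 return values are hypothetical. The real Harvest::Object interface may differ.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical in-memory store; a real child would write to a DBM file.
my %by_url;   # URL      => stored record
my %by_md5;   # MD5 hash => URL of the object that first supplied it

sub store_object {
    my ($obj) = @_;
    my $md5 = $obj->{manage}{md5};

    # Skip the object if another one with the same MD5 is already stored.
    return 0 if defined $md5 && exists $by_md5{$md5};

    # Keep only the management and metadata sections, not the body.
    $by_url{ $obj->{url} } = {
        manage   => $obj->{manage},
        metadata => $obj->{metadata},
    };
    $by_md5{$md5} = $obj->{url} if defined $md5;
    return 1;
}

my $body = "<html>same content</html>";

my $first = store_object({
    url      => 'http://example.org/a',
    body     => $body,
    manage   => { md5 => md5_hex($body) },
    metadata => { title => 'A' },
});

my $second = store_object({
    url      => 'http://example.org/b',
    body     => $body,
    manage   => { md5 => md5_hex($body) },
    metadata => { title => 'B' },
});

print "first stored: $first, duplicate stored: $second\n";
```

Objects without an MD5 hash in their management data are always stored in this sketch, since there is nothing to compare them against.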

$data->manage($url)

Return the management structure for the object representing $url.

NB: Children _must_ override this method.

$data->fetch($url)

Return the Harvest::Object object for the given $url.

NB: Children _must_ override this method.

$data->delete($url)

Remove stored data for $url. Returns false if no data exists for the URL.

NB: Children _must_ override this method.

$data->exists($url)

Return true if the object exists in the database, false otherwise.

NB: Children _must_ override this method.

$data->foreach($proc, @args)

Call the function referenced by $proc with the arguments ($object,@args) once for each $object in the database.

$proc can terminate the calls early by returning undef.

NB: Do not add items to, or remove items from, the database inside your function.

NB: Children _must_ override this method.
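The calling convention for foreach can be illustrated with a small in-memory sketch. The function name db_foreach and the plain-hash objects are assumptions made for the example; a real child would iterate over its own storage.

```perl
use strict;
use warnings;

# Hypothetical in-memory database: URL => object (plain hashes here).
my %objects;
$objects{"http://example.org/$_"} = { url => "http://example.org/$_" }
    for 1 .. 5;

sub db_foreach {
    my ($proc, @args) = @_;

    # Iterate over a sorted snapshot of the keys so the order is stable.
    for my $url (sort keys %objects) {
        my $result = $proc->($objects{$url}, @args);
        last unless defined $result;   # undef from $proc stops the loop
    }
    return;
}

# The callback receives ($object, @args); here @args carries a limit.
my @seen;
db_foreach(
    sub {
        my ($object, $limit) = @_;
        push @seen, $object->{url};
        return @seen < $limit ? 1 : undef;
    },
    3,
);

print scalar(@seen), " objects visited\n";
```

Because only an undefined return value stops the loop, a callback that wants to continue has to return something defined, including 0 or an empty string.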

$data->expire($proc, @args)

Remove any objects which have expired (as determined by calling the outofdate method of Harvest::Object::Manage).
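A two-pass approach to expiry is sketched below. Everything here is an assumption made for illustration: My::Manage is a hypothetical stand-in for Harvest::Object::Manage, its outofdate criterion (a simple expiry timestamp) is invented, and the sketch leaves out $proc and @args because their role is not described above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Harvest::Object::Manage; the real outofdate
# method may use different criteria.
package My::Manage;

sub new { my ($class, %args) = @_; return bless { %args }, $class }

sub outofdate {
    my ($self) = @_;
    return $self->{expires} <= time() ? 1 : 0;
}

package main;

my $now = time();
my %objects = (
    'http://example.org/old' =>
        { manage => My::Manage->new(expires => $now - 60) },
    'http://example.org/new' =>
        { manage => My::Manage->new(expires => $now + 3600) },
);

sub expire_objects {
    # First pass: collect the expired URLs without touching the database.
    my @expired = grep { $objects{$_}{manage}->outofdate } keys %objects;

    # Second pass: remove them once iteration has finished.
    delete @objects{@expired};
    return scalar @expired;
}

my $removed = expire_objects();
print "removed $removed expired object(s)\n";
```

Collecting the URLs first and deleting afterwards avoids removing items from the database while it is being iterated.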