Handler
YorickC edited this page Jan 12, 2015 · 17 revisions
A handler is a Python class that plugs into the generic TimeGate to adapt it to whatever technique a web server uses to manage its Original Resources and Mementos. Its role is simple: given a URI-R, retrieve the list of URI-Ms along with their archival dates. It typically does so by connecting to an API.
- If no API is present: The list can be retrieved in many other ways, such as page scraping, rule-based logic, or even statically. Anything will do.
- If the history cannot be retrieved entirely: The handler can implement an alternative function that returns a single URI-M and its archival datetime, given both the URI-R and the datetime the user requested.
- If the TimeGate's algorithms for selecting the best Memento for a requested date do not apply to the system: Implementing the alternative function also bypasses these algorithms. This is particularly useful when there are performance concerns, special cases, or access restrictions on Mementos.
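As a rough illustration of the alternative function (named `get_memento` in the requirements on this page), the sketch below returns a single URI-M for a requested datetime. The `Handler` stand-in class, the `SingleMementoHandler` name, the `example.org` URIs, and the "latest snapshot not newer than the request" rule are all assumptions made for the example; a real handler would extend the TimeGate base class and query its own system instead.

```python
from datetime import datetime, timezone


class Handler:
    """Stand-in for the TimeGate base class so this sketch runs on its own."""


class SingleMementoHandler(Handler):
    # Hypothetical data standing in for a system that can only answer
    # "which version was current at this date?" for a given URI-R.
    _SNAPSHOTS = {
        "http://www.example.org/page": [
            ("http://archive.example.org/2013/page",
             datetime(2013, 5, 1, tzinfo=timezone.utc)),
            ("http://archive.example.org/2014/page",
             datetime(2014, 5, 1, tzinfo=timezone.utc)),
        ],
    }

    def get_memento(self, uri_r, requested_date):
        snapshots = self._SNAPSHOTS[uri_r]
        # Keep the snapshots archived at or before the requested date.
        candidates = [s for s in snapshots if s[1] <= requested_date]
        if candidates:
            uri_m, archived = max(candidates, key=lambda s: s[1])
        else:
            # The request predates every snapshot: fall back to the oldest.
            uri_m, archived = min(snapshots, key=lambda s: s[1])
        # Return one 2-tuple, with the date as an ISO 8601 string.
        return (uri_m, archived.isoformat())
```

Because the handler makes the selection itself, this is also the place where special cases or access restrictions could be applied before a Memento is returned.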
![code_architecture](https://raw.githubusercontent.com/mementoweb/timegate/master/doc/code_architecture.png)
A handler must meet the following requirements:
- It must be a Python file placed in the `core.handler` module (that is, the `core/handler/` folder), and it must be unique. If several classes are needed, consider adding the handler manually in the configuration file.
- It must extend the `core.handler_baseclass.Handler` base class.
- It must implement at least one of the following:
  - `get_all_mementos(uri_r)` class function: The TimeGate calls this function to retrieve the history of an original resource `uri_r`. The parameter `uri_r` is a Python string representing the requested URI-R. The return value must be a list of 2-tuples: `[(uri_m1, date1), (uri_m2, date2), ...]`. Each pair `(uri_m, date)` contains the URI of an archived version of R, `uri_m`, and the date at which it was archived, `date`.
  - `get_memento(uri_r, requested_date)` class function (alternative): The TimeGate calls this function to retrieve the best Memento for `uri_r` at the date `requested_date`. Use it if the API cannot return the entire history of a resource efficiently, or to bypass the TimeGate's best-Memento selection. The parameter `uri_r` is a Python string representing the requested URI-R. The parameter `requested_date` is a Python `datetime.datetime` object. In this case, the return value contains only one 2-tuple, `(uri_m, date)`, which is the best Memento the handler can provide given the limits of the API.
  - If both are implemented, `get_memento(uri_r, requested_date)` will always be used for TimeGate requests.
  - If the TimeMap advanced feature is enabled, `get_all_mementos(uri_r)` must be implemented.
- Input parameters:
  - All `uri_r` parameter values are Python strings representing the user's requested URI-R.
  - All `requested_date` parameter values are `datetime.datetime` objects representing the user's requested datetime.
- Output return values:
  - All `uri_m` return values must be strings.
  - All `date` return values must be strings representing dates. Prefer the ISO 8601 format for the dates.
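To make the requirements above concrete, here is a minimal sketch of a handler that implements `get_all_mementos(uri_r)` from a static list. It is not one of the handlers shipped with the TimeGate: the `StaticHandler` name, the `example.org` URIs, and the choice to return an empty list for an unknown URI-R are assumptions. The local `Handler` class is only a stand-in so the snippet runs by itself; inside the TimeGate it would be replaced by the `core.handler_baseclass.Handler` base class.

```python
class Handler:
    """Stand-in for core.handler_baseclass.Handler, for illustration only."""


class StaticHandler(Handler):
    # Hypothetical history: URI-R -> list of (URI-M, ISO 8601 date string).
    _HISTORY = {
        "http://www.example.org/page": [
            ("http://www.example.org/archive/page?v=1", "2014-03-01T12:00:00Z"),
            ("http://www.example.org/archive/page?v=2", "2014-09-15T08:30:00Z"),
        ],
    }

    def get_all_mementos(self, uri_r):
        # uri_r is a plain string; the result is a list of 2-tuples
        # (uri_m, date) in which both elements are strings.
        # Returning an empty list for unknown resources is an assumption.
        return list(self._HISTORY.get(uri_r, []))
```

Calling `StaticHandler().get_all_mementos("http://www.example.org/page")` yields the two `(uri_m, date)` pairs in the shape the TimeGate expects, with the dates already formatted as ISO 8601 strings.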
An example handler, `example.py`, is provided in `core/extension/` and can be edited to match your web server's requirements.
Other example handlers are provided for real-world APIs in `core/extensions_all/`:
- arXiv.org: `arxiv.py`
- wikipedia.org: `wikipedia.py`
- GitHub.com: `github.py`
Other example handlers that rely on scraping are provided for real-world resources without any API:
- Canadian Web Archives: `can.py`