Handler
YorickC edited this page Jan 7, 2015
·
17 revisions
![code_architecture](https://raw.githubusercontent.com/mementoweb/timegate/master/doc/code_architecture.png)
A handler is a Python file that typically talks to an API to get the list of archives, along with their dates, and returns it to the TimeGate. If no API is available, page scraping or even database queries can be used instead. Anything will do.
A handler must meet the following requirements:
- It must be a Python file placed in the `core.extension` module (which is the `core/extension/` folder), and it must be unique. If several classes are needed, consider adding the handler manually in the configuration file.
- It must extend the `core.handler.Handler` base class.
- It must implement the `get_all_mementos(self, uri_r)` method. The TimeGate calls this function to retrieve the history of an original resource `uri_r`. The return value must be a list of pairs: `[(uri_m1, date1), (uri_m2, date2), ...]`. Each pair `(uri_m, date)` contains the URI of an archived version of R, `uri_m`, and the date at which it was archived, `date`. All URI fields must be strings, and all date fields must be strings holding ISO 8601-formatted dates.
- If the API cannot return the entire history for a resource, the handler must implement the `get_memento(self, uri_r, date)` function. The TimeGate calls this function to retrieve the best Memento for `uri_r` at the date `date`. In this case, the return value contains only one pair, `(uri_m, date)`, which is the best Memento that the handler could provide given the limits of the API.
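
The requirements above can be illustrated with a rough sketch. It is not the `example.py` shipped with TimeGate: the stub `Handler` class stands in for `core.handler.Handler` so that the snippet runs on its own, and `FAKE_ARCHIVE`, `parse_date` and `ExampleHandler` are hypothetical names. The sketch is written for Python 3. It assumes that `date` arrives as an ISO 8601 string, that "best" means closest in time, and that `None` is returned when nothing is found; none of these are confirmed by this page.

```python
from datetime import datetime


class Handler:
    """Stand-in for core.handler.Handler so this sketch runs on its own.

    In a real TimeGate checkout you would instead write:
        from core.handler import Handler
    """


# Hypothetical data standing in for the response of an archive API.
FAKE_ARCHIVE = {
    "http://www.example.com/page": [
        ("http://archive.example.com/2013/page", "2013-05-01T12:00:00Z"),
        ("http://archive.example.com/2014/page", "2014-08-15T09:30:00Z"),
    ],
}


def parse_date(value):
    """Parse the ISO 8601 UTC timestamps used in this sketch."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


class ExampleHandler(Handler):

    def get_all_mementos(self, uri_r):
        # Full history as a list of (uri_m, date) pairs, all strings.
        return list(FAKE_ARCHIVE.get(uri_r, []))

    def get_memento(self, uri_r, date):
        # Only needed when the API cannot list the whole history.
        # Assumption: "best" means closest in time to the requested date.
        candidates = FAKE_ARCHIVE.get(uri_r, [])
        if not candidates:
            # Assumption: this page does not say what to return here.
            return None
        target = parse_date(date)
        return min(candidates,
                   key=lambda pair: abs(parse_date(pair[1]) - target))
```

A real handler would replace the `FAKE_ARCHIVE` lookups with calls to the archive's API and convert whatever date format that API returns into ISO 8601 strings.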
An example handler, `example.py`, is provided in `core/extension/` and can be edited to match your web server's requirements.
Other handler examples for real-world APIs are provided in `core/extensions_all/`:
- arXiv.org: `arxiv.py`
- wikipedia.org: `wikipedia.py`
- GitHub.com: `github.py`
Other example handlers that rely on scraping are provided for real-world resources without any API:
- Canadian Web Archives: `can.py`
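
To give a feel for the scraping approach, here is a minimal Python 3 sketch that uses only the standard library. It does not reproduce `can.py`: the HTML snippet, the `data-date` attribute, and the names `MementoLinkParser` and `scrape_mementos` are all invented for illustration. A real handler would first download the page and would need parsing rules matched to the actual markup of the archive.

```python
from html.parser import HTMLParser

# Hypothetical listing page; a real handler would download this HTML first.
SAMPLE_HTML = """
<ul>
  <li><a href="http://archive.example.org/20130501/page"
         data-date="2013-05-01T12:00:00Z">May 2013</a></li>
  <li><a href="http://archive.example.org/20140815/page"
         data-date="2014-08-15T09:30:00Z">Aug 2014</a></li>
  <li><a href="http://archive.example.org/about">About</a></li>
</ul>
"""


class MementoLinkParser(HTMLParser):
    """Collect (uri_m, date) pairs from <a> tags carrying a data-date attribute."""

    def __init__(self):
        super().__init__()
        self.mementos = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        date = attributes.get("data-date")
        # Links without a date (navigation, "About", ...) are skipped.
        if href and date:
            self.mementos.append((href, date))


def scrape_mementos(html):
    """Return the list of (uri_m, date) string pairs found in the HTML."""
    parser = MementoLinkParser()
    parser.feed(html)
    parser.close()
    return parser.mementos
```

The list returned by `scrape_mementos` has the same shape that `get_all_mementos(self, uri_r)` is expected to return, so a scraping handler could return it directly from that method.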