Skip to content

mexiCoders/links-tree

 
 

Repository files navigation

Links tree

Project created for a job application, requirements are described below.

Requeriments

As a product owner I would like to have a web crawler application that takes a given site (e.g. informador.com.mx) and lists the links contained there hierarchically; links contained in a second level page should also be included. This is to be replicated ad infinitum as long as the links i within the original domain.

Acceptance criteria

  • List item
  • Links are listed in a tree structure
  • Webcrawler updates the page regularly while still looking for new links
  • Links listed should only be links that point to a different page than the one that it lives on (e.g. anchor links that point to a given section within the same page are to be excluded)
  • Any element that has links that point to a different page should also be included (e.g. images with links)
  • Only list links to external sites as a one shot, the crawler should not list secondary links within that external site (e.g. if a link from informador.com.mx points to espn.com list this link in the hierarchy, but don't show more links under espn.com )

Technical requirements

  • The web crawler should be constructed using node.js
  • The output in the webpage should be a json that nests the links showing hierarchy
  • The tree for the links should contain real time data

Acceptable if the page has to be refreshed when new links are found, extra points if the refresh happens without user interaction.

Notes

The server fetchs the target site every 5 minuts. You should wait until the server cache for the first time. It follow links at second level.

Setup

Clone the repository and install the dependencies.

$ git clone https://github.com/abiee/links-tree.git
$ cd links-tree
$ npm install
$ bower install
$ gulp serve

Open your browser and use the url http://localhost:9000/ to see the project working. Do not forget to install globally gulp and bower if not installed yet.

Testing

You can run server tests with test:server

$ gulp test:server

Licence

Licensed under the MIT license.

About

Creates a tree of links for a given site

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • CSS 40.8%
  • JavaScript 38.6%
  • HTML 19.9%
  • Handlebars 0.7%