alakel/MyWebIntelligence

MyWebIntelligence

The open-source platform MyWebIntelligence ('MyWI' for short), produced by the MICA laboratory as part of the Institute of Digital Humanities, aims to provide a strategic tool for analyzing and understanding communication on the Internet. It is a new-generation "crawler" that, starting from a keyword dictionary, builds a database of qualified web pages in the service of strategic intelligence. It draws not only on numerous external data sources but also on the latest data classification algorithms (ANS TextAnalysis, etc.).

My Web Intelligence provides the means to capture, qualify and prioritize a considerable mass of discourse in order to map the universe of discourse around your interests. This makes it possible not only to study online discourse in real time but also to better understand heterogeneous actors through their arguments and strategies.

Architecture

For now, everything lives in this repo. Eventually, parts will be separated into their own repos (perhaps some parts will even be released as npm modules).

MyWI will be a project that can be installed on a server (dedicated machine) and accessed via a web interface. MyWI needs crawling capabilities: it has to send HTTP requests continuously and throttle them (to be a good web citizen and avoid being blocked). It also needs storage capabilities. Together, these requirements make it hard to build MyWI as a browser add-on.
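As an illustration of the throttling requirement, here is a minimal sketch of a per-host rate limiter. It is not taken from the MyWI codebase: the names `createHostThrottle` and `reserveDelay`, the one-second default interval, and the injectable `now` clock are all assumptions made for this example.

```javascript
// Hypothetical helper, not part of MyWI: spaces out requests to the same host.
// `minIntervalMs` is the assumed minimum gap between two requests to one host;
// `now` is an injectable clock (defaults to Date.now) to keep the logic testable.
function createHostThrottle(minIntervalMs = 1000, now = Date.now) {
  // hostname -> earliest timestamp at which the next request may start
  const nextSlotByHost = new Map();

  // Returns how many milliseconds to wait before requesting `url`, and
  // reserves that slot so later calls for the same host queue behind it.
  return function reserveDelay(url) {
    const host = new URL(url).hostname;
    const current = now();
    const slot = Math.max(current, nextSlotByHost.get(host) || 0);
    nextSlotByHost.set(host, slot + minIntervalMs);
    return slot - current;
  };
}

// Example wiring (the request function itself is left to the caller):
// const reserveDelay = createHostThrottle(1000);
// setTimeout(() => sendRequest(url), reserveDelay(url));
```

Keeping one queue per host means a slow or strict site does not hold back crawling of the others. A real crawler would likely also honour `robots.txt` and back off on errors, which this sketch does not attempt.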

Server-side

Off the top of our heads, a few server-side components will be needed:

  • user/project management
  • expression domain resolution

User/Project management

  • The database will be a PostgreSQL database. We would have loved to use a graph database like Titan, but graph databases have been ruled out for now because of our lack of experience with them and possibly a lack of tooling around them (and of resources to build this tooling ourselves).
  • Express

Client-side

Client-side is built with

Tooling

Testing

Project organisation

TODO figure out and document relationship with Trello

Licence

MIT

About

A tool to understand the web around you