The open-source platform MyWebIntelligence ("MyWI" for short), produced by the MICA laboratory as part of the Institute of Digital Humanities, aims to provide a strategic tool for analyzing and understanding communication on the Internet. It is a new-generation "crawler" which, starting from a keyword dictionary, builds a database of qualified web pages in the service of strategic intelligence. It uses not only numerous external data sources but also the latest data classification algorithms (ANS TextAnalysis, etc.).
My Web Intelligence can provide the means to capture, qualify and prioritize a considerable mass of discourse in order to map the universe of discourse around your interests. This makes it possible not only to study online discourse in real time but also to better understand the heterogeneous actors through their arguments and strategies.
For now, everything will belong in this repo. Eventually, parts will be separated into their own repos (perhaps some parts will even be released as NPM modules).
MyWI will be a project that can be installed on a server (dedicated machine) and accessed via a web interface. MyWI needs crawling capabilities: it must send HTTP requests continuously and throttle them (to be a good web citizen and not get blocked). It also needs storage capabilities. Together, these requirements make it hard to build MyWI as a browser addon.
Off the top of our heads, a few server-side components will be needed:
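To illustrate the throttling requirement, here is a minimal TypeScript sketch of a per-host throttle. It is not MyWI's actual crawler code: the names `HostThrottle`, `reserve` and `throttled`, as well as the one-second delay, are assumptions chosen for the example. The idea is simply to remember, for each host, the earliest time at which the next request may be sent.

```typescript
// Hypothetical per-host throttle; none of these names come from the MyWI code base.
class HostThrottle {
    // Minimum delay (ms) between two requests to the same host.
    private readonly minDelayMs: number;
    // Earliest timestamp (ms) at which each host may be requested again.
    private readonly nextAllowed = new Map<string, number>();

    constructor(minDelayMs: number) {
        this.minDelayMs = minDelayMs;
    }

    // Reserve the next slot for the host of `url` and return how long
    // the caller should wait (ms) before actually sending the request.
    reserve(url: string, now: number = Date.now()): number {
        const host = new URL(url).hostname;
        const earliest = Math.max(now, this.nextAllowed.get(host) ?? now);
        this.nextAllowed.set(host, earliest + this.minDelayMs);
        return earliest - now;
    }
}

const sleep = (ms: number): Promise<void> =>
    new Promise<void>((resolve) => setTimeout(resolve, ms));

// Wait for the reserved slot, then call `send`, which stands in for
// whatever HTTP client the crawler ends up using.
async function throttled<T>(
    throttle: HostThrottle,
    url: string,
    send: (url: string) => Promise<T>
): Promise<T> {
    await sleep(throttle.reserve(url));
    return send(url);
}
```

With `new HostThrottle(1000)`, two back-to-back requests to the same host are spaced at least one second apart, while requests to different hosts are not delayed by each other. A real crawler would probably also want to honor `robots.txt` and back off on errors, which this sketch leaves out.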
- user/project management
- expression domain resolution
- The database will be a PostgreSQL database. We would have loved to use a graph database like Titan, but graph databases have been ruled out for now because of our lack of experience with them and perhaps a lack of tooling around them (and of resources to build this tooling ourselves).
- Express
The client side is built with:
- React (without JSX)
- Browserify (+ tsify)
- (TypeScript)
- (ESLint)
- (Docker)
- (CasperJS)
TODO figure out and document relationship with Trello