bnlcas/ArxivStudy

ArxivStudy

This repository contains code for a computational-linguistic analysis of semantic patterns in the abstracts of scientific publications on arxiv.org.

The script 'Arxiv Scraper.py' scrapes abstracts from articles published on arxiv.org to build a linguistic corpus. (Please feel free to message me for a copy of this corpus.) The scripts 'LSA_make_term_doc_topic.py' and 'LSA_make_term_doc.py' build a term-document matrix and then run a singular value decomposition (SVD) to project the abstracts into a low-dimensional semantic space, which is then used for various methods of cluster analysis.
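As a rough illustration of the term-document and SVD steps described above, here is a minimal sketch using NumPy. It is not the code in 'LSA_make_term_doc.py': the function names, the whitespace tokenizer, the raw-count weighting, and the four toy abstracts are all assumptions made for the example.

```python
import numpy as np


def build_term_doc_matrix(docs):
    """Return (vocab, matrix), where matrix[i, j] counts term i in document j.

    Assumption: lowercase whitespace tokenization and raw counts; a real
    pipeline would likely remove stop words and apply a weighting such as TF-IDF.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for tok in toks:
            matrix[index[tok], j] += 1
    return vocab, matrix


def project_documents(matrix, k):
    """Project documents onto the top-k singular directions (truncated SVD)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Row j of the result is document j expressed in the k-dimensional space.
    return (s[:k, None] * vt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy "abstracts", not taken from the scraped corpus.
abstracts = [
    "quantum field theory",
    "quantum gravity theory",
    "neural network training",
    "deep neural network",
]
vocab, term_doc = build_term_doc_matrix(abstracts)
doc_vectors = project_documents(term_doc, k=2)
```

In this toy corpus, `cosine(doc_vectors[0], doc_vectors[1])` is much larger than `cosine(doc_vectors[0], doc_vectors[2])`, since the first two abstracts share vocabulary and the first and third do not.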

The results are in the 'Figures' folder. For reference, a sample hierarchical clustering is shown in the figure 'Research Areas Dendrogram'.
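For readers unfamiliar with hierarchical clustering, the sketch below shows one way such a clustering could be computed with SciPy from low-dimensional document vectors. The vectors, the average-linkage method, and the cosine metric are assumptions chosen for illustration; the README does not state which settings produced the figures.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical 2-D document vectors, standing in for LSA projections.
doc_vectors = np.array([
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
])

# Build the merge tree (assumed settings: average linkage, cosine distance).
tree = linkage(doc_vectors, method="average", metric="cosine")

# Cut the tree into at most two flat clusters.
clusters = fcluster(tree, t=2, criterion="maxclust")
```

The `tree` array can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a dendrogram of the merges.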
