

juliencoquet/pdfGoogleAnalyticsLinkTracking


pdfGoogleAnalyticsLinkTracking

A set of scripts that extract the hyperlinks from a PDF file and check each one for URL tracking tags, such as the UTM parameters used by Google Analytics.
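To illustrate what checking a link for tracking tags can look like, here is a minimal sketch using only the Python standard library. It is not the repository's own code: the function name missing_utm_params and the list of required parameters are assumptions made for this example, and the scripts may check a different set of tags.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical list of tags to look for; the actual scripts may use another set.
UTM_PARAMS = ("utm_source", "utm_medium", "utm_campaign")


def missing_utm_params(url, required=UTM_PARAMS):
    """Return the required UTM parameters that are absent from the URL.

    parse_qs drops parameters with blank values by default, so a tag such as
    "utm_source=" with no value is reported as missing.
    """
    query = parse_qs(urlsplit(url).query)
    return [name for name in required if name not in query]


if __name__ == "__main__":
    for link in (
        "https://example.com/?utm_source=pdf&utm_medium=doc&utm_campaign=x",
        "https://example.com/page",
    ):
        print(link, missing_utm_params(link))
```

A link is fully tagged when the returned list is empty. Whether a partially tagged link should count as a failure is a choice left to whoever adapts the check.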

Usage

From the command line, run python pdf_workflow.py pdfname.pdf, replacing pdfname.pdf with the path to the PDF file you want to check.

Requirements

See requirements.txt for the required modules (pdfx, pdfminer). They can be installed with pip install -r requirements.txt.

Adaptations & TODO:

  • Adapt the script to handle multiple PDF files in one run
  • Offer a web app version where PDF files can be uploaded and processed
  • Adjust the output: it is currently produced in list form, but the script can easily be modified to suit your needs
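As one possible direction for the multi-file and output items above, the sketch below writes a CSV report with one row per link. It is an assumption-laden illustration rather than part of the repository: write_report, has_utm_tags, and the links_by_pdf mapping (PDF file name to a list of already extracted URLs) are hypothetical names, and the link extraction step itself is left out.

```python
import csv
import io
from urllib.parse import urlsplit, parse_qs


def has_utm_tags(url):
    """Return True if the query string contains at least one utm_* parameter."""
    query = parse_qs(urlsplit(url).query)
    return any(name.startswith("utm_") for name in query)


def write_report(links_by_pdf, out):
    """Write one CSV row per (PDF, link) pair to the file-like object `out`.

    `links_by_pdf` is a hypothetical mapping of PDF file name to the list of
    URLs already extracted from that file.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["pdf", "url", "tagged"])
    for pdf_name, urls in links_by_pdf.items():
        for url in urls:
            writer.writerow([pdf_name, url, "yes" if has_utm_tags(url) else "no"])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_report(
        {"a.pdf": ["https://example.com/?utm_source=pdf", "https://example.com/"]},
        buffer,
    )
    print(buffer.getvalue(), end="")
```

Passing an open file instead of the in-memory buffer would save the report to disk. CSV is just one option here; the same loop could emit JSON or plain text.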
