The "ghostEmailExtractor" script is a command-line tool that helps businesses extract email addresses from text-based documents. It reads an input file, identifies email addresses using regular expressions, and can optionally filter the results by domain extension or exclude specific addresses. It also removes duplicate entries, sorts the final list alphabetically, and saves the results as a CSV, Excel, or text file. The script prompts for each input in turn, so no programming knowledge is needed to organize contact information with it.
-
Email Extraction: Extracts email addresses from a specified input file.
-
Input Validation: Checks if the input file exists before processing.
-
Logging: Logs errors to a file named 'error.txt'.
-
Data Storage: Uses a pandas DataFrame to store the extracted emails.
-
Filtering by Domain Extensions: Optionally filters emails by specified domain extensions.
-
Exclusion of Specific Emails: Optionally excludes specific email addresses.
-
Duplicate Removal: Removes duplicate email addresses.
-
Sorting: Sorts the final list of emails alphabetically.
-
Output File Formats: Saves the extracted emails to CSV, Excel, or text files.
-
User Interaction: Prompts the user for input, including file names and optional parameters.
-
User Feedback: Provides informative messages to the user about the extraction process.
-
Exception Handling: Handles unexpected errors and displays error messages.
-
User-Friendly Prompts: Presents clear and user-friendly prompts for input.
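
The extraction, filtering, exclusion, duplicate-removal, and sorting features listed above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the script's actual code: the function name `extract_emails`, its parameters, and the regular expression are hypothetical, and lowercasing addresses before de-duplication is an assumption. The script itself stores results in a pandas DataFrame, which this sketch replaces with a set for brevity.

```python
import re

# Assumed pattern: a simplified email matcher, not the script's actual regex.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text, extensions=None, exclude=None):
    """Return a sorted, de-duplicated list of email addresses found in text."""
    # A set drops duplicates; lowercasing (an assumption) makes that case-insensitive.
    emails = {match.lower() for match in EMAIL_PATTERN.findall(text)}
    if extensions:
        # Keep only addresses ending in one of the given domain extensions.
        suffixes = tuple(ext.lower() for ext in extensions)
        emails = {e for e in emails if e.endswith(suffixes)}
    if exclude:
        # Remove addresses the user asked to exclude.
        emails -= {e.lower() for e in exclude}
    return sorted(emails)


sample = "Contact alice@example.com, BOB@example.org or alice@example.com; spam@test.net."
result = extract_emails(sample, extensions=[".com", ".org"], exclude=["bob@example.org"])
print(result)
```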
-
To run this script, use this command (without the quotation marks) ==> "python3 ghostEmailExtractor.py"
-
Ensure that the email address list begins with ==> "Email" (without the quotation marks), placed before your email addresses, to avoid errors.
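
Why the "Email" heading matters can be illustrated with a short sketch. It assumes the list is a one-column file whose first line is treated as the column name and whose rows are then looked up by that name; the source does not say how the script reads the list, so this is only one plausible explanation. The sample addresses are made up, and the standard `csv` module stands in for whatever reader the script uses.

```python
import csv
import io

# Hypothetical list contents: the first line is the "Email" heading.
list_text = "Email\nalice@example.com\nbob@example.org\n"

# DictReader takes the first line as the column name, so rows are looked up by "Email".
reader = csv.DictReader(io.StringIO(list_text))
addresses = [row["Email"] for row in reader]
print(addresses)
```

Under this assumption, a list without the heading would have its first address consumed as the column name, and a lookup by "Email" would fail.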