To run the code for this project, first clone the repository to your machine or server (using the following code: git clone url). Cloning lets you edit the repository's files locally; the originals on GitHub do not change until you push your changes back. The specific file that we're using is called ClassifyingGalaxies.py. To get the most recent version of this file, run git pull from inside the cloned repository (this downloads the latest updates to the files and puts them on your machine or server).
To get the data for this project from Kaggle, you first need a Kaggle account (use the Kaggle link below to get to the website). Once you have an account, go to the Galaxy Zoo competition page, which has all the details about the competition, and click on the Data tab. This tab lists all of the files available for the project. Download the images test, images training, and training solutions files; these are the files our code uses.
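Once the training solutions file is unzipped, a quick way to check the download is to read it with Python's standard csv module. The snippet below is only a minimal sketch, not the code in ClassifyingGalaxies.py: the function name load_solutions is hypothetical, and it assumes the first column holds the galaxy ID and every other column holds a class probability.

```python
import csv


def load_solutions(csv_path):
    """Read a training-solutions CSV into (class_names, {galaxy_id: probabilities}).

    Assumption: the first column is the galaxy ID and every remaining
    column is a probability for one class.
    """
    solutions = {}
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # column names: ID first, then the classes
        for row in reader:
            if not row:  # skip blank lines
                continue
            solutions[row[0]] = [float(value) for value in row[1:]]
    return header[1:], solutions
```

Calling load_solutions on the path to your unzipped csv file returns the class column names and a dictionary keyed by galaxy ID, which makes it easy to confirm that the file downloaded completely.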
Please pull the code (ClassifyingGalaxies.py, capital C) again and move your training solutions csv file from the zip to the Downloads folder. Then run the code. The code currently on GitHub only goes through images 3000-5000, because we're not sure how much memory it takes to run more. Once the run is finished, please let us know and we will change the code so that it runs the next batch of images. It is important that we change the code between every run: the image training files, the image test files, and the saved file name all need to be updated.
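The batch-by-batch workflow described above could also be driven by parameters instead of hand edits. The following is a rough sketch under stated assumptions, not the logic in ClassifyingGalaxies.py: the helper names (batch_ranges, select_batch, output_name) are hypothetical, ranges are treated as half-open (start included, stop excluded), and images are ordered by sorting their file names.

```python
def batch_ranges(total, size):
    """Yield (start, stop) pairs that cover range(total) in chunks of `size`."""
    for start in range(0, total, size):
        yield start, min(start + size, total)


def select_batch(image_names, start, stop):
    """Return the images at positions start..stop-1 after sorting by name."""
    return sorted(image_names)[start:stop]


def output_name(prefix, start, stop):
    """Build a distinct saved-file name per batch so runs do not overwrite each other."""
    return f"{prefix}_{start}_{stop}.csv"
```

With helpers like these, moving to the next batch means changing only the start and stop values, and the saved file name follows automatically.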
This project uses Galaxy Zoo data to write an algorithm that classifies galaxies. First, learn about galaxy classification on Galaxy Zoo. Next, try to write a classification algorithm for galaxies using the data and format available at Galaxy Zoo, the Galaxy Challenge. Finally, submit your results to the challenge.
"This competition asks you to analyze the JPG images of galaxies to find automated metrics that reproduce the probability distributions derived from human classifications. For each galaxy, determine the probability that it belongs in a particular class. Can you write an algorithm that behaves as well as the crowd does?" --Galaxy Zoo, the Galaxy Challenge 1
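To make "behaves as well as the crowd" concrete, predicted probabilities can be compared with the crowd's probabilities using an error score. The sketch below assumes root-mean-square error as that score (check the competition's Evaluation page for the official metric) and uses a deliberately simple baseline that predicts the average crowd probability for every galaxy. The function names column_means and rmse are illustrative only.

```python
import math


def column_means(rows):
    """Average each class column over all galaxies (a naive baseline prediction)."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def rmse(predicted, actual):
    """Root-mean-square error over every (galaxy, class) probability pair."""
    total = 0.0
    pairs = 0
    for predicted_row, actual_row in zip(predicted, actual):
        for p, a in zip(predicted_row, actual_row):
            total += (p - a) ** 2
            pairs += 1
    return math.sqrt(total / pairs)
```

Any real classifier should score lower than this baseline on held-out training galaxies; if it does not, the image features are not yet carrying useful information.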
- http://zoo1.galaxyzoo.org/Tutorial.aspx
- https://www.galaxyzoo.org/#/classify
- https://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge
- https://data.galaxyzoo.org/
1 Note: the deadline for the challenge's monetary prize has passed, but with the skills you learn doing this project you can eventually enter another challenge that offers prize money.
For an even more advanced version of this final project, use one of the data sets on http://www.astro.caltech.edu/~srk/Ay3/Ay03Main.html to make a new discovery!