📅 22 Jan 2021
I was hoping to share some code with this post, but alas, that will have to wait for a later date. I can’t seem to get it formatted the way I want and screenshots of it look terrible.
But I can at least explain what I am doing and the thought process I'm using to approach the task.
As an aspiring data analyst/data scientist, I am constantly looking for new data sets to work with. I received feedback early on in my job hunt that I needed to work with larger data sets. These are readily available on websites like Kaggle.
However, being the type of person that I am, I wanted to work with some data that not a lot of people were working with. As I started searching, I remembered a rather unique article I ran across in the New Yorker.
I sent a short email to the author of the Murder Accountability Project (MAP) asking for permission to use the data for some personal projects. Shortly thereafter, I got a reply - and an offer. The algorithm that the MAP relied on for serial killer detection was originally written in SPSS. However, moving forward, the author wanted to have a working version in R. After a short conversation, I agreed to take on this challenge.
Now, I barely knew what SPSS was at the time. I definitely didn’t know how to read it. But several days later, I started the task of re-writing the algorithm in R.
Since I am currently a student and don’t really have a set schedule, I find that I work best on something like this if I spend 10-20 minute chunks of time on it throughout the day. I try to do these chunks 2-3 times per day. So, my normal process could be described as follows:
- Read what I did yesterday
- Try to interpret the SPSS code (I email the author if I can't figure it out, after trying a search engine of course)
- Re-write it in R, which is sometimes easy and other times ends up being very different from the spirit of the SPSS code. I often revisit it the next day and re-write or modify it accordingly.
- Run the code and fix any errors.
- Compare my output to the SPSS output. Make changes as necessary.
- Repeat
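This is not the project's code, just a generic illustration of the "compare my output" step above. It is a minimal sketch written in Python rather than R, and everything in it is made up for the example: the `compare_outputs` helper, the column names (`group`, `cases`, `rate`), and the sample rows standing in for an SPSS export and the re-written version's output.

```python
import math


def compare_outputs(reference, candidate, key, tol=1e-9):
    """Return readable differences between two lists of row dicts.

    `reference` stands in for the original tool's output and `candidate`
    for the re-written version's output (both hypothetical here).
    """
    ref_by_key = {row[key]: row for row in reference}
    cand_by_key = {row[key]: row for row in candidate}
    diffs = []
    for k in sorted(ref_by_key.keys() | cand_by_key.keys()):
        if k not in cand_by_key:
            diffs.append(f"{k}: missing from new output")
            continue
        if k not in ref_by_key:
            diffs.append(f"{k}: unexpected row in new output")
            continue
        ref_row, cand_row = ref_by_key[k], cand_by_key[k]
        for col in sorted(ref_row.keys() | cand_row.keys()):
            a, b = ref_row.get(col), cand_row.get(col)
            both_numeric = isinstance(a, (int, float)) and isinstance(b, (int, float))
            # Numbers are compared with a tolerance, everything else exactly.
            same = math.isclose(a, b, abs_tol=tol) if both_numeric else a == b
            if not same:
                diffs.append(f"{k}.{col}: reference={a!r}, new={b!r}")
    return diffs


# Made-up rows; not real data from any project.
spss_rows = [
    {"group": "A", "cases": 120, "rate": 0.25},
    {"group": "B", "cases": 80, "rate": 0.5},
]
new_rows = [
    {"group": "A", "cases": 120, "rate": 0.25},
    {"group": "B", "cases": 81, "rate": 0.5},
]

for line in compare_outputs(spss_rows, new_rows, key="group"):
    print(line)
```

Listing each mismatch by row and column, rather than just reporting "equal" or "not equal", makes it easier to trace a difference back to the step that produced it.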
Overall, I have found this process incredibly rewarding. I am getting close to the end of the project and hope to finish it in a few weeks. I could probably finish it in a day or two if I really buckled down but I honestly don’t think that would be as fun or rewarding. Plus, I have other things on my plate.
Hopefully in the future I can actually share some code and give a clearer idea of what I’ve been doing. Until then, I hope this was interesting.
Day 4: #100DaysToOffload