Rich

About

As a statistics and economics major, I have spent a lot of time analyzing data and thinking critically about problems. I do most of this analysis in Python and R. My data projects include decomposing and forecasting time series, building interactive web applications in R, and using several types of machine learning to predict events.

One example is FourKube, a stock-prediction side project that a colleague and I started back in college. On AWS, we use Lambda functions to scrape data from the web, send it through our ETL pipeline, and then make predictions. The Lambdas run on cron schedules, so the entire process executes automatically from end to end.

Here is a high-level view of how this process works. It starts with data being scraped from multiple websites using the Python libraries pandas, NumPy, and Beautiful Soup (BS4). Next, sentiment analysis is run on the news articles and tweets using natural language processing. The data is then cleaned, and categorical variables are one-hot encoded so that our model can use them. Finally, features for our Random Forest classifier are selected by Gini feature importance. I enjoy this work and have a strong interest in data science and engineering.
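To make the last two steps concrete, here is a minimal sketch in plain Python. It is not FourKube's actual code: the toy rows, the column names, and the helper functions one_hot, gini, and gini_decrease are all made up for illustration. It one-hot encodes two categorical columns and then scores each resulting 0/1 feature by how much a single split on it reduces Gini impurity. A real Random Forest averages this kind of decrease over many trees and splits (for example, what scikit-learn exposes as feature_importances_), so treat this as the core idea only.

```python
from collections import Counter


def one_hot(rows, column):
    """Replace one categorical column with a 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(rows, labels, feature):
    """Impurity decrease from one split on a binary (0/1) feature."""
    left = [y for row, y in zip(rows, labels) if row[feature] == 0]
    right = [y for row, y in zip(rows, labels) if row[feature] == 1]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data: the label follows sentiment and ignores sector.
rows = [
    {"sentiment": "positive", "sector": "tech"},
    {"sentiment": "positive", "sector": "energy"},
    {"sentiment": "negative", "sector": "tech"},
    {"sentiment": "negative", "sector": "energy"},
]
labels = ["up", "up", "down", "down"]

encoded = one_hot(one_hot(rows, "sentiment"), "sector")
scores = {f: gini_decrease(encoded, labels, f) for f in encoded[0]}
ranked = sorted(scores, key=scores.get, reverse=True)

for feature in ranked:
    print(f"{feature}: {scores[feature]:.2f}")
```

In this toy data the sentiment columns separate the labels perfectly and the sector columns not at all, so the sentiment features rank first and would be the ones kept.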

If you would like to collaborate, don't hesitate to reach out!

