
Sam Kenkel

Data Science, Machine Learning, DevOps, CCNA, ACSR

Anime_Rec4: Predicting User Scores with Neural Nets

Part 1 of this series explained why I was making an Anime recommendation system and gave a brief overview of my approach. Part 2 explained how I got my data, and Part 3 explained how I tuned my 3 item-item similarity models to generate 'possible' recs. In this part, I'll talk about predicting user scores with neural nets. Why 3 neural nets? Ensembling by targeting different scores: there are 3 pre-trained neural nets, each trained to predict one type of score: Score, User-Scaled Score, or Anime-Scaled Score. The nets are loaded […]
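The ensembling idea described above can be sketched as follows: each net predicts the same rating on a different target scale, so the scaled predictions are mapped back to the raw scale before averaging. This is a minimal sketch under my own naming; the actual nets, scalings, and combination weights are the post's, not shown here.

```python
import numpy as np

def ensemble_predictions(raw_pred, user_scaled_pred, anime_scaled_pred,
                         user_mean, anime_mean):
    """Combine three predictions of the same rating, each on a different scale.

    raw_pred:          predicted score on the raw 1-10 scale
    user_scaled_pred:  predicted (score - user's mean score)
    anime_scaled_pred: predicted (score - anime's mean score)
    """
    from_user = user_scaled_pred + user_mean     # undo the user centering
    from_anime = anime_scaled_pred + anime_mean  # undo the anime centering
    # Simple unweighted average of the three back-transformed predictions
    return np.mean([raw_pred, from_user, from_anime], axis=0)
```

For example, `ensemble_predictions(7.0, -0.5, 1.0, user_mean=7.5, anime_mean=6.5)` averages 7.0, 7.0, and 7.5.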

Anime_Rec3: Generating possible recommendations (Cosine Similarity methods)

The intro to this series explained why I was making an Anime recommendation system, and Part 1 gave a brief overview of my approach. Part 2 explained how I got my data. In this part I will explore how I tuned my three different methods for determining item-item similarity. Method 1: item-item similarity based on user scores. Anime_Score_Sim in my GitHub shows the code for this. First, all 0's (i.e., statuses without a score) are dropped. Next, I find the average score for each user and subtract it from each of that user's scores. This is to […]
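The mean-centering step described above can be sketched like this. The function name and matrix layout are my own for illustration, not necessarily the post's code, and I've used cosine similarity on the centered matrix as the similarity measure the series title names:

```python
import numpy as np

def item_similarity(ratings):
    """ratings: users x items array where 0 means 'not scored'.

    Drops the zeros, centers each user's scores by that user's own mean,
    then returns the item-item cosine similarity of the centered matrix.
    """
    r = np.asarray(ratings, dtype=float)
    rated = r > 0                                  # 0 = no score, so drop it
    counts = rated.sum(axis=1)
    user_mean = r.sum(axis=1) / np.maximum(counts, 1)        # per-user average
    centered = np.where(rated, r - user_mean[:, None], 0.0)  # unrated -> 0
    norms = np.linalg.norm(centered, axis=0)
    norms[norms == 0] = 1.0                        # avoid division by zero
    return (centered.T @ centered) / np.outer(norms, norms)
```

Centering by each user's mean compensates for raters who score everything high or low, so similarity reflects relative preference rather than rating style.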

Anime_Rec2: Data Collection, EDA

The intro to these posts explained why I was making an Anime recommendation system, and Part 1 gave a brief overview of my approach. In the next part I will start to explore how I tuned my item-item similarity models, but before diving into that I wanted to go through my data collection process and the initial analysis that helped guide me. Every project like this starts with the data, and since "Data is the New Oil" it's always worth going past the platitudes to figure out where my data came from […]

Setting up an Nvidia-Docker workstation for DataScience/DeepLearning

After deciding, in my previous post, to switch my z620 to an Nvidia-Docker workstation, I wanted to write up exactly how I did that, because some of the specific technical steps (such as disabling a graphics card in the BIOS to install the Nvidia driver) aren't all documented in one place. Part 1: HW setup. First I open up my z620 and remove the quad-port NICs that I'm no longer going to use. The z620 has two 'compartments' inside the case: the PCI Express ports sit on one side of a partition, and […]

Designing a DeepLearning Homelab: Cloud vs Virtualization vs Docker

As a Data Scientist coming from the Networking and DevOps world, I'm a firm believer in the homelab philosophy: the best way to learn things is to experiment and build within a lab environment. This was crucial to getting my CCNA, and it's how I learned virtualization as well. Now that I'm transitioning into Data Science and Machine Learning, I recently updated my lab. In this post I explain the specific HW and SW setup I followed to convert my z620 into an Nvidia-Docker workstation. I wanted to go through my thought process in how I decided […]

Lol_Scout 3: Final Modelling and Results

Background: summary of the previous posts. In the 5v5 videogame/esport League of Legends, two teams of 5 players compete against each other. I have gathered data using the Riot Games API, and I'm trying to use machine learning to predict wins or losses based on the characters (champions) that players choose, and those players' skill/practice with those champions. This is the 3rd of 3 blog posts about my process and discoveries working with data from Riot's online game, League of Legends. The code I wrote for initial sanity-check modelling work can be found here. The feature […]

Lol_Scout 2: Feature Engineering, initial Modelling

This is the 2nd of 3 blog posts about my process and discoveries working with data from Riot's online game, League of Legends. This post is a technical writeup of the code I used for my initial 'baseline' modelling, and of my data preparation (and imputation) code. The code I wrote for initial sanity-check modelling work can be found here. The feature engineering/data prep code is here. The code for my 'final' models is here. Background: summary of the previous post. In the 5v5 videogame/esport League of Legends, two teams of 5 players compete against each other. […]

Lol_Scout 1: Data Collection

This is the 1st of 3 blog posts about my process and discoveries working with data from Riot's online game, League of Legends. The code I wrote to do this can be found here. Background: project purpose. In the 5v5 videogame/esport League of Legends, two teams of 5 players compete against each other. Before the game starts, each player chooses 1 of (currently 133) characters, known as champions. No two players may play the same champion. In ranked and professional play, players may "ban" a champion and prevent either side from choosing it. There are […]

Kaggle_Titanic

One of the most famous modern machine learning training tasks is predicting the survival of passengers on the Titanic. I used this project while experimenting with K-nearest-neighbor classifiers, SVMs, logistic regression, pipelines (and the problems with dummying categorical data in pandas), as well as the TFLearn front-end for TensorFlow. My source code for that project can be found here.
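The dummying pitfall alluded to above is usually this: calling pd.get_dummies separately on train and test data produces mismatched columns whenever a category is missing from one split. A sketch of the fix, encoding inside a scikit-learn Pipeline instead (the column names here are hypothetical, not necessarily the ones used in the project):

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.neighbors import KNeighborsClassifier

# Toy train/test splits: the test split never saw "C" or "Q",
# so pd.get_dummies on each split would yield different columns.
train = pd.DataFrame({"Embarked": ["S", "C", "S", "Q"], "Survived": [1, 0, 1, 0]})
test = pd.DataFrame({"Embarked": ["S", "S"]})

pipe = Pipeline([
    # The encoder learns the category set at fit time and reuses it at
    # predict time; handle_unknown="ignore" covers unseen categories.
    ("encode", OneHotEncoder(handle_unknown="ignore")),
    ("clf", KNeighborsClassifier(n_neighbors=3)),
])
pipe.fit(train[["Embarked"]], train["Survived"])
preds = pipe.predict(test[["Embarked"]])
```

Because the encoder is fit once inside the pipeline, train and test are guaranteed to be transformed into the same feature space.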

Anime_Rec: Generating recommended Animes based on MAL data.

Anime_Rec is a Data Science project to generate Anime recommendations based on publicly available data from the website myanimelist.net. I'm an Anime fan. In fact, I watch enough Anime to have hit the point where finding something to watch becomes difficult. As an Anime fan and Data Scientist, the obvious solution was to build a recommendation engine to recommend Anime for me to watch. This post explains my overall approach and architecture. The first step in any machine learning or Data Science project is gathering the data, and thankfully for me, other Anime fans have done […]