This notebook aims to show the basics of:
TensorFlow 2.0
Shooter Embedding estimation for NHL Player evaluation
Evaluate the feasibility of generating a post that switches between R and Python via reticulate
Demonstrate code similarity/approach in both languages side-by-side

TL;DR
Combine TensorFlow/Keras with R
NHL Data to estimate Shooter Player Embeddings
Export to Tableau for exploration (yes, we could use ggplot et al., but this highlights that we have other options, especially for those new to the language)

R Setup
# packages
library(keras)
suppressPackageStartupMessages(library(tidyverse))
library(reticulate)
suppressPackageStartupMessages(library(caret))

# options
options(stringsAsFactors = FALSE)
use_condaenv("tensorflow")

Python setup
# imports
import pandas as pd
import numpy as np
from sklearn.
I have been diving back into Python a bit lately, and admittedly, I have yet to find a tool that fits my workflow the way R and RStudio do. There are all sorts of tools out there, but in the end, it feels like I am fighting the tool, not my code.
To be honest, I really like using VSCode for other projects, but I feel like this product is aimed more at developers working on large applications, not data scientists.
I learned of carbon today, and it is absolutely fantastic.
In my own words, carbon provides terminal-like formatting for your code snippets, which can be included in blog posts and the like. It just makes things easier to read, in my opinion.
Where my head goes is taking a snippet that looks like this:
options(stringsAsFactors = FALSE)

## load the packages
library(wakefield)

## generate a dataset of random users
users = r_data_frame(
  n = 500,
  id,
  state,
  date_stamp(name="registration_date"),
  dob,
  language
)
users$ID = as.
Below is a post aimed at my future self. Be forewarned.
The idea is to take an R data frame and convert it to a JSON object where each entry in the JSON is a row from my dataset, and the entry has key/value (k/v) pairs where each column is a key.
Finally, if the value is missing for an arbitrary key, remove that k/v pair from the JSON entry.
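The post itself works from an R data frame. As a rough, language-neutral sketch of the logic just described, here is the same idea in Python using only the standard library. The column names and values are hypothetical, and treating `None` and `NaN` as "missing" is an assumption standing in for R's `NA`.

```python
import json
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumed stand-in for R's NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def rows_to_json(columns):
    """Turn column-oriented data into a JSON array with one object per row.

    `columns` maps each column name to a list of values (all the same length).
    Any key whose value is missing in a given row is left out of that row.
    """
    names = list(columns)
    n_rows = len(columns[names[0]]) if names else 0
    records = []
    for i in range(n_rows):
        record = {
            name: columns[name][i]
            for name in names
            if not is_missing(columns[name][i])
        }
        records.append(record)
    return json.dumps(records)


# Hypothetical data: the second row is missing both state and age.
users = {
    "id": [1, 2, 3],
    "state": ["MA", None, "NY"],
    "age": [34.0, float("nan"), 51.0],
}

users_json = rows_to_json(users)
print(users_json)
```

In this sketch the second entry ends up with only its `id` key, because the other two values are missing for that row.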
Many moons ago, I wrote some code to build a Tableau Data Extract from data that I had munged together in Python. I figured it was time to update the code, since I recently discovered that the Tableau API has changed.
For a link to that old code, refer to the Jupyter Notebook in this repo.
Assumptions and Requirements
First off, I am using a MacBook, and while I believe things are getting easier on Windows machines with respect to coding, I prefer to write Terminal commands over point-and-click installs.
If you have skimmed through some of my other posts on this blog, it’s probably not surprising that I love using Neo4j in my projects. While you certainly can develop and work through your ideas locally, if you are like me, you probably have a few pet projects going at once, some of which you might want to share publicly.
This post aims to highlight how quickly you can get up and running using Cloud9, a cloud-based development environment.
Below is a quick writeup on how I use R and RNeo4j to munge my data and throw “larger” datasets into Neo4j. In short, I am fairly capable in R, so I prefer to use it to do the heavy lifting.
All I am doing is calling the neo4j-shell tool via R's system command (see ?system). This post runs through how I have used this approach in some of my recent projects. I used this process for a project at work involving 3+ million nodes and nearly 9 million relationships.
I have been watching the DiagrammeR package for a while now, and at this stage, it’s pretty impressive. I encourage you to take a look at what is possible, but be assured the framework is there to do some really awesome things.
One use-case that applies to me is that of data modeling an app within Neo4j. There are already some tools out there, namely:
Arrows
Graphgen by GraphAware
And you can always use graphgists

The last link above is a sample graph gist that is a decent overview.
The Prismatic Team has slowly been rolling out a very cool API. You can read all about it here. At the same time, I have been using this as an opportunity to learn how to create an R package.
After today’s API update to identify the relevant content related to a specific topic, I wanted to highlight what is possible with a few lines of code using the prismaticR package.
This is a quick document aimed at highlighting the basics of what you might want to do using MongoDB and R. I am coming at this, almost completely, from a SQL mindset.
Install
The easiest way to install, I believe, is

library(devtools)
install_github(repo = "mongosoup/rmongodb")

Connect
Below we will load the package and connect to Mongo. The console will print TRUE if we are good to go.

library(rmongodb)
# connect to MongoDB
mongo = mongo.