
rmongodb Tutorial
This is a quick guide to the basics of working with MongoDB from R. I am approaching it almost entirely from a SQL mindset.
Install
The easiest way to install, I believe, is from GitHub using devtools:
library(devtools)
install_github(repo = "mongosoup/rmongodb")
Connect
Below we will load the package and connect to Mongo. The console will print TRUE if we are good to go.
library(rmongodb)
# connect to MongoDB
mongo = mongo.create(host = "localhost")
mongo.is.connected(mongo)
[1] TRUE
What’s in MongoDB
Take a look at what you have. This will show the databases in my local instance of MongoDB.
mongo.get.databases(mongo)
[1] "bbdi" "nhlpbp" "he_search_graph" "emchat"
[5] "twitter"
Let’s look at all of the collections (roughly, tables) in one of the databases.
mongo.get.database.collections(mongo, db = "nhlpbp")
[1] "nhlpbp.gameids" "nhlpbp.rawpbp"
Some Helper Functions
There are some basic commands that will help you manage your database. For instance, count how many documents (rows) we have in a collection.
DBNS = "nhlpbp.gameids"
mongo.count(mongo, ns = DBNS)
[1] 4761
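Coming from SQL, a count usually travels with a WHERE clause. Below is a minimal sketch of a filtered count, assuming a local MongoDB instance holding the same nhlpbp.gameids collection. The filter is built with mongo.bson.from.list and handed to the query argument of mongo.count. The gameType filter is only an illustration, using a field that the documents in this collection carry.

```r
library(rmongodb)

# Build the filter locally; roughly WHERE gameType = 'Playoffs' in SQL
query <- mongo.bson.from.list(list(gameType = "Playoffs"))

mongo <- mongo.create(host = "localhost")

# Only run the count if the connection actually succeeded
if (mongo.is.connected(mongo)) {
  n_playoffs <- mongo.count(mongo, ns = "nhlpbp.gameids", query = query)
  print(n_playoffs)
  mongo.destroy(mongo)  # close the connection when done
}
```

Leaving query out, as in the call above, counts every document in the collection.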
Query the data
When exploring your data, it’s really helpful to pull back a single document with mongo.find.one.
tmp = mongo.find.one(mongo, ns = "nhlpbp.gameids")
tmp
_id : 7 5233cec65b5e625ad4e6e67b
seasonID : 2 20082009
gameID : 2 2008030417
homeTeam : 2 Detroit Red Wings
gameType : 2 Playoffs
awayTeam : 2 Pittsburgh Penguins
date : 2 Fri Jun 12, 2009
If tmp prints out some data, our query was successful. Check out the help for mongo.find.one (?mongo.find.one) if you want more info.
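A printed document is nice for eyeballing, but most of the time you will want the values as plain R objects. Here is a small sketch of how that might look, using mongo.bson.to.list and mongo.bson.value. So that it runs without a database connection, the document is built locally with mongo.bson.from.list as a stand-in for what mongo.find.one returns. The name doc and the subset of fields are assumptions for illustration.

```r
library(rmongodb)

# Stand-in for the document returned by mongo.find.one(); built locally
# so the sketch does not need a running MongoDB instance.
doc <- mongo.bson.from.list(list(
  seasonID = "20082009",
  homeTeam = "Detroit Red Wings",
  awayTeam = "Pittsburgh Penguins"
))

# Convert the whole BSON document into a named R list
doc_list <- mongo.bson.to.list(doc)
home <- doc_list$homeTeam

# Or pull out a single field by name without converting everything
away <- mongo.bson.value(doc, "awayTeam")
```

With a live connection, the same two calls should work directly on tmp.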
PROTIP: When you print a document, each line shows the field name, a numeric code for the BSON value type, and the value. In the output above, for example, 7 marks an ObjectId and 2 marks a string. To understand how Mongo stores the data, refer to the documentation. This will be a huge help when you have to build queries using the BSON buffer.
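To make the numeric type codes easier to read, a small lookup table helps. The sketch below is plain base R and covers only a subset of the codes defined in the BSON specification. The names bson_types and bson_type_name are hypothetical helpers made up for this example, not part of rmongodb.

```r
# Subset of the type codes defined in the BSON specification
bson_types <- c(
  "1"  = "double",
  "2"  = "string",
  "3"  = "embedded document",
  "4"  = "array",
  "5"  = "binary data",
  "7"  = "ObjectId",
  "8"  = "boolean",
  "9"  = "UTC datetime",
  "10" = "null",
  "16" = "32-bit integer",
  "18" = "64-bit integer"
)

# Hypothetical helper: translate printed type codes into readable names.
# Codes missing from the table above come back as NA.
bson_type_name <- function(code) {
  unname(bson_types[as.character(code)])
}

# The two codes that appear in the printed document
bson_type_name(c(7, 2))
```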
