rmongodb Tutorial
This is a quick document highlighting the basics of what you might want to do using MongoDB and R. I am coming at this almost completely from a SQL mindset.
Install
The easiest way to install, I believe, is from GitHub using devtools:
library(devtools)
install_github(repo = "mongosoup/rmongodb")
Connect
Below we will load the package and connect to Mongo. The console will print TRUE if we are good to go.
library(rmongodb)
# connect to MongoDB
mongo = mongo.create(host = "localhost")
mongo.is.connected(mongo)
[1] TRUE
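If you would rather have a script stop early than carry on with a dead connection, the check above can be wrapped up. This is only a minimal sketch: connect_or_stop is a hypothetical helper name of my own, not part of rmongodb, and it assumes MongoDB is listening on localhost with the default port.

```r
library(rmongodb)

# connect_or_stop() is a hypothetical helper, not an rmongodb function.
connect_or_stop = function(host = "localhost") {
  con = mongo.create(host = host)
  # Stop with a clear message instead of failing later in a query.
  if (!mongo.is.connected(con)) {
    stop("Could not connect to MongoDB at ", host)
  }
  con
}

mongo = connect_or_stop("localhost")
```

When you are completely done with the session, mongo.destroy(mongo) releases the connection.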
What’s in MongoDB
Take a look at what you have. This will show the databases in my local instance of MongoDB.
mongo.get.databases(mongo)
[1] "bbdi" "nhlpbp" "he_search_graph" "emchat"
[5] "twitter"
Let’s look at all of the collections (tables) in one of the databases.
mongo.get.database.collections(mongo, db = "nhlpbp")
[1] "nhlpbp.gameids" "nhlpbp.rawpbp"
Some Helper Functions
There are some basic commands that will help you manage your database. For instance, we can count how many documents (rows) we have in a collection.
DBNS = "nhlpbp.gameids"
mongo.count(mongo, ns = DBNS)
[1] 4761
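Coming from SQL, the next thing you will probably want is a count with a WHERE clause. mongo.count takes an optional query argument for that. The sketch below assumes the same local nhlpbp.gameids collection and filters on gameType, which is stored as a string; the variable names are just illustrative.

```r
library(rmongodb)

# Assumes a local MongoDB with the nhlpbp.gameids collection used above.
mongo = mongo.create(host = "localhost")
DBNS = "nhlpbp.gameids"

# Build the filter as a BSON object.
# Roughly: SELECT COUNT(*) FROM gameids WHERE gameType = 'Playoffs'
playoff_query = mongo.bson.from.list(list(gameType = "Playoffs"))

n_playoffs = mongo.count(mongo, ns = DBNS, query = playoff_query)
n_playoffs
```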
Query the data
When exploring your data, it’s really helpful to use the find.one concept, which pulls back a single document from a collection.
tmp = mongo.find.one(mongo, ns = "nhlpbp.gameids")
tmp
_id : 7 5233cec65b5e625ad4e6e67b
seasonID : 2 20082009
gameID : 2 2008030417
homeTeam : 2 Detroit Red Wings
gameType : 2 Playoffs
awayTeam : 2 Pittsburgh Penguins
date : 2 Fri Jun 12, 2009
If tmp prints out some data, our query was successful. Check out the help for mongo.find.one (?mongo.find.one) if you want more info.
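find.one also accepts a query and a list of fields to return, which maps loosely onto WHERE and SELECT in SQL. Here is a minimal sketch under the same assumptions as before (local instance, nhlpbp.gameids collection). The gameID value is taken from the document printed above, and it is passed as a string because that is how the field is stored there.

```r
library(rmongodb)

mongo = mongo.create(host = "localhost")

# Roughly: SELECT homeTeam, awayTeam, date FROM gameids
#          WHERE gameID = '2008030417' LIMIT 1
query  = mongo.bson.from.list(list(gameID = "2008030417"))
fields = mongo.bson.from.list(list(homeTeam = 1L, awayTeam = 1L, date = 1L))

game = mongo.find.one(mongo, ns = "nhlpbp.gameids",
                      query = query, fields = fields)

# mongo.find.one returns NULL when nothing matches, so check first.
if (!is.null(game)) {
  # Convert the BSON document into a plain R list for easier handling.
  game_list = mongo.bson.to.list(game)
  print(game_list$homeTeam)
}
```

Converting to a list with mongo.bson.to.list is usually the quickest way to get the values back into ordinary R objects.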
PROTIP: When you print a document, each line shows the field name, a numeric code for the Mongo value type, and the value. For example, the 7 next to _id and the 2 next to the other fields above are type codes, not data. To understand how Mongo stores the data, refer to the documentation. This will be a huge help when you have to build queries using the BSON buffer.
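To make the type codes concrete, here is a small lookup of some common codes as I understand them from the BSON specification, followed by a sketch of building a query with the BSON buffer. The bson_types vector is my own illustration, not something rmongodb provides. The point of the second half is that seasonID shows type 2 (string) in the output above, so the query value should be a string too.

```r
library(rmongodb)

# A few BSON type codes (illustrative lookup, not part of rmongodb).
bson_types = c("1" = "double", "2" = "string", "3" = "embedded document",
               "4" = "array", "7" = "ObjectId", "8" = "boolean",
               "9" = "UTC datetime", "10" = "null",
               "16" = "32-bit integer", "18" = "64-bit integer")

bson_types[["2"]]  # code shown next to seasonID, gameID, homeTeam, ...
bson_types[["7"]]  # code shown next to _id

# Build a query field by field with the BSON buffer.
# seasonID is stored as a string, so append it as a character value.
buf = mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "seasonID", "20082009")
mongo.bson.buffer.append(buf, "gameType", "Playoffs")
query = mongo.bson.from.buffer(buf)
```

The resulting query object can be passed as the query argument to functions such as mongo.find.one or mongo.count.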