R and MongoDB Walkthrough

R
MongoDB
Author

Brock Tibert

Published

December 2, 2013

rmongodb Tutorial

This is a quick walkthrough of the basics of working with MongoDB from R. I am coming at this almost entirely from a SQL mindset.

Install

The easiest way to install the package, I believe, is from GitHub using devtools:

library(devtools)
install_github(repo = "mongosoup/rmongodb")

Connect

Below we load the package and connect to Mongo. The call to mongo.is.connected prints TRUE if we are good to go.

library(rmongodb)
# connect to MongoDB
mongo = mongo.create(host = "localhost")
mongo.is.connected(mongo)
[1] TRUE

What’s in MongoDB

Take a look at what you have. This will show the databases in my local instance of MongoDB.

mongo.get.databases(mongo)
[1] "bbdi"            "nhlpbp"          "he_search_graph" "emchat"         
[5] "twitter"

Let’s look at all of the collections (tables) in one of the databases.

mongo.get.database.collections(mongo, db = "nhlpbp")
[1] "nhlpbp.gameids" "nhlpbp.rawpbp"
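Note that the collections come back as full namespaces in the form db.collection, which is the same string the ns argument expects later on. If you want the two parts separately, a minimal base R sketch might look like the following. The helper name split_namespace is hypothetical and not part of rmongodb; it splits at the first dot only, since collection names can themselves contain dots.

```r
# Hypothetical helper (not part of rmongodb): split "db.collection"
# namespaces into their database and collection parts.
split_namespace <- function(ns) {
  # Position of the first dot in each namespace string
  dot <- regexpr(".", ns, fixed = TRUE)
  data.frame(
    db = substr(ns, 1, dot - 1),
    collection = substr(ns, dot + 1, nchar(ns)),
    stringsAsFactors = FALSE
  )
}

parts <- split_namespace(c("nhlpbp.gameids", "nhlpbp.rawpbp"))
print(parts)
```

This is only a convenience for inspecting what you have; the rmongodb functions used in this walkthrough take the full namespace string as-is.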

Some Helper Functions

A few basic commands will help you manage your database. For instance, mongo.count returns how many documents (rows) a collection holds.

DBNS = "nhlpbp.gameids"
mongo.count(mongo, ns = DBNS)
[1] 4761
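Coming from SQL, the natural next step is a count with a WHERE clause. The sketch below is one way that might look, assuming rmongodb is installed, a MongoDB instance is running on localhost, and the documents carry a gameType field with the value "Playoffs" as in my collection. The wrapper function count_playoff_games is hypothetical; it returns NA if the package or the connection is unavailable.

```r
# Sketch only: roughly SELECT COUNT(*) FROM gameids WHERE gameType = 'Playoffs'.
# count_playoff_games is a hypothetical wrapper, not an rmongodb function.
count_playoff_games <- function(ns = "nhlpbp.gameids") {
  if (!requireNamespace("rmongodb", quietly = TRUE)) {
    return(NA_integer_)  # rmongodb is not installed
  }
  mongo <- rmongodb::mongo.create(host = "localhost")
  on.exit(rmongodb::mongo.destroy(mongo))  # always release the connection
  if (!rmongodb::mongo.is.connected(mongo)) {
    return(NA_integer_)  # no MongoDB instance reachable on localhost
  }
  # Build the filter as a BSON object from a plain R list
  query <- rmongodb::mongo.bson.from.list(list(gameType = "Playoffs"))
  rmongodb::mongo.count(mongo, ns = ns, query = query)
}

n_playoffs <- count_playoff_games()
print(n_playoffs)
```

Leaving out the query argument gives the plain count shown above, since mongo.count defaults to an empty filter.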

Query the data

When exploring your data, it’s really helpful to use mongo.find.one, which returns a single document from a collection.

tmp = mongo.find.one(mongo, ns = "nhlpbp.gameids")
tmp
    _id : 7      5233cec65b5e625ad4e6e67b
    seasonID : 2     20082009
    gameID : 2   2008030417
    homeTeam : 2     Detroit Red Wings
    gameType : 2     Playoffs
    awayTeam : 2     Pittsburgh Penguins
    date : 2     Fri Jun 12, 2009

If tmp prints out some data, our query was successful. Check out the help for mongo.find.one if you want more info.

PROTIP: When you print a document, each line shows the field name, then the Mongo value type, then the value. The value type is printed as a numeric BSON type code; in the output above, 7 is an ObjectId and 2 is a string. To understand how Mongo stores the data, refer to the BSON documentation. This will be a huge help when you have to build queries using the BSON buffer.
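To make those numeric codes easier to read, here is a small base R lookup for the more common BSON type codes. The codes come from the BSON specification; the names bson_types and bson_type_name are my own and are not part of rmongodb, and the table is deliberately not exhaustive.

```r
# Common BSON element type codes (from the BSON spec), keyed by code.
# bson_types and bson_type_name are hypothetical names, not rmongodb objects.
bson_types <- c(
  "1"  = "double",
  "2"  = "string",
  "3"  = "embedded document",
  "4"  = "array",
  "5"  = "binary data",
  "7"  = "ObjectId",
  "8"  = "boolean",
  "9"  = "UTC datetime",
  "10" = "null",
  "11" = "regular expression",
  "16" = "32-bit integer",
  "17" = "timestamp",
  "18" = "64-bit integer"
)

# Translate one or more numeric codes; codes missing from the table give NA.
bson_type_name <- function(code) {
  unname(bson_types[as.character(code)])
}

# The two codes that appear in the printed document above
print(bson_type_name(c(7, 2)))
```

In the document printed earlier, _id is the only ObjectId; every other field, including seasonID and gameID, is stored as a string, which matters when you filter on them.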

Brock Tibert

Lecturer, Information Systems

Lecturer in Information Systems, Consultant, and nerd.