As the last part of the previous post series about MongoDB and Twitter, I’m about to show some plots from an initial speed comparison of the two DBs, MongoDB and PostgreSQL. These plots show how MongoDB can perform better than traditional SQL solutions when it comes to speed. Of course, the overall picture is more nuanced. In these cases I focused on the simplest approach possible – retrieving documents from Mongo and rows from PostgreSQL.
Originally I wanted to write about visualization in the 2nd post (and after that in the 3rd), but that post would have been too long to read. I always lose myself when it comes to writing, but never mind, it’s finally here. So, we know how to access the DB and we can query for interesting subsets – even geographically. All we have to do is interpret our results. I’m presenting two ways, a gif animation and a wordcloud. It’s not about reinventing the wheel, but I still believe these approaches usefully complement each other.
In the previous post, I introduced the topic and technology. Now it’s time to define the problem and methods. The next entries will discuss how to access MongoDB and how to retrieve geocoded tweets. I will focus on tweets that are somehow related to the weather using the simplest approach possible – querying their content for the keyword ‘weather’. Later on I will create some nice visualizations, an animated gif and a wordcloud, that can help us understand what is behind the scenes. You’ll find some code snippets and screenshots, so feel free to scroll down to those if you’re not interested in long discussions. So, let’s grab the data from MongoDB and see what’s inside! There’s quite a lot to do.
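
To give a taste of what the keyword approach could look like, here is a minimal sketch, not necessarily the exact query used in the posts. It assumes the tweets are stored as raw Twitter JSON, so the message body lives in `text` and the location in `coordinates`. The function names are made up for illustration, and `collection` stands for any pymongo-style collection object.

```python
import re


def build_weather_filter(keyword="weather"):
    """Build a MongoDB filter for geocoded tweets that mention `keyword`.

    Assumes raw Twitter JSON: body in `text`, location in `coordinates`.
    """
    return {
        # Case-insensitive substring match on the tweet body.
        "text": {"$regex": re.escape(keyword), "$options": "i"},
        # Keep only tweets whose `coordinates` field is present and not null.
        "coordinates": {"$ne": None},
    }


def fetch_weather_tweets(collection, keyword="weather", limit=10):
    """Run the filter on a pymongo-style collection (hypothetical helper)."""
    cursor = collection.find(build_weather_filter(keyword)).limit(limit)
    return list(cursor)
```

With pymongo installed and a local server running, `collection` could be something like `MongoClient()["twitter"]["tweets"]`, where the database and collection names are placeholders. A regex match like this scans every document, which is fine for a first look at the data.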