Step 2: Building the user profiles
Now let's build the user profiles. In this step we'll build a profile for each user fetched in the previous step. Each profile summarizes the statuses collected for that user as a table of word frequencies. These tables make it possible to check whether there are groups of users who frequently write about similar subjects or in similar styles (e.g. in English or in Portuguese).
For instance, consider three users: me ('marcelcaraciolo'), who writes mostly about Python and data mining; 'symbian', who writes mostly about mobile and Symbian topics; and 'parties', who writes about beer and parties around the world. After a clustering analysis, 'symbian' and I would likely end up in the same group, while 'parties' would be placed in another group because its behavior on Twitter is so different from ours.
The code provided in TwitterOrganizer.py is responsible for opening the data file with the statuses collected in the first step and building the user profiles by creating a word-frequency table for each user.
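To make the idea of a word-frequency table concrete, here is a minimal sketch. It is not the code from TwitterOrganizer.py: the function name build_profiles, the input layout (a dict mapping each user to a list of status texts) and the simple tokenizing regex are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a word is a run of lowercase letters/apostrophes after lowercasing.
WORD_RE = re.compile(r"[a-z']+")

def build_profiles(statuses_by_user):
    """Map each user to a Counter of word -> number of occurrences.

    statuses_by_user is a hypothetical structure: {user: [status text, ...]}.
    """
    profiles = {}
    for user, statuses in statuses_by_user.items():
        counts = Counter()
        for text in statuses:
            counts.update(WORD_RE.findall(text.lower()))
        profiles[user] = counts
    return profiles

# Made-up statuses, only to show the shape of the result.
statuses_by_user = {
    "marcelcaraciolo": ["Clustering tweets with Python", "Python and data mining"],
    "parties": ["Beer and parties around the world"],
}
profiles = build_profiles(statuses_by_user)
```

Each profile behaves like a table: profiles["marcelcaraciolo"]["python"] gives the number of times that user wrote the word, and a word the user never wrote simply counts as zero.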
Now that I have my friends grouped by the similar subjects they post about on Twitter, it's time to make this clustering useful. I've decided to create Twitter Lists, a feature that Twitter launched recently. The idea is to create groups of users that the list owner can follow, and that other users can follow too if they are interested in the topics the list members post about. It's a handy tool for organizing your friends so that you can read their statuses based on what they post. For instance, I could group my friends who talk about Python into a list named 'PythonUsers'.
Based on the output of the clustering algorithm, I'll create the lists, using the users in each cluster as the members. Since the latest release of the python-twitter library doesn't support the Lists API, I've decided to write my own. The TwitterListAPI is a simple Python wrapper around the Twitter API that handles list operations such as creating, updating and removing lists, and adding users to or removing them from a list. You can check the implementation in the module twitterList.py.
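Before calling any API, the cluster assignments have to be turned into a plan of which users go into which list. The sketch below shows one way to do that. It does not use the TwitterListAPI interface from twitterList.py; the function lists_from_clusters, the cluster ids and the fallback list names are hypothetical and only illustrate the grouping step.

```python
from collections import defaultdict

def lists_from_clusters(cluster_of_user, list_names):
    """Group users by cluster id and return {list name: [members]}.

    cluster_of_user: hypothetical clustering output, {user: cluster id}.
    list_names: optional names chosen by hand, {cluster id: list name}.
    Clusters without a chosen name get a generic 'cluster-N' name.
    """
    members = defaultdict(list)
    for user, cluster_id in sorted(cluster_of_user.items()):
        name = list_names.get(cluster_id, "cluster-%d" % cluster_id)
        members[name].append(user)
    return dict(members)

# Made-up cluster assignments for the three example users.
cluster_of_user = {"marcelcaraciolo": 0, "symbian": 0, "parties": 1}
planned = lists_from_clusters(cluster_of_user, {0: "PythonUsers"})
```

Each entry in planned can then be handed to whatever wrapper creates the list and adds its members.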