Entropic Data

Blogging data since 1886

Sentiments in Tweets – Integrating ArkTools into the Project

Over the last few weeks I’ve refined and partially overhauled my algorithms for analyzing sentiment in Tweets, with some notable results. Here is what I have come up with so far. I am starting to feel like I’m doing science instead of the tedious tasks of my previous semesters.

Shaping the Scoring Algorithm – Attempt No. 1

Reading and interpreting sentences with an algorithm instead of by yourself is tough. Reading tweets is worse, so much worse. Let me tell you about some concepts I came up with.

Easy Reading – Creating a Reader that (hopefully) won't break

Parsing data files is always a little difficult, since you can’t be sure that your data is formatted properly. I mentioned in earlier posts that I am currently creating a Reader for my training data. Here is how it is going.

Using UIMA Pipelines – A Quick Overview

I am far from writing the best code possible, but last week I spent some time writing a half-decent Reader for my training data sets. I will write my code in Java, since it’s the programming language I am most fluent in. But first I’ll write a few lines about the pipeline.

Crawling Twitter – A Long Story about Patience

It turns out that twitter.com doesn’t like it when you request tons of tweets, and it took me quite a while to get a decent number of them. Here is a little update on what I have received so far.
