Entropic Data

Blogging data since 1886

Tag: data-mining

SHORT – Dealing with embedded nul in string manipulation with R

For the past few hours I’ve been banging my head against the same problem over and over. I had to deal with multiple strings of hexadecimal values coming from multiple sources. So far, so easy: just use iconv… no, it does not work at all for specific strings.

A Data Science consultant working at Materna. He occasionally blogs about data and related topics here and is the host of the Dortmund Data Science Meetup.

Proving Ground – Corporate Fear of Disruption by Data

While most tinkerers, makers, data scientists and other visionaries already see the potential of a connected world and the benefits of understanding it to gain better insights, many enterprises are still cautious about investing money and people to gain an advantage. Why are they not open to our ideas?

SpiegelMining – Entity Extraction on Spiegel Online

Although I said that I’m out of the RapidMiner – Rosette API challenge, here is what I was trying to achieve. Failing is always part of learning, and I hope that the few things I learned may help you anyway.

2017 – New Years Resolution

Oh well, sickness has struck me once more and sadly I have to drop the Rosette challenge for this year’s end. I already mined some 200 MB of unstructured data in the form of subreddit postings (/r/Netrunner) that I’ll keep for future projects. I’d like to thank the guys from RapidMiner and RosetteAPI for the offer and will definitely stay in touch with them. To anyone still in the challenge: good luck and have fun!

So what will 2017 bring to you, Paavo?

Crawling Twitter – A Long Story about Patience

Turns out, twitter.com doesn’t like it when you request tons of tweets, and it took me quite a while to get a decent number of them. Here is a little update on what I have received so far.
