Wednesday, March 28, 2007

Thesis Data Cloud

A while back, Darren mentioned wanting to make a data cloud for a presentation he was giving at Northern Voice - he wanted to find a tool into which he could enter all the responses he'd gotten on his Why Do You Blog survey to generate a data cloud. This got me thinking about how data clouds are an interesting way of analyzing data - you get a visual representation of how often each word is used in your document. So then I had the bright idea that I wanted to run my thesis through a program like this - I was curious to see what words I used most often. I happened to be chatting with a friend of mine who is all computer savvy and asked if he knew of any tools that could do this (as I could only find one, and it required that the document in question be pretty small, and my thesis may be many things, but small is not one of them). And the next thing I knew, he'd written me a program! We had to do a bit of tweaking (like not including common words such as "and" and "then", not including punctuation and numbers, and, of course, I had to make it use pretty colours). And when all was said and done, it was just so friggin' pretty! I love my thesis word cloud! You can check out the whole thing here, but I've included a bit of it below, just so you can get an idea of how beautiful it is!

binge biochemistry biol biological biology birth births bk black blank blind blinded blindly blood bloom blow blue boat bodies body bone bones bonjour born both bottom bouillon boundaries boundary brain breakdown breaking breed breeding briefly bringing brown bud buds buffer bull bulletin bullock burns but c ca cage cages calcification calcified calcifies calcium calculated calculation calendar caloric calories camera camp can cannot carbohydrates cardiac cardiovascular care cartilage
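For the curious, here's roughly how a program like this could work. This is not my friend's actual code, just a minimal Python sketch of the steps described above: count the words, skip the common ones, drop punctuation and numbers, and make it colourful. The stop-word list, the font size range, the colours, and the function names (`word_counts`, `font_sizes`, `cloud_html`) are all made up for illustration.

```python
import re
from collections import Counter

# Assumed stop-word list; the real program's list isn't known.
STOP_WORDS = {"and", "then", "the", "a", "of", "to", "in", "is", "it"}

def word_counts(text, stop_words=STOP_WORDS):
    """Count words, ignoring case, punctuation, numbers, and stop words."""
    # [a-z]+ keeps runs of letters only, so digits and punctuation drop out.
    # Side effect: "don't" splits into "don" and "t".
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)

def font_sizes(counts, min_size=10, max_size=48):
    """Scale each word's count linearly onto a font size in points."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts match
    return {
        word: min_size + (count - lo) * (max_size - min_size) // span
        for word, count in counts.items()
    }

def cloud_html(counts, colours=("#c0392b", "#2980b9", "#27ae60", "#8e44ad")):
    """Build an alphabetical cloud as HTML spans, cycling through colours."""
    sizes = font_sizes(counts)
    spans = []
    for i, word in enumerate(sorted(sizes)):
        colour = colours[i % len(colours)]
        spans.append(
            f'<span style="font-size:{sizes[word]}pt;color:{colour}">{word}</span>'
        )
    return " ".join(spans)

if __name__ == "__main__":
    sample = "Bone and bones, then bone 42 calcium!"
    print(cloud_html(word_counts(sample)))
```

Words are sorted alphabetically, like in my cloud, and the more often a word shows up, the bigger it gets.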




Seriously, go check out the whole thing here. It's friggin' cool.

Update: OK, that looks a little f'd up, since Blogger's formatting apparently doesn't work so well with the formatting of the data cloud. I guess you'll just have to go here to see how it should look.

5 comments:

Anonymous said...

I like how the word bone is so large.

Perv.

Duane Storey said...

Breeding is rather large as well..

Anonymous said...

That's very cool. Another possibility I discovered along the way is this IBM project:

http://services.alphaworks.ibm.com/manyeyes/home

Unknown said...

prenatal bone rats? Yikes!

Anonymous said...

I thought this blog is about thesis writing :)