Wednesday, October 24, 2012

The Internet vs. Musty Old Books


Today I am going to blog about data mining and how research is ever-changing.

In "Googling the Victorians," Patrick Leary discusses how the broad availability of information is changing how we do scholarly research. He makes a statement that depresses and excites me at the same time. He says, “The eureka moments in the life of today’s questing scholar-adventurer are much more likely to take place in front of a computer screen.” I think the adventurer inside of me just died a little. However, the budding historian is excited. This means that I will not always have to go through hours of papers and microfilm and musty documents just to get a morsel of information. Now I can use Internet search engines and not only get the information I was looking for, but also connect with people or resources I didn’t even know existed. However, this will require a new set of skills. No longer are we scanning columns of text; now we are learning how to use search engines to pull what we need from the mass of information.

This ties into another article I just read. In "From Babel to Knowledge: Data Mining Large Digital Collections," Daniel Cohen talks (in tech speak, lots of tech speak) about how you can build search tools that are designed to run a more structured search. By drawing on online encyclopedias (including the dreaded Wikipedia), you can use certain phrases to retrieve relevant entries. You will still have to ‘mine’ through the data, but a lot of the dirt has been taken out. Cohen also mentions that as you are digitizing items it may be better to digitize more documents at a lower quality than fewer at a higher quality: quantity over quality, if you will. This does make sense. It would give you more documents to search and would give you more information.

An entry on the Digital History Hacks blog is also very interesting. Using the search data that AOL released, they were able to analyze how people search the Internet. I think that can be very helpful today as we start using meta tags to tag our online exhibits. We can take into account that people seem to search using the adjectival form or the possessive form. Also, I did not know that you couldn’t search numerical date ranges. After reading these articles, however, I am not sure why I ever thought you could.

I know that (for now) there are still things that will only be found in musty books and old newspaper clippings. Using the Internet correctly will help direct me to those sources and hopefully make my research more efficient. But no matter how much easier it is to click a mouse… I still love the musty smell of an old book.
