Today I am going to blog about data mining and how research is ever changing.
In “Googling the Victorians,” Patrick Leary discusses how the broad
availability of information is changing how we do scholarly research. He makes a statement that both
depresses and excites me. He says, “The eureka moments in the life
of today’s questing scholar-adventurer are much more likely to take place in
front of a computer screen.” I think the
adventurer inside of me just died a little.
However, the budding historian in me is excited. This means that I will not always have to go through
hours of papers and microfilm and musty documents just to get a morsel of
information. Now I can use Internet
search engines not only to get the information I was looking for but also to connect
with people or resources I didn’t even know existed. However, this will require a new set of
skills. No longer are we scanning
columns of text; now we are learning how to use search engines to pull what we
need from the mass of information.
This ties into another article I just read.
In “From Babel to Knowledge: Data Mining Large Digital Collections,”
Daniel Cohen talks (in tech speak, lots of tech speak) about how you can
build search engines that are designed to produce a more structured
search. By using online encyclopedias
(including the dreaded Wikipedia) you can use certain phrases to retrieve relevant
entries. You will still have to ‘mine’
through the data, but it has taken a lot of the dirt out. Cohen also mentions that as you are
digitizing items it may be better to digitize more documents at a lower quality
than fewer at a higher quality: quantity over quality, if you will. This does make sense. It would give you more documents to search
and therefore more information.
On the Digital History Hacks
blog, this entry is very interesting. Using the search data that AOL released, they were able to analyze how people search the
Internet. I think that can be very
helpful today as we start using meta tags to tag our online exhibits. We can take into account that people seem to
search using adjectival or possessive forms. Also, I did not know that you couldn’t search
numerical date ranges. After reading
these articles, however, I am not sure why I ever thought you could.
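To make the possessive-form idea concrete for myself, here is a rough Python sketch of how one might check what share of search queries use a possessive. This is only my own illustration, not the method used on the Digital History Hacks blog: the sample queries are invented (they are not from the AOL data), and the function name possessive_share is something I made up.

```python
import re

# Hypothetical sample queries, invented for illustration only.
# They are NOT taken from the AOL search data.
SAMPLE_QUERIES = [
    "lincoln's assassination",
    "victorian newspapers",
    "darwin’s voyage",
    "history of canada",
]

# A word followed by 's, with either a straight or a curly apostrophe.
# Note: this crude pattern also matches contractions such as "it's".
POSSESSIVE = re.compile(r"\b\w+['’]s\b")


def possessive_share(queries):
    """Return the fraction of queries containing a possessive-looking word."""
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if POSSESSIVE.search(q.lower()))
    return hits / len(queries)


if __name__ == "__main__":
    print(possessive_share(SAMPLE_QUERIES))  # 0.5
```

If a large share of real queries turned out to look like this, that would be a reason to include possessive forms among the meta tags on an online exhibit.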
I know that (for now) there are still things that will only be
found in musty books and old newspaper clippings. Using the Internet correctly will help direct
me to those research sources and hopefully make my research more efficient. But no
matter how much easier it is to click a mouse… I still love the musty smell of an old
book.