In this chapter:
Browsing
Search Tools
Search Methodology
Common Searching Mistakes & Strategies to Avoid
Advanced Search Techniques
Merriam-Webster (2003) defines browsing as “looking over
or through an aggregate of things casually especially in search of
something of interest.”
Browsing is a dramatically underestimated research
tool. Hyperlinking encourages browsing by design. This is the
essential functionality of the Internet. Investigators can browse
various subjects, webrings, web communities, and so on to get more
acquainted with an unfamiliar topic. However, an investigator must be
careful when browsing to stay focused on the task at hand. During a
browsing session, a user can easily be distracted or spend several hours
looking at useless or irrelevant information. Carefully controlled
browsing can be effective and should not be overlooked (Weinberger,
N.D.). The best
investigative use for browsing is to gain familiarity with a particular
web environment. For example, when locating the website of a
company or individual, it may be advantageous to browse through the
entire site rather than searching for a specific piece of information.
Sometimes this casual looking pays big dividends when an
investigator stumbles onto a previously unknown nugget of valuable
information. Fortunately,
users in search of more specific pieces of information can utilize the
various tools available to search the Internet.
The Internet is a vast information resource. The ability
to search it effectively is an essential tool in the
investigator’s toolbox. Several different tools are
available for searching information on the Internet. It is
important for the investigator to understand these different tools, how
they work, and how to select the proper tool for the job. After
completing this chapter, refer to the
Search Engine Showdown feature chart for a handy reference to the
various search engine settings.
Search Engines. These tools contain indexes
of the full text of selected Web pages. They offer keyword searching
and try to match exactly the words in the pages. Traditional search
engines offer no browsing or subject categories, and their databases or
“indexes” are compiled by automated software programs, called spiders or
crawlers, with minimal human intervention (Barker, 2003, Types).
Search engines may be general or specialized in one category of
information.
Pros: Targeted, full-text searching, breadth and depth of information.
Cons: No annotations or browsing, not helpful for introductory information.
Examples: Google, AltaVista, WiseNut, Teoma, AllTheWeb.
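The full-text index described above can be illustrated with a minimal sketch. It is not how any particular search engine is implemented; the page addresses and text in PAGES are hypothetical, and the function names are chosen for this example only. Each word is mapped to the set of pages containing it, and a keyword search returns only the pages that contain every word in the query.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def keyword_search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical pages standing in for what a crawler might have collected.
PAGES = {
    "example.com/a": "Insurance fraud investigation techniques",
    "example.com/b": "Fraud statistics for 2003",
    "example.com/c": "Gardening tips",
}
INDEX = build_index(PAGES)
```

Because the match is on exact words, a page that discusses the same subject in different vocabulary is not returned, which is one reason these tools are less helpful for introductory research.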
Meta-search Engines. These tools search
multiple search engines simultaneously and return compiled results; they
retrieve approximately 10% of the search results from any of the search
engines they query. Meta-search engines can be an effective tool for
searching many engines at once but usually lack the depth of results
provided by more traditional search engines (Barker, 2003, Types).
Pros: Semi-targeted, searches multiple sources at one time.
Cons: May not process search queries correctly, lacks depth of results.
Examples: Dogpile, Search.com, IxQuick, SurfWax, ZapMeta.
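As a rough illustration of how compiled results might be assembled, the sketch below merges ranked lists from several engines and drops duplicates. The engine names, addresses, and the per_engine_limit cutoff are all assumptions made for this example; the cutoff simply mirrors the point above that only a portion of each engine's results is returned.

```python
def merge_results(results_by_engine, per_engine_limit=10):
    """Combine ranked result lists from several engines into one list.

    results_by_engine maps an engine name to its ranked list of addresses.
    Only the first per_engine_limit hits from each engine are considered,
    and an address already seen from an earlier engine is skipped.
    """
    merged = []
    seen = set()
    for engine, urls in results_by_engine.items():
        for url in urls[:per_engine_limit]:
            if url not in seen:
                seen.add(url)
                merged.append((url, engine))
    return merged

# Hypothetical result lists from two unnamed engines.
SAMPLE = {
    "engine_a": ["example.com/1", "example.com/2", "example.com/3"],
    "engine_b": ["example.com/2", "example.com/4"],
}
```

With a limit of two hits per engine, example.com/3 never appears in the merged list, which shows how a meta-search can miss results that a direct search of the same engine would return.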
Subject Directories. These directories
include sites hand-selected by editors and organized into
hierarchical subject categories, often annotated with descriptions.
Users may browse subject categories or search using broad general
terms. No full-text search of documents is available. Users can only
search the text of the subject categories and descriptions (Barker,
2003, Types).
Pros: Annotations, well organized topics, excellent for introductory information.
Cons: No full-text searching, errors by human editors, maintenance of index is questionable.
Examples: Yahoo!, LookSmart, Open Directory Project, About.com.
Subject Guides. Guides are webpages containing
collections of hypertext links on a subject, compiled by expert subject
specialists, agencies, associations, and hobbyists. Guides are useful
for getting acquainted with an unfamiliar subject or topic. They often
provide links to the most popular or utilized webpages pertaining to a
particular subject (Barker, 2003, Types). Frequently, these guides are denoted by the
titles "Links" or "Resources."
Pros: Very detailed, may have links to otherwise unknown sources.
Cons: Usually no searching, questionable maintenance, reliant on a single source of input.
Examples: NHCAA Resources, IASIU Links, IALEIA Links.
Specialized Databases. Some
information available through the Internet is not
searchable by the traditional search tools described above. This
information resides in databases made available by various data
providers (Barker, 2003, Types). These hosts provide their own search interface to this data.
For more information on this topic see the section entitled “Invisible
Web” below.
Pros: Excellent data quality, well maintained, very targeted information.
Cons: Interfaces and functionality vary, reliance on homegrown search function.
Examples: LexisNexis, ChoicePoint, Accurint, ISO ClaimSearch, eBay, Amazon.
In the early days of web searching, these different
tools were easy to tell apart. Today some confusion exists among users
because the web search industry has undergone many changes, and many
sites now offer combined tools. Search engines and subject directories
in particular have often consolidated into one-stop search tools. For
more information about specific search tools, visit Search Engine
Showdown or Search Engine Watch.
Focusing on Popular Links. Everyone has different information needs and
interests. Simply because a site is “recommended” doesn’t mean it is
the best source for you to use. Recommendations are often based on
financial considerations (sites that pay a fee to the search engine
become “recommended”) or link popularity (Barker, 2002, Search). Keep in mind that others
might be visiting these sites for different reasons. Click carefully
and make your own evaluations.
Ignoring Stop Words. “Stop Words” are words that search engines ignore
because they are too common to be useful search criteria. Common stop
words are adverbs, conjunctions, prepositions, and all forms of the word
be (Sherman, 2002, Part 1). If you searched for:
to be or not to be
all the words would be excluded except the word not. The words to and
be are stop words and would automatically be ignored. The word or is a
Boolean operator. In this case, not is the only searchable word in the
set. Therefore, the search engine would search only for:
not
In this example, use of quotation marks would allow
a search for the entire phrase, including the stop words. The
proper search query would be:
"to be or not to be"
It is important to note that different
search engines treat stop words differently. Refer to the Help or
Advanced Search features on your search site for more information.
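The stop-word behavior described above can be sketched in a few lines. This is an illustration only: the STOP_WORDS and BOOLEAN_OPERATORS lists are assumptions made for the example, not the lists of any real search engine, and the word not is treated as an ordinary search term because that is how the example above treats it.

```python
import re

# Illustrative lists only; each real search engine maintains its own.
STOP_WORDS = {"a", "an", "the", "to", "be", "is", "are", "was", "of", "in"}
BOOLEAN_OPERATORS = {"and", "or"}

def searchable_terms(query):
    """Return the terms that would actually be searched for."""
    terms = []
    # A quoted phrase is kept whole; anything else is split into words.
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            terms.append(phrase)
        elif word.lower() not in STOP_WORDS | BOOLEAN_OPERATORS:
            terms.append(word)
    return terms
```

Under these assumptions, searchable_terms('to be or not to be') leaves only the word not, while wrapping the same words in quotation marks keeps the entire phrase as a single search term.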
Misusing Boolean Operators. Use of Boolean logic can strengthen a search
tremendously. Unfortunately, use of Boolean operators can be confusing
for some users. To make matters worse, different search engines may
interpret the same operators in different ways (Sherman, 2002, Part 1). Be sure to learn how
Boolean operators are used by your favorite search tools, and remember to
check what the default Boolean settings are. For more
information on Boolean operators, see Advanced Search Techniques below.
Ignoring Case Sensitivity. Some search engines are case sensitive. Generally,
it is best to search with all lowercase letters unless searching for a
proper noun like a name or place (Sherman, 2002, Concluded). Often, this method will return
hits in every case (UPPERCASE, lowercase, and Titlecase). Again, it is
important to take note of a search engine's default settings with regard
to case sensitivity. To determine whether a search engine
is case sensitive, see the search engine's Help feature or refer to the
Search Engine Showdown's feature chart.
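One way a case-sensitive engine might behave is sketched below. The rule is an assumption made for illustration, not a description of any specific search engine: a search term typed entirely in lowercase matches a word in any case, while a term that contains capital letters matches only that exact capitalization.

```python
def matches(query_term, page_word):
    """Assumed rule: an all-lowercase term matches any case; a term that
    contains capital letters matches only the identical capitalization."""
    if query_term == query_term.lower():
        return query_term == page_word.lower()
    return query_term == page_word
```

Under this rule, searching for polish would find both polish and Polish, while searching for Polish would find only the capitalized form. This is why lowercase is the safer default and capitals are best reserved for proper nouns.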
Poor Grammar. Unlike humans, computers have a difficult time
determining intent. They are unable to hear inflections in tone or read
body language. The only information a search engine has is the
information provided in the form of search terms. Unfortunately, the
overwhelming majority of novice and intermediate searchers fail to take
into consideration these subtleties of the English language. These
idiosyncrasies or "Seven Deadly Nyms" (Sherman, 2002) can be the death of an otherwise good search.
Contronyms – words that have multiple meanings that contradict one
another. Examples: hysterical (overwhelmed with fear vs. outrageously
funny), fast (moving quickly vs. firmly stuck in place).
Heteronyms – words that are spelled identically but have different
meanings when pronounced differently. Examples: bow, desert, object, lead.
Polyonyms – different words that have the same meaning. Examples:
Devil, Beelzebub, Lucifer, Satan.
Homonyms – words that have the same sound but a completely different
meaning (and sometimes spelling). Examples: to, two, too.
Capitonyms – words that change pronunciation and/or meaning when
capitalized. Examples: polish vs. Polish, amber vs. Amber.
Exonyms – place names that foreigners use instead of the names that
natives use. Examples: Cologne: Köln, Morocco: Moroc.
Once an Internet researcher moves beyond the novice
stage, more flexibility in searching is often desired. All the major
search engines provide an advanced search page that offers more
flexibility in search logic and allows the default settings used in
the simple search to be changed. In addition,
a researcher can take advantage of the more advanced searching functions
listed below.
Boolean Logic
Using Boolean operators such as AND, OR, and NOT can improve the
results of your search. To use them effectively, be
sure to check how the search tool being used interprets these
operators and whether it is case sensitive.
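The effect of the three basic Boolean operators can be pictured as operations on sets of pages. The sketch below is illustrative only; the terms and page labels in POSTINGS are hypothetical, and real search tools may interpret the operators differently, as noted above.

```python
# Hypothetical data: each term maps to the set of pages that contain it.
POSTINGS = {
    "fraud": {"p1", "p2", "p3"},
    "insurance": {"p1", "p4"},
    "auto": {"p3", "p4"},
}

def pages_with(term):
    """Return the set of pages containing the term (empty if unknown)."""
    return POSTINGS.get(term, set())

# fraud AND insurance: pages containing both terms (narrows the search).
both = pages_with("fraud") & pages_with("insurance")

# fraud OR insurance: pages containing either term (broadens the search).
either = pages_with("fraud") | pages_with("insurance")

# fraud NOT auto: pages containing the first term but not the second.
without = pages_with("fraud") - pages_with("auto")
```

In this small example AND returns one page, OR returns four, and NOT returns two, which shows why choosing the wrong operator, or misjudging the search tool's default, can change the size of a result set dramatically.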
The techniques described in this section are based on the functionality
of most major search engines. However, each search engine
functions differently and readers are encouraged to consult the Help
feature on each search engine website before employing these techniques.
Power Searching
Link. The link search feature allows you to search for
all the pages linking to a particular page or domain that you specify.
For example, to find webpages that contain links to Microsoft.com, you would type link:microsoft.com
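For researchers who assemble queries in a script, a minimal sketch of building a link search follows. The function names are hypothetical, no particular search engine's address is assumed, and whether a given engine supports the link: operator should be confirmed in its Help feature. The second function only percent-encodes the query so that it can be placed in a web address.

```python
from urllib.parse import quote_plus

def link_query(target):
    """Build the query text for a link search on a page or domain."""
    return f"link:{target.strip()}"

def as_url_parameter(query):
    """Percent-encode query text so it can be placed in a web address."""
    return quote_plus(query)
```

For example, link_query("microsoft.com") produces the text link:microsoft.com, and as_url_parameter applied to that text encodes the colon so the query can be passed safely as part of an address.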
Proceed to Chapter 5: The Deep Web
