
 
 
   
 
2. Internet Investigation 101
 

In this chapter:

Introduction

Data vs. Information

Information Classification

Caveats

Validating Internet Sources

 

Introduction

The Internet is the largest data source on the planet, and it can be a superb research tool when used correctly. The problem most people have when they begin to conduct research on the Internet is that they don’t know where to start. The key to efficient research in any medium is information management: parsing out the useful data while avoiding superfluous information. This process becomes more difficult as the information source gets larger. Thus, the Internet can seem like an insurmountable challenge to a novice researcher.

This handbook is designed to be a resource for investigators when conducting online research in relation to disability insurance claim fraud investigations. It is not a definitive guide on public records, research methods, or investigative strategy. This guide is designed to give investigators the background and tools they need to effectively gather online information during a fraud investigation.


Data vs. Information

Many people use the terms “data” and “information” as synonyms. However, the two differ, and the difference is fundamental to conducting Internet research. Data consist of facts. Investigators are accustomed to gathering and reviewing all the available facts. Information results from the analysis and interpretation of data (Dictionary.com, N.D.). Under this definition, data alone have no meaning. Data are only valuable when they are interpreted, combined with context, and thus turned into information. As you read through this handbook, remember the following formula:

Data + Context + Analysis = Information

Internet researchers sometimes focus on data without spending sufficient effort identifying context or conducting analysis. Internet sources can help with each of these items, and all three are required to obtain useful and actionable information. Information can be divided into the three major categories (Individual Reference Services Group, 1997) described in the section below.


Information Classification

Publicly available information is defined as information about an individual that is available to the general public from non-governmental sources such as telephone directories, classified ads, newspaper reports, publications, or other forms of information (Individual Reference Services Group, 1997). An investigator does not need any special permission or authorization to obtain this type of information. Additionally, a large amount of publicly available information is accessible for free on the Internet. Many traditional hard copy sources of publicly available information, such as newspapers and phonebooks, now have online counterparts. The advantage of the online version is that a user is able to search the contents of many documents at once instead of scouring pages of articles or numbers by hand. Nearly all major newspapers and many local newspapers let users search their article archives for free. Searching online for a claimant’s name or address in a local paper can provide excellent results. For more information, see the section on News Information.

Public record information is defined as information about or related to an individual which has been obtained originally from the records of a federal, state, or local governmental entity that are open for public inspection (Individual Reference Services Group, 1997). Depending on state statutes, many of the records collected by government agencies are available for public viewing. Property deeds, motor vehicle records, liens, judgments, business credit reports, and professional licenses are some examples of public records. Most of these records can be obtained without special authorization or notification to the claimant. Public record information is available from a variety of sources, the most reliable of which is the record generator: the government agency that created the record. Other sources include data aggregators, who collect public records from a variety of primary sources and combine the information in easily accessible databases. Services like ChoicePoint and Lexis-Nexis provide this type of information. These services are less reliable than the original source because they must collect the records and update their databases, and updates can range from hourly to annually. However, the advantage of these services is that a variety of information is available in a single place, usually within seconds, whereas getting data from a government office can take weeks or months. These services are fee-based and usually require a subscription. There are, however, some public records that have been made available on the Internet free of charge. In most cases, a government office has decided to provide Internet access to a database under its control. Investigators can take advantage of these free services, but caution should be used. Free services are notorious for being out of date or otherwise disabled. Information obtained via a free Internet service should always be verified by another reputable independent source. For more information about public records, see the section on Records Research and review the section below on Validating Internet Sources.

Non-public information is defined as information about an individual that is of a private nature and neither available to the general public nor obtainable from a public record (Individual Reference Services Group, 1997). Examples of non-public information are medical records, tax records, employment history, and personal credit reports. In order to legally obtain this information, an investigator must have authorization from the claimant and, for credit reports, a legitimate permissible purpose under the Fair Credit Reporting Act (2002). Non-public information is generally available through a limited number of sources. Examples of these are: the individual claimant; the agency, organization, or company that collects the information, such as the Internal Revenue Service, a doctor, or a credit agency; or a designated third party such as an attorney.


Caveats

The primary use for the types of records described above is independent verification of claimant-provided data. These sources can also be used when the claimant is hesitant to provide this type of information on his or her own. In this case, multiple independent sources should be consulted where possible. The validity, credibility, and integrity of information are increased dramatically when multiple independent corroborating sources are used.

One of the main reasons for the explosive and continual growth of the Internet is the lack of regulation and control over it. Anyone in the world can post information to the Internet or set up a website (Cohen, 2003, Conducting). This is what makes the Internet the wonderful resource that it is. Unfortunately, it is also a cause for concern when using the Internet as a research tool. It is often difficult to verify information obtained from the Internet. One person with an outdated home computer can easily make a website that looks as professional as that of a multibillion-dollar corporation. If someone were to purposely misrepresent themselves, a cursory examination would be unlikely to uncover the subterfuge. It is extremely important to verify any information obtained from an Internet source. Investigators should be prepared to deal with three key issues when conducting research using free Internet sources:

Information Overload. There is a huge volume of information available on the Internet and even seemingly obscure topics can have large numbers of relevant websites or documents. Investigators need to be able to evaluate this information expeditiously.

Validity. Misinformation abounds on the Internet. Urban legends, fictitious accounts, ghostwriters, and various types of misleading information can confuse the researcher. Since anyone can publish information on the Web, it is important to validate everything. When validating Internet sources, the best method is to use a non-Internet source.

Dated Material. The Internet has the ability to deliver information at lightning speed. In the business world today, information flow is quick and constant. Users have grown accustomed to this rapid delivery, and there is now a notion that all information on the Internet is current, up-to-date, and cutting edge. This is a dangerous and incorrect assumption. The Web is brimming with outdated articles and sites (Jesdanun, 2003). Be careful when reviewing Internet-provided information and make sure to verify the dates! The fact that a source displays a date does not mean that the date is correct. Attempt to verify this information by using the techniques described below.


Validating Internet Sources

There are several strategies that investigators can employ when evaluating information found on the Internet.

Analyze the Domain. Domains themselves may provide validation information about the source. Sites published under their own registered domain name are generally more reliable than pages hosted in user communities. For example, www.realbusiness.com is likely to be more reputable than www.geocities.com/~johnsmith/useless/index.html (Barker, 2003, Evaluating). Also, take note of the domain type, which is the top-level domain (TLD) at the end of the domain name. Common types are .com, .org, and .edu. For more information, see Anatomy of a URL.
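
The domain check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name analyze_url, the COMMUNITY_HOSTS list, and the rule that a path containing "/~" marks a personal page are all hypothetical choices made for this example, not an established test of reliability.

```python
from urllib.parse import urlparse

# Hypothetical, illustrative list of "user community" hosting sites.
# A real investigation would need a much longer, maintained list.
COMMUNITY_HOSTS = {"www.geocities.com", "geocities.com"}


def analyze_url(url):
    """Return basic facts about a URL that an investigator might note."""
    # urlparse only recognizes the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    # The domain type (TLD) is the last dot-separated label of the host.
    tld = "." + host.rsplit(".", 1)[-1] if "." in host else ""
    # Assumption: a known community host or a "/~user" path suggests
    # a personal page rather than a site with its own domain.
    personal_page = host in COMMUNITY_HOSTS or "/~" in parts.path
    return {"host": host, "tld": tld, "personal_page": personal_page}


print(analyze_url("www.realbusiness.com"))
print(analyze_url("www.geocities.com/~johnsmith/useless/index.html"))
```

With the two example addresses from the text, the first is reported as a .com site with its own domain and the second is flagged as a personal page. The result is a prompt for further checking, not a verdict on the source.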

Determine the Author. Not all information posted online contains a byline or other indication of who the author is. It is important for an investigator to know where the information is coming from and whether or not that source is reliable. Look for attributions in headings, titles, or at the end of a document. For more tips, see Determining Domain Ownership. Also, when reviewing websites of companies or organizations, look for brick-and-mortar contact information. Website authors who provide brick-and-mortar contact information, such as a physical address and phone number, may be more reliable. Information provided by a verifiable authority, such as the Secretary of State, is more reliable than information from an aggregator or reseller.

Identify Bias. Based on the analysis of the domain, determine any bias that the author may have (Kirk, 2002). Domains ending in .com generally belong to for-profit corporations. Does the information contain biased marketing hype? Domains ending in .edu are reserved for educational institutions. Has the information been posted by a reputable professor or an undergraduate student? Researchers should ask themselves why this person published this material online. Identifying the motivation for publishing material can help identify possible bias. Biased information is not necessarily bad, but the context must be correctly understood and incorporated into any decision made based on that information.
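
As a rough illustration of the questions above, the following Python sketch maps a domain type to the question an investigator might ask. The names BIAS_PROMPTS, DEFAULT_PROMPT, and bias_prompt are hypothetical, and only the two domain types discussed in the text are covered; everything else falls back to the general question about motivation.

```python
# Questions taken from the text above, keyed by domain type.
BIAS_PROMPTS = {
    ".com": "Likely a for-profit corporation: does the information contain marketing hype?",
    ".edu": "Educational institution: was this posted by a reputable professor or an undergraduate student?",
}

# Fallback question for any domain type not listed above.
DEFAULT_PROMPT = "Why did this person publish this material online?"


def bias_prompt(hostname):
    """Return the bias question suggested by a hostname's domain type."""
    host = hostname.lower().rstrip(".")
    for tld, prompt in BIAS_PROMPTS.items():
        if host.endswith(tld):
            return prompt
    return DEFAULT_PROMPT


print(bias_prompt("www.realbusiness.com"))
```

The lookup only suggests where to start. The investigator still has to judge the author's motivation from the content itself.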

Scan the Page Perimeter. When seeking identifying information on a website, look to the perimeter of the pages within that site (Barker, 2003, Evaluating). A standard website format will include some identifying banner across the top of the page, links to other pages on the left-hand or right-hand side, and miscellaneous details at the bottom of the page. Often, links to privacy policies, service listings, resumes, or other useful material will be hidden at the bottom of a page. Don't forget to scroll down!

Additional Methods. There are several other ways to validate Internet resources. Contacting the source directly by phone, email, or snail mail may provide the authentication required. Also, use the advanced features of some search engines to gauge the popularity of a website. For example, searching for link:www.microsoft.com with Google will show sites that link to Microsoft.com. If reputable sites have chosen to link to the target site, this may be a sign of credibility (Barker, 2003, Evaluating). Searching for an author or company name in a major search engine may also yield information that validates the credibility of the source. For more information about how to do this, see Searching the Web.


 


 

   
  © 2003-2004 James D. Ruotolo.  All rights reserved.

last updated December, 2003