The Curious Spy
A Novel Approach to Open Source Intelligence (OSINT)
By: Thomas V. Sobczak PhD, Owner
Thomas V. Sobczak Consultants, Baldwin, N.Y.
Two hundred thirty (230) years ago, as our nation was forming, the concept of a Citizen
Soldier was both necessary and accepted. As the United States evolved, technology
replaced the early weapons that required citizen participation. Today, America's enemies
(fascist, totalitarian, terrorist) use purloined American technology against our interests.
This creates a need for the Curious Spy. Currently, curious data miners examine
technology and its uses without a focus. Solutions produced
by contractors do not incorporate the novel solutions defined by the curious. Note: In the
1980s the Curious were known as hackers and phreakers. These were people who
carried curiosity to an extreme bordering on the illegal.
In 2008, the identified intelligence budget is said to be approximately $46.4 billion.
Should the government choose to involve curious researchers in a managed environment,
it might harness the genius of hundreds of unfocused professional searchers capable
of generating open source intelligence located on the Internet, in Bulletin Boards and on
accessible general purpose computers. Using one hundredth of one percent (about
$4,600,000) of the supposed budget, a lean management team could be created to manage,
catalog, distribute and refine results accruing from topical research piqued by curiosity.
Each searcher chosen, when vetted, could be provided a high end, fully equipped Personal
Computer incorporating video conferencing, television, and DVD motion picture screening,
integrated operating system software, a professional office suite and a
Government-sponsored connection to the Internet. Those who chose to examine Bulletin
Board Systems and FTP sites (university and corporate) would invoice the lean management
team for the direct dial telephone costs. The chosen could sign an agreement transferring
title to concluded searches to the Government during the initial two years of researching
specific (sponsor suggested) topics. Ownership of the equipment provided would transfer to
the researcher after two years if the conditions of the agreement were met. Salary and
expenses could be negotiated.
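The share of the budget discussed above can be checked with a short calculation. The following is a minimal sketch that assumes only the $46.4 billion figure quoted in the text; the function and variable names are ours and purely illustrative.

```python
from fractions import Fraction

# Assumption: the 2008 intelligence budget figure quoted in the text.
BUDGET_DOLLARS = 46_400_000_000


def share_of_budget(percent: Fraction) -> Fraction:
    """Return the dollar amount equal to `percent` percent of the budget."""
    return BUDGET_DOLLARS * percent / 100


# Two small fractions of the budget, computed exactly with rational numbers.
one_hundredth_percent = share_of_budget(Fraction(1, 100))
one_thousandth_percent = share_of_budget(Fraction(1, 1000))

print(f"0.01% of the budget:  ${int(one_hundredth_percent):,}")
print(f"0.001% of the budget: ${int(one_thousandth_percent):,}")
```

A figure near $4.6 million corresponds to one hundredth of one percent of the quoted budget; one thousandth of one percent would be ten times smaller.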
Most studies being performed today by curious professionals are not cataloged. Should
a secret be located, rather than eliminate the insecurity, the secret is shared. Secrets
become the currency of the expert searcher. America loses its advantage.
Based upon past experience, a general search opens vistas (not necessarily open to the
public) that narrow and focus into areas thought to be secure. As an example:
Shipbuilding                                           Identified
PEO Ships, Crane Indiana                               Identified
Surface Force Enterprise                               Identified
NAVSEA                                                 Identified
Littoral Combat Ship                                   Identified
SPAWAR Mission Package                                 Identified
Naval Facility Panama City - Ocean Mines               Identified
EDOCORP - Unmanned Mine Detection Vehicle              Identified
(Extrapolation) RF Frequencies                         Identified
Yet to be specified counter force that protects the operation of the UMDV
During the nine iterations many other topics and miscellaneous facts would be located.
As an example, did you know that a foundry on Taiwan creating chips for the JTRS has
a sister factory in Shanghai operated by the Army of the People's Republic? They appear
to have copies of the NASA created algorithms that allow inter-service communication.
Defining a Universe of Data Sources
In February 2008 there were approximately 371.5 million personal computers in use
worldwide that connect to the Internet. Users requested answers to 74.4 billion questions
annually from 55 unique search engines. A typical searcher requests eight searches per
session. A session review, depending on the difficulty of the topic, lasts three to ten hours.
On average 214.8 million searches occur daily.
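The figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are ours. Dividing the annual total by 365 gives a daily figure of roughly 203.8 million, somewhat below the 214.8 million quoted, which may mean the two figures were drawn from different sources or periods.

```python
# Figures quoted in the text (February 2008).
PERSONAL_COMPUTERS = 371_500_000       # Internet-connected PCs worldwide
ANNUAL_SEARCHES = 74_400_000_000       # questions put to search engines per year
QUOTED_DAILY_SEARCHES = 214_800_000    # daily average as quoted
SEARCHES_PER_SESSION = 8               # typical searches in one session

# Daily average implied by the annual total, assuming a 365-day year.
daily_from_annual = ANNUAL_SEARCHES / 365

# Average number of searches per connected PC per year.
searches_per_pc = ANNUAL_SEARCHES / PERSONAL_COMPUTERS

# Sessions per day implied by the quoted daily figure.
sessions_per_day = QUOTED_DAILY_SEARCHES / SEARCHES_PER_SESSION

print(f"Daily searches implied by annual total: {daily_from_annual:,.0f}")
print(f"Searches per connected PC per year:     {searches_per_pc:,.1f}")
print(f"Sessions per day at eight per session:  {sessions_per_day:,.0f}")
```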
Expert searchers organize the studies they conduct. They create data pipes to monitor the
results produced by the search engine that was associated with the subject search. In
most cases, the pipe identifies and catalogs the topic reference and website down to the
file or page containing the sought after data. The pipe makes data collection a valued tool
for topic management. Depending upon the specificity of the search keywords and data
collected, a search might poll, and the Curious Spy examine, between two and one
thousand web sites before the results satisfy. Some searches identify Bulletin Boards.
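The cataloging role of a data pipe described above can be illustrated with a small sketch. This is not any particular searcher's tool: the class names, fields, and the example.org addresses are hypothetical, chosen only to show how a result might be recorded by topic, search engine, web site, and page.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class CatalogEntry:
    topic: str   # search topic the result belongs to
    engine: str  # search engine that produced the result
    site: str    # host name of the web site
    page: str    # path to the file or page holding the data


def catalog_result(topic: str, engine: str, url: str) -> CatalogEntry:
    """Split a result URL into site and page and wrap it in a catalog entry."""
    parts = urlsplit(url)
    return CatalogEntry(topic, engine, parts.netloc.lower(), parts.path or "/")


class DataPipe:
    """Hypothetical in-memory catalog of search results, grouped by topic."""

    def __init__(self) -> None:
        self._by_topic = defaultdict(set)

    def add(self, topic: str, engine: str, url: str) -> CatalogEntry:
        entry = catalog_result(topic, engine, url)
        self._by_topic[topic].add(entry)  # duplicate results collapse in the set
        return entry

    def sites(self, topic: str) -> list:
        """Return the distinct web sites recorded for a topic, sorted by name."""
        return sorted({entry.site for entry in self._by_topic[topic]})


# Usage with placeholder addresses (not real data sources).
pipe = DataPipe()
first = pipe.add("shipbuilding definitions", "clusty",
                 "http://www.example.org/docs/glossary.html")
pipe.add("shipbuilding definitions", "google", "http://archive.example.org/")
print(first.site, first.page)
print(pipe.sites("shipbuilding definitions"))
```

Keeping the page path alongside the site is what lets the catalog point back to the exact file that satisfied the search.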
During the 1980s approximately 900,000 bulletin boards were active in the United States.
Another 50,000 identified by the user community functioned outside the United States.
Each Bulletin Board was unique. It was reached using a direct dial telephone number.
About one half of one percent were full time, for profit, sites. These sites were massive,
holding more than one dozen topical areas of 300 or more Gigabytes. Another three
percent of Bulletin Boards were full time operated by universities, professional
organizations and not for profit corporations. The majority of functioning Bulletin Boards
were part time operations created by hobbyists, college, and high school students. In 2008
approximately 27,000 Bulletin Board Systems remain. They are divided equally between
the United States and the remainder of the world.
In a previous study, available at http://sobczaksays.org/DM Pecking Order, the reader can
obtain an in-depth view of the type and structure of Bulletin Board System Users. Most
Bulletin Boards had a log-in member area. This area was used to both communicate and
share search results. A researcher will find that full time Boards were unknowingly hosted
on government and commercial computers. As an example: a Bulletin Board specific to
aircraft and aircraft radio frequencies was hosted unknowingly by the Fire Department of
the City of New York (FDNY). That Board was a focal point upon which interested
searchers posted results of searching, identified alternate Boards with points of contact,
and made comparisons of competitive Bulletin Boards. In a like manner, Military RF
frequencies were cataloged on a computer located at Navy Base Norfolk. The Navy
computer was operated by bored reservists.
In 1985 accessing a commercial computer required a knowledge of IBM operating systems
and/or UNIX (an operating system with its own command and scripting language). In the
free world approximately ten million commercial computers were available to those who knew
how to access them. Lazy programmers left dozens of back doors and trapdoors in
thousands of supposedly secure large scale computers.
In 2002 Sobczak attempted to define the universe of general interest items archived on
private commercial devices. These included items that were later transferred to a bulletin
board archive on a large computer. Items unique to 1980s BBS archives were:
Newsletters                  40,000 titles
Scholarly periodicals        37,600 titles
Books and Catalogs           950,000 titles
Technical Periodicals        70,000 titles
Mass Media Periodicals       900,000 titles
Newspapers                   25,000 titles
Recycled Soft Disks          56 million annually
Office Documents             11 billion pages
Digital Tapes                5 million
e-mail archives              44 million disks and tapes
These remain available as an accessible and preserved archive. Searchers can access
74,000 e-mail messages archived on an Army server at Fort Monmouth, New Jersey
(federal laws require that the e-mails be archived). By the mid 1990s the majority of
information previously located on General Use Computers had been transferred to, and
was partially archived on, Internet-based computers. The Commercial Computer, besides
being a pre-Internet archive, remains a place for hackers to learn and practice their skills.
Looking at data repositories from a different point of view, the reader should note that the
free world has 38.5 Internet-connected machines for every single machine in
aggressor nations (China, Russia, Korea, et al.). The volume of free world connected
information makes it easy for an aggressor to search for secrets. Conversely, a Free
World user has little if any access to information in an aggressor nation. He must work
harder to obtain a validated result. For the reader's information, there are many
individuals, functioning outside our government, capable of accessing aggressor devices.
Digging for Data
Most individuals who search do so because of topical curiosity. They read about or see
news that mentions a term or idea that is unfamiliar. They try to learn about the topic by
searching. They use a search engine to further define the topic. This innocent act leads
to more in depth searching. (Attachment A identifies 56 "free to use" Search Engines). As
the search narrows and the sought after information is located, curiosity produces more
specific explanations that tend to define proprietary information and secrets. Most times
the novice searcher does not recognize the potential secret revealed.
The Third Infantry Division After Action Report (AAR) of its Iraq deployment, which an
unnamed and unauthorized individual posted on the Internet, mentions "Brigade Level
Intelligence." In a search to define that term, we became curious about a weapon system
called "Prophet" that was the state of the art for the war fighter. A Google search of the
word "Prophet" produced 17,000 hits. Ninety-seven percent (97%) identified the prophet
Mohammed. Linking the term Prophet with the term Intelligence reduced the number of hits
to 300. Reviewing the search results produced the potential for 174 new searches. Should
the reader want to experiment, he might search for the mention of, and the copy defining,
the single US Army document that funds a million dollar sole source study contract geared
to a specific large contractor with a satellite office outside Fort Monmouth, New Jersey.
To determine the ease of conducting topical searches we recommend using the same
exact term with four search engines. The four with the largest returns are Google,
Copernic, Web Ferret and Clusty. Proper structuring of search terms goes a long way to
assure suitable results.
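The effect of structuring search terms, and of running the same terms against several engines, can be sketched offline. The engine names and hit titles below are invented placeholders, not real search results; the sketch only shows how adding a second required term shrinks the set of hits from each engine.

```python
def matches_all(text: str, terms: list[str]) -> bool:
    """True when every term occurs in the text, ignoring case."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


def narrow(hits: list[str], terms: list[str]) -> list[str]:
    """Keep only the hits that contain all of the given terms."""
    return [hit for hit in hits if matches_all(hit, terms)]


def counts_by_engine(hits_by_engine: dict[str, list[str]],
                     terms: list[str]) -> dict[str, int]:
    """Count the matching hits returned by each engine for the same terms."""
    return {engine: len(narrow(hits, terms))
            for engine, hits in hits_by_engine.items()}


# Hypothetical hit titles keyed by hypothetical engine names.
SAMPLE_HITS = {
    "engine_a": [
        "Prophet system and brigade intelligence",
        "Sayings of the prophet",
        "Intelligence notes on Prophet",
    ],
    "engine_b": [
        "The prophet in history",
        "Prophet and tactical intelligence",
    ],
}

broad = counts_by_engine(SAMPLE_HITS, ["prophet"])
focused = counts_by_engine(SAMPLE_HITS, ["prophet", "intelligence"])
print("One term: ", broad)
print("Two terms:", focused)
```

Comparing the per-engine counts for identical terms is a quick way to see which engine returns the most material for a given topic.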
Curiosity Searching: an Example
Curiosity sparked a search for "shipbuilding Definitions." The question arose from a wish
to define the terms that caused the US Navy and Lockheed-Martin to dispute interpretations
in a contract for a Littoral Combat Ship (LCS). It appeared that the Lockheed contract was
cancelled because each party to the contract interpreted the requirements of the contract
differently. A web site identified by the Clusty search engine was a potential data source.
The site was located in Thailand (xxx.com.th). A review of the site identified a BLOG that
contained detailed technical information and lists of web sites that hosted information
about LCS. Curiosity caused the narrowing of the search. Why would technical detail
concerning a yet to be launched, state of the art, Naval Vessel be located on a web site
in Thailand?
The BLOG was interesting as it was geared specifically to the Littoral Combat Ship. Only
five individuals contributed to the shared information. The BLOG contained information
and verifiable data sources at NAVSEA, SPAWAR, SWE, PEO-Ships, Naval Facility -
Panama City Florida, and EDOCORP among its many references. Access did not evoke
any security warnings. None of the five participants identified in the BLOG was
American. The participants were from Great Britain (IWAR), Germany (UKA), Cuba
(AMIGOSdePIAS), Qatar (no name) and the Netherlands (no name). Individuals
contributing information appeared to be relating data about the Littoral Combat Vessel
and its technology, obtained from an individual who was currently active Navy.
Reading the BLOG led to a further narrowing of the search. The bloggers' interest centered
on a SPAWAR Mission Package for an unmanned ocean mine detection system. The
developer was EDOCORP, with references to Northrop Grumman. The US Navy program
manager was located at Panama City, Florida. The BLOG exchanges concentrated on the
location of an RF transmitter and receiver on the upper yardarm, on the
starboard frame (14-30 MHz receiver and 225-400 MHz transmitter). The goal of their
discussion was to devise a means to disrupt RF transmissions between the LCS and
the unmanned vehicle and thereby sabotage remote controls.
The unmanned system specification, including a possible companion product (the
helicopter-based OASIS), was detailed with web site data sources referenced. Web sites
mentioned were at NAVSEA, SPAWAR, and PEO-Ships, among many identified. This OSINT
data was being shared outside the control of information security. It came from people
with access to it who exhibit poor security habits. Navy procurement executives appear
to be preparing for life after retirement by sharing information in anticipation of a job.
Curiosity Causes an Aggressive Response
After following the BLOG participants for a time, I asked for clarification of a BLOG
comment. A message came back from the BLOG web master asking who, what, why and how.
I explained how curiosity drove my search. As we exchanged postings, my
computer was attacked. At first, a simple format-the-hard-drive command was rejected by
my home grown security. Next, a virus was embedded in a posting. Fortunately, I had
conceived and still use a security system that identifies changes and differences to files
and folders on my hard drives. Locating the virus chain allowed for its eradication.
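The text does not describe how the author's security system works internally. As a generic illustration of the idea of detecting changes to files and folders, and not a reconstruction of that system, the sketch below records a SHA-256 digest for every file under a folder and compares two such snapshots. All function names are ours.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under `root` (by relative path) to its digest."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files that were added, removed, or changed between snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(name for name in before.keys() & after.keys()
                          if before[name] != after[name]),
    }
```

In use, a baseline snapshot would be taken while the machine is known to be clean, and later snapshots compared against it; any file listed as added or changed without explanation is a candidate for inspection.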
For the next few days, monitoring my computer did not unearth problems. The next attack
occurred about one week after the first. It was an electrical surge caused by dumping
several capacitors simultaneously through my connection. Before I realized that an attack
was in progress my sound card was fried. I lost telephone service for a day. It made me
believe circuits were fried along the route to my computer. Years of studying the
hacker/phreaker community allowed me to collect tools and countermeasures. My
computer had been modified to dissipate a surge. In the 1980s we played similar
dangerous games. My security tool functions like a lightning rod on your roof to dissipate
the charge.
The web site hosting the BLOG disappeared. My attempts to locate it failed. What
succeeded was the use of an older computer running a third generation operating system
working through a different ISP. Evaluation showed my original computer was logged and
cataloged when the initial contact occurred. The BLOG's web site was made unavailable
by the web master's initiative.
Using an old hacker trace program, I followed multiple paths to the real home of the
aggressive web site. It was on Hong Kong Island. The indicated server belonged to the
Port Manager of the Naval Service of the Army of the People's Republic of China.
I reported my experiences to NAVSEA, SWE, and SPAWAR. No action was taken.
Periodically I check to see if any actions are taken to close the open doors that were
previously available for browsing. The doors remain open. I am forced to conclude that
the National Security bureaucracy is an "old boy" network that does not believe or trust an
outsider.
Conclusion
The suggested use of qualified curious Americans to spy upon the universe of freely
available information is a cost-effective means to identify national insecurity and offer
recommendations necessary to eliminate information losses. A minimum expenditure of
funds will allow the Intelligence Community to obtain an information source they currently
ignore: the professional curious searcher. Someone might tell Mr. Rockefeller that his
Intelligence Committee should recognize the curious patriot. He could save the taxpayer
a great deal of money while tightening security.
Attachment A
A survey of freely available Search Engines
(All web sites are http://www unless otherwise specified)
A
about.com
accoona.com
accessmylibrary.com
alexa.com
altavista.digital.com
alltheweb.com
i2inc.com (analyst's notebook)
ftp-sites.otg/anonymous_ftp_sites_list_net/html (Anonymous FTP Sites)
ask.com
B
bigfoot.com
botspot.com
intelliseek.com (Bullseye)
C
clusty.com
copernic.com
cuil.com
D
deja.com
digg.com
dogpile.com
E
euroseek.net
excite.com
F
FAS.org
http://call.army.mil/call/fmso/finso.html
four11.com
thefreedictionary.com
G
google.com
H
highway61.com
hotbot.com
I
infogist.com
infoseek.com
isysdev.com
ixquick.com
K
kartoo.com
L
lexibot.com
lycos.com
M
metacrawler.com
search.msn.com (MSN Search Engine)
mooter.com
N
nbci.com (Formerly snap.com)
northernlight.com
O
dmoz.org (Open Directory Project)
digital.library.upenn.edu/books/
os-mosis.com (OS-Mosis)
P
pipes.yahoo.com/pipes/
profusion.com
S
kryltech.com (Subject Search Spider)
systransoft.com (language translator)
T
technorati.com
tenmax.com (allows collection for working off line)
W
ferretsoft.com (Web Ferret)
wisdombuilder.com
webcrawler.com
wisenut.com
Y
yahoo.com
Z
zylab.com