Five Online Challenges Facing Detectives and How to Overcome Them

Johnmichael O’Hare

Online sources open a new world of information which can help detectives find threat actors, speed up investigations and protect lives. But, there’s a catch.

As part of any successful investigation, police departments must effectively collect, ingest and analyze vast amounts of data. Fortunately, a data management strategy and supporting technologies provide a way to tame the data explosion.

These are the top online challenges law enforcement investigators face and how they work around them: 

Big Data and Data Analysis

Information management was arduous enough when investigators relied mostly on paper documents housed in filing cabinets. Online sources have significantly increased the amount of data potentially available to aid in an investigation. One terabyte of data – hardly a remarkable amount by today’s standards – can contain more than 80 million document pages. The paper equivalent would take up an absurd number of filing cabinets.

That’s the essence of the so-called big data problem: sifting through an overabundance of data to identify the pivotal, high-value information. And it’s not just the volume of data that’s the issue. In addition to structured data such as database records, organizations must also cope with unstructured data such as digital photos, video, social asset data, and the aforementioned documents. Then there’s the gray area of semi-structured datasets, which lack the rigid structure of a database record but include elements, such as tags, that describe a document’s content.

Police detectives need both a systematic management approach and technology to process big data. On the management side, they require compliance policies that govern online investigations, keeping in mind the salient regulations and legal principles.

The technical underpinnings of online investigations include a repository for storing large volumes of data and specialized tools for searching the data store and analyzing the information. Natural Language Processing (NLP), a type of artificial intelligence (AI), is critical here. NLP converts human language, whether text or spoken word, into a format a computer can process.

With NLP, investigators can query the database using code words, jargon, hashtags, and keywords associated with threat actors, groups or activities – local terminology for narcotics sold within a jurisdiction, for example. This AI-based approach lets organizations sift through terabytes of data to find the information they need to advance an investigation.
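As a rough illustration of the querying idea above, the sketch below scans a handful of posts for watch-list terms. It is plain case-insensitive term matching, which is far cruder than NLP, and everything in it is assumed for the example: the watch-list terms, the post IDs and the function names are invented and do not come from any real tool or case.

```python
import re
from collections import defaultdict

# Hypothetical watch list, invented for illustration only.
WATCH_TERMS = ["blue dolphin", "#examplecrew"]


def build_pattern(terms):
    """Compile one case-insensitive pattern that matches any watch-list term."""
    # Longest terms first so multiword phrases win over shorter overlaps.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    # Lookarounds rather than \b, so terms that start with '#' still match.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def find_hits(posts, terms):
    """Return {matched term (lowercased): [post IDs in which it appears]}."""
    pattern = build_pattern(terms)
    hits = defaultdict(list)
    for post_id, text in posts:
        for match in pattern.finditer(text):
            hits[match.group(0).lower()].append(post_id)
    return dict(hits)


if __name__ == "__main__":
    sample_posts = [
        ("p1", "Got Blue Dolphin tonight"),
        ("p2", "nothing to see here"),
        ("p3", "#ExampleCrew meet at 9, blue dolphin ready"),
    ]
    for term, post_ids in find_hits(sample_posts, WATCH_TERMS).items():
        print(term, post_ids)
```

A production system would also need to handle misspellings, slang that drifts over time and terms whose meaning depends on context, which is where NLP earns its keep.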

De-anonymization of Online Threat Actors Across the Deep and Dark Web

When threat actors conduct business online, they enjoy a certain level of anonymity. The surface Web – the commonly used layer of social assets and indexed Web sites – provides the ability to use “handles” or create fake accounts to mask identity. This basic anonymity intensifies in the Web’s deep and dark layers, which are not indexed by conventional search engines. The dark Web, in particular, enables sophisticated threat actors to conceal themselves using anonymizing routers and proxy servers, for example. Threat actors on the dark Web may traffic in stolen credit card data, sell illicit drugs or cultivate extremism. The scale of the various Web layers – featuring some one billion-plus Web sites – coupled with various cloaking approaches, complicates online investigations. Where do detectives start their investigation to positively identify crooked individuals?

Threat actors, however, leave digital footprints as they traverse the Web’s layers: a dark Web site linked to a social asset on the surface Web or an encryption key assigned to a regular e-mail account, for instance. De-anonymization boils down to connecting the dots among bits of information gleaned from the surface Web and the online world’s subterranean tiers. But to succeed in identifying threat actors, investigators will need knowledge of the dark Web, relevant Web intelligence (WEBINT) techniques and a specialized browser that can access hidden sites, forums and marketplaces. As with combing through big data, investigators should consider enlisting AI to help piece together identities. AI and the related field of machine learning can help law enforcement agencies correlate the bits of information that surface during an investigation, assisting with de-anonymization. Investigative experience and intuition remain paramount, but AI technology can extend those human capabilities.
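One simple way to picture “connecting the dots” is to group accounts that share an identifier, directly or through a chain of other accounts. The sketch below does this with a union-find structure. It is a minimal illustration, not a description of any vendor’s method: the account names, the identifier strings and the function name link_accounts are all hypothetical, and a shared identifier is only a lead for an investigator to verify, not a positive identification.

```python
from collections import defaultdict


def link_accounts(records):
    """Group accounts that share at least one identifier, directly or transitively.

    records maps an account name to the set of identifiers observed with it
    (for example a handle or an encryption key fingerprint).
    """
    parent = {}

    def find(account):
        # Follow parent links to the group's representative, shortening the path.
        parent.setdefault(account, account)
        while parent[account] != account:
            parent[account] = parent[parent[account]]
            account = parent[account]
        return account

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # identifier -> first account observed using it
    for account, identifiers in records.items():
        find(account)  # register the account even if it links to nothing
        for identifier in identifiers:
            if identifier in first_seen:
                union(first_seen[identifier], account)
            else:
                first_seen[identifier] = account

    groups = defaultdict(set)
    for account in records:
        groups[find(account)].add(account)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = {
        "darkmarket:vendor77": {"pgp:AB12"},
        "email:a@example.com": {"pgp:AB12", "handle:ghost"},
        "social:ghost_77": {"handle:ghost"},
        "social:unrelated": {"handle:other"},
    }
    for group in link_accounts(sample):
        print(group)
```

In the sample data, the dark Web vendor and the social account never share an identifier directly; they end up in the same group only because the e-mail account bridges them, which is the kind of indirect link that is easy to miss by hand.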

Time and Accuracy of Investigations

Timeliness and the reliability of information are top concerns in any investigation. Online inquiries introduce some considerations specific to electronic data gathering and analysis, chief among them volume. Finding actionable, accurate intelligence in a sea of data takes time and effort. Resource-strapped organizations relying on manual searches and data sleuthing will probably find the job simply takes too long when time is critical. Agency leadership will soon drop support for online investigations that span days and appear to produce nothing. After all, taking too long to find a threat actor could give them time to disappear or result in more crimes being committed.

WEBINT combined with AI, however, can automate searches, dramatically accelerating online investigations while also improving accuracy. This intelligent automation should, ideally, span the surface, as well as deep and dark Web layers. It should also enable investigators to create custom search parameters which could include details such as a threat group’s hashtag, terminology and location data (names of countries, cities, streets, etc.). The ability to program a holistic search and turn it loose across the Web saves time. But, investigative organizations also need the power of intelligent automation to rapidly correlate data. Relying solely on detectives to find the connections among seemingly disparate pieces of data will add hours, if not days, to an investigation. Correlation also helps unmask threat actors, as noted, leading investigators toward the data they need to maintain as evidence.
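The custom search parameters described above can be thought of as a small profile with one set of values per category. The sketch below is a hedged illustration of that idea: the SearchProfile class, its field names and the sample values are assumptions made for this example, and the matching is crude substring comparison (so “drop” would also match “dropped”).

```python
from dataclasses import dataclass, field


@dataclass
class SearchProfile:
    """Hypothetical bundle of custom search parameters for one inquiry."""
    hashtags: set = field(default_factory=set)
    terminology: set = field(default_factory=set)
    locations: set = field(default_factory=set)

    def matched_categories(self, text):
        """Return the sorted names of categories with at least one value in text."""
        lowered = text.lower()
        categories = {
            "hashtags": self.hashtags,
            "terminology": self.terminology,
            "locations": self.locations,
        }
        return sorted(
            name
            for name, values in categories.items()
            if any(value.lower() in lowered for value in values)
        )


def rank_posts(profile, posts):
    """Return IDs of posts matching any category, most categories matched first."""
    scored = [(len(profile.matched_categories(text)), post_id) for post_id, text in posts]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [post_id for score, post_id in ranked if score > 0]


if __name__ == "__main__":
    # Invented values for illustration only.
    profile = SearchProfile(
        hashtags={"#examplecrew"},
        terminology={"drop"},
        locations={"elm street"},
    )
    posts = [
        ("a", "Drop at Elm Street tonight #ExampleCrew"),
        ("b", "meet on elm street"),
        ("c", "lunch plans"),
    ]
    print(rank_posts(profile, posts))
```

Ranking by the number of categories matched is one simple way to push posts that combine a group’s hashtag, its terminology and a location to the top of an investigator’s queue.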

Faster investigation means faster interdiction. When searchers uncover threat actors’ plans – whether an organized retail theft or extremist action – agencies can protect property and, potentially, save lives. Automation also provides business value and return on investment for police departments. Greater investigative efficiency means investigators spend fewer hours on inquiries, which yields proportional cost savings. This ultimately helps create a positive image for police departments. As more criminal cases are solved, the public perception that crime is being managed grows stronger. Agencies can achieve a virtuous circle.

Threat Intelligence

Getting a jump on threat actors’ plans depends on obtaining reliable threat intelligence. The threat intelligence domain aims to proactively acquire information on emerging dangers or crimes which are being planned so police officers can institute preventative strategies and tactics. The practice is often associated with financial institutions fending off cyberattacks, but it also applies to law enforcement agencies.

Indeed, law enforcement departments with poor threat intelligence can be caught off guard. A detective investigating the shipment of illegal firearms could miss “online chatter” between buyers and sellers planning to traffic the weapons and receive payment for the shipment. The problem often stems from a lack of investigative tools or from approaches that are limited in scope. For example, police investigators may maintain good intelligence on individuals who sell illegal firearms at the “street level,” but lack the ability to probe what is happening in the online world where planning, pricing, logistics, and trading are discussed. Similarly, an agency using tools limited to the surface Web will find it difficult to anticipate extremist groups that plan their actions on the dark Web. In both cases, threat intelligence is poor or nonexistent.

Automated WEBINT facilitates threat intelligence. The ability to quickly aggregate and search social asset data, for example, can help expose relationships among threat actors and provide insight into their plans. Dark Web search capabilities offer an additional window into activities in the works. Overall, data-driven threat intelligence can help agencies snuff out problems before they materialize, with obvious benefits for lives and property. Agencies can also make strides toward greater investigative efficiency. Threat intelligence can help investigators prioritize threats and focus their energies on the most pressing risks. That way, agencies can deploy human intelligence (HUMINT) to its greatest effect. That’s a huge plus for agencies facing staffing constraints.

Converting Online Data to Evidence

The end game of an investigation is preserving and presenting evidence which leads to indictments and convictions. Converting data gathered online to evidence requires due diligence to make sure agencies have properly identified the threat actor and the online platform used to make a threat. Investigators will send preservation letters to the relevant platforms, so the information is maintained and safeguarded. The subpoena process then follows.

Detectives often find the conversion process challenging. The issues range from learning how to request data from a hyperscale Web platform to documenting online investigative methods. But WEBINT and intelligent automation can support this task. AI’s precision and ability to construct finely tuned searches build confidence in the trustworthiness of the data, expediting due diligence.

Automation also comes into play at the end of the subpoena process which can result in a data dump of staggering proportions. Investigative agencies should have an automated system on hand for ingesting and processing large data sets. Without such a mechanism, an agency’s online investigation can go to waste.
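To give a sense of what ingesting a large data set in manageable pieces can look like, the sketch below reads records in fixed-size batches rather than loading everything into memory at once. It assumes the data arrives as JSON Lines (one JSON record per line), which is purely an assumption for this example; platforms return data in a variety of formats, and the function name read_batches is invented.

```python
import json
from itertools import islice


def read_batches(lines, batch_size=1000):
    """Yield lists of parsed records, at most batch_size records per list.

    lines can be any iterable of strings, such as an open file object,
    so the full data set never has to sit in memory at once.
    """
    # Lazily parse each non-blank line as one JSON record.
    records = (json.loads(line) for line in lines if line.strip())
    while True:
        batch = list(islice(records, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Invented sample input standing in for a much larger file.
    sample_lines = ['{"id": 1}', "", '{"id": 2}', '{"id": 3}']
    for batch in read_batches(sample_lines, batch_size=2):
        print(len(batch), batch)
```

Each batch could then be handed to whatever indexing or analysis step the agency uses, keeping memory use roughly constant no matter how large the returned data set is.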

Online Investigations Require a Comprehensive, Automated Method

An online investigation can tap a rich store of information previously unavailable to law enforcement agencies. But that potential will remain unrealized without a comprehensive strategy and appropriate technology for gathering and analyzing massive amounts of data. The melding of WEBINT, automation and time-tested HUMINT speeds up investigations and increases confidence in the data generated. AI structures precise queries that provide a wide-angle view of threat actors and improve data quality. And data correlation pieces together informational breadcrumbs to reveal threat actor identities.

Taken together, those techniques and tools help agencies overcome the dual challenge of big data and short deadlines.

Johnmichael O’Hare is the sales and business development director of Cobwebs Technologies. He is the former Commander of the Vice, Intelligence and Narcotics Division for the Hartford (Connecticut) Police Department. Prior to that, he was the Project Developer for the City of Hartford’s Capital City Command Center (C4), a Real-Time Crime Center (RTCC) which reaches throughout Hartford County and beyond. C4 provided real-time and investigative support for local, state and federal law enforcement partners utilizing multiple layers of forensic tools, coupled with data resources and real-time intelligence.