Data mining tells government and business a lot about you | McClatchy Washington Bureau

Robert S. Boyd - Knight Ridder Newspapers

February 01, 2006 03:00 AM

WASHINGTON—You may never have heard the term "data mining," but it's at the core of the argument that's raging over government eavesdropping on Americans. It's also how commercial companies learn about who you are, where you go, what you eat, what you like, what you buy.

Data mining is the process of using computer technology to extract the knowledge that's buried in enormous volumes of undigested information. Trillions of bits of raw data are culled from telephone calls, e-mails, the Internet, airlines, car rentals, stores, credit card records and a myriad of other sources spawned by the information age.

"A lot can be learned about a person through the combination of massive amounts of data and the use of sophisticated analytical techniques," said Daniel Solove, an associate law professor at George Washington University in Washington.

Whenever you search for information or a product on the Internet, say on Google or Yahoo, you leave a trace.

"Every single search you've ever conducted—ever—is stored on a database somewhere," said Tim Wu, a professor at Columbia Law School in New York. "There's probably nothing more embarrassing than the searches we've made."

Once it's been collected, the data harvest is stored, organized, searched and analyzed by complex computer programs called algorithms.

The programs scour the data for hidden patterns or relationships, such as a suspicious number of insurance claims by an individual or repeated phone calls between, for example, Afghanistan and Detroit.
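As a rough illustration of the insurance example, the sketch below counts claims per person and flags anyone above a cutoff. The records, the names and the threshold are all invented for illustration; real fraud-detection systems are far more elaborate than a simple count.

```python
from collections import Counter

# Hypothetical claim records: (claimant, claim_id). All values are invented.
claims = [
    ("alice", 1), ("bob", 2), ("alice", 3), ("carol", 4),
    ("alice", 5), ("bob", 6), ("alice", 7),
]

def flag_frequent_claimants(records, threshold):
    """Return claimants who filed more than `threshold` claims, sorted by name."""
    counts = Counter(name for name, _claim_id in records)
    return sorted(name for name, n in counts.items() if n > threshold)

# The cutoff of 3 is an arbitrary assumption, not a figure from any real program.
flagged = flag_frequent_claimants(claims, threshold=3)
print(flagged)
```

A count like this only surfaces candidates for a closer look; it says nothing about whether any individual claim is fraudulent.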

The Senate Judiciary Committee will open hearings Monday on the Bush administration's use of wiretaps to monitor such calls without a court warrant.

Data mining turns up such potentially meaningful patterns as, say, Person A telephoned B, who e-mailed C, who met with D and E, who rented an apartment together in F-town. Someone at that apartment made a phone call to someone in Country G in the Middle East. Human investigators can take it from there.
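One way to picture that chain of contacts is as a graph in which people, places and phone numbers are nodes and each call, e-mail or meeting is a link. The sketch below uses a breadth-first search to find the shortest chain between two nodes. The contact records mirror the hypothetical letters in the example above, and the function names are illustrative assumptions rather than any agency's actual method.

```python
from collections import deque

# Hypothetical contact records: (party, other party, kind of contact).
contacts = [
    ("A", "B", "phone"),
    ("B", "C", "email"),
    ("C", "D", "meeting"),
    ("C", "E", "meeting"),
    ("D", "F-town apartment", "lease"),
    ("E", "F-town apartment", "lease"),
    ("F-town apartment", "Country G number", "phone"),
]

def build_graph(records):
    """Build an undirected adjacency map from the contact records."""
    graph = {}
    for a, b, _kind in records:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_chain(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes linking
    start to goal, or None if the two are not connected."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):  # sorted for repeatable output
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

chain = shortest_chain(build_graph(contacts), "A", "Country G number")
print(" -> ".join(chain))
```

The search only reports that a chain exists. Whether the link means anything is left to the human investigators.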

Data miners are like gold or diamond miners, who have to burrow through tons of useless material to get the nuggets they want. They couldn't do it without modern computing systems.

"Human analysts with no special tools can no longer make sense of enormous volumes of data," says an advertisement from Megaputer Intelligence Inc., a data-mining firm in Bloomington, Ind. "Data mining automates the process of finding relationships and patterns in raw data."

In the war against terrorism, data mining is a way to "connect the dots," something the government failed to do before the Sept. 11 attacks on the World Trade Center and the Pentagon.

Jeffrey Ullman, a computer scientist who teaches a course on data mining at Stanford University in Palo Alto, Calif., offered a hypothetical example: Suppose you wanted to check a list of 10 suspected evildoers to see if any two of them spent two nights in the same hotel at the same time, perhaps to plot a terrorist attack.

According to Ullman, you'd have to search through at least 250,000 names to spot the suspicious meeting. That's too much for a human analyst but not for a computer.
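A small sketch can show the shape of such a search. It groups hotel records by hotel and night, then reports pairs of people who shared a hotel on at least two nights. The records are invented, and the reading of "two nights" as two distinct hotel-and-night matches is an assumption made here, not a detail taken from Ullman.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical hotel records: (person, hotel, night). All values are invented.
stays = [
    ("p1", "Hotel X", "2006-01-03"),
    ("p2", "Hotel X", "2006-01-03"),
    ("p3", "Hotel Y", "2006-01-03"),
    ("p1", "Hotel X", "2006-01-04"),
    ("p2", "Hotel X", "2006-01-04"),
    ("p3", "Hotel X", "2006-01-04"),
]

def pairs_sharing_hotel(records, min_nights=2):
    """Return pairs of people who were in the same hotel on the same night
    at least `min_nights` times."""
    guests = defaultdict(set)  # (hotel, night) -> people present
    for person, hotel, night in records:
        guests[(hotel, night)].add(person)

    shared = defaultdict(set)  # (person, person) -> matching (hotel, night) keys
    for key, people in guests.items():
        for pair in combinations(sorted(people), 2):
            shared[pair].add(key)

    return sorted(pair for pair, keys in shared.items() if len(keys) >= min_nights)

suspicious = pairs_sharing_hotel(stays)
print(suspicious)
```

Grouping by hotel and night first means the program compares only people who were actually in the same place, rather than every possible pair of names.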

"Through data mining, (government) agencies can quickly and efficiently obtain information on individuals or groups by exploiting large databases containing personal information," the Government Accountability Office, the investigative arm of Congress, said in a report to Congress last year.

"Before data aggregation and data mining came into use, personal information contained in paper records stored at widely dispersed locations, such as courthouses or other government offices, was relatively difficult to gather and analyze," the GAO said.

A GAO survey found almost 200 data-mining programs in operation or planned at 52 government agencies in 2004.

For example, the State Department draws on a Citibank system to detect fraud or waste by employees using government credit cards.

There's a "greatly increased government hunger for private information of all sorts," said Jonathan Zittrain, an expert on the social implications of the Internet at Harvard Law School in Cambridge, Mass. "As such databases grow, the government essentially possesses its own stockpile of the nation's communications on which to perform searches."

A national security data-mining operation might work like this: A search engine—perhaps similar to Google's—monitors phone calls and communications over the Internet, collecting certain key words, such as "bin Laden," "the sheik" or "nuclear plant." It stores the findings in a computer database and looks for links between the key words and other names, places or telephone numbers.
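The keyword step might be sketched as follows: scan each message for terms on a watch list and record which phone numbers are tied to which terms. The watch terms come from the example above; the messages, the phone numbers and the simple substring matching are assumptions for illustration only.

```python
from collections import defaultdict

# Watch terms taken from the example in the text.
WATCH_TERMS = ["bin laden", "the sheik", "nuclear plant"]

# Hypothetical intercepted messages: (phone number, text). All values are invented.
messages = [
    ("555-0101", "Meeting the sheik on Friday"),
    ("555-0102", "Photos of the nuclear plant attached"),
    ("555-0101", "He toured the nuclear plant last week"),
    ("555-0103", "Lunch at noon?"),
]

def index_terms(records, terms):
    """Map each watch term that appears to the sorted numbers that used it."""
    index = defaultdict(set)
    for number, text in records:
        lowered = text.lower()
        for term in terms:
            if term in lowered:  # crude substring match, an assumption
                index[term].add(number)
    return {term: sorted(numbers) for term, numbers in index.items()}

hits = index_terms(messages, WATCH_TERMS)
for term, numbers in hits.items():
    print(term, numbers)
```

Here the number 555-0101 turns up under two different terms, the kind of link the database is meant to expose. Substring matching would also produce many false hits on innocent messages.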

To make sense of the findings, analysts may use a "data visualization" program to create a three-dimensional map, showing the words as hills on a landscape. Higher peaks mean the words appear more frequently. Closer peaks mean the words are related in some fashion.
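The two quantities behind such a map can be approximated in a few lines: how often each word appears (the height of its peak) and how often two words turn up together (how close their peaks sit). The sketch below uses invented word sets and Jaccard similarity as the measure of relatedness; both are assumptions, since the text does not say how any real visualization program computes them.

```python
from collections import Counter

# Hypothetical documents, each reduced to the set of words of interest it contains.
documents = [
    {"sheik", "meeting", "friday"},
    {"sheik", "meeting"},
    {"plant", "photos"},
    {"sheik", "plant"},
]

def term_heights(docs):
    """Count how many documents contain each word (the peak height)."""
    return Counter(term for doc in docs for term in doc)

def jaccard(docs, a, b):
    """Share of documents containing either word that contain both.
    1.0 means the words always appear together, 0.0 means never."""
    with_a = {i for i, doc in enumerate(docs) if a in doc}
    with_b = {i for i, doc in enumerate(docs) if b in doc}
    union = with_a | with_b
    return len(with_a & with_b) / len(union) if union else 0.0

heights = term_heights(documents)
print(heights["sheik"], heights["meeting"])
print(jaccard(documents, "sheik", "meeting"), jaccard(documents, "sheik", "plant"))
```

A plotting program could then place words with a higher similarity score closer together, for example by using one minus the similarity as the distance between peaks.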

Data-mining tools also are used in marketing, finance and politics. Investigators detect insurance fraud. Businesses get leads on good sales prospects. Police confirm which precincts are the most crime-ridden. Political candidates learn where best to spend their time and money.

Quadstone, a data-mining firm in Boston, touts its services: "We've created software that can predict your customer's behavior. Whether you're in the banking, brokerage, insurance, retail, or telecommunications industries, we give you the ability to use past customer history as a tool to understand, predict, and influence their future behavior."

The distinction between government and private data mining is blurring.

"Agencies at all levels of government are now interested in collecting and mining large amounts of data from commercial sources," the GAO reported. "Agencies may use such data ... to perform large-scale data analysis and pattern discovery in order to discern potential terrorist activity by unknown individuals."

The FBI's Foreign Terrorist Tracking Task Force, for example, submits queries to commercial databases for information on suspected suicide bombers, which it can combine with secret government files.

Several government data-mining projects—such as Total Information Awareness and the MATRIX, an acronym for Multistate Anti-Terrorism Information Exchange—were canceled after a public uproar.

Other government data-mining projects include Talon, a program run by the Pentagon's Counterintelligence Field Activity, which collects reports on demonstrators outside U.S. military bases. Thousands of such reports are stored in a database called Cornerstone and are shared with other intelligence agencies.

The Pentagon's Advanced Research and Development Activity, based at Fort Meade, Md., runs a research program whose goal is to develop better ways to mine huge databases to "help the nation avoid strategic surprises ... such as those of September 11, 2001."

Data-mining experts make a distinction between the appropriate use of the technology to detect terrorists or catch criminals and its possible misuse to invade privacy or inhibit free communication.

"The realization that every digital movement is recorded and monitored itself will chill private behavior," Zittrain wrote in the Harvard Law Review.

But Gregory Piatetsky, a Boston-based consultant to data-mining companies, defended the technology in an e-mail interview.

"I believe that data mining technology can be useful," he said, noting its success in detecting credit card fraud and money laundering. In national security cases, he said, the government "may have linked several e-mails from a bad guy to other guys that we know nothing about. Before you can determine whether that guy is good or bad, you first need to intercept" the e-mails.

Some experts say it's all right to use data mining against terrorists, but not against domestic crooks.

"My concern is that the government can't distinguish between fighting the war against militant Islam and ordinary crimes," Stanford's Ullman said in an e-mail. "Just like bank robbery differs in degree from going through a stop sign, terrorism differs in degree from drug crime. ... It's OK to use such a system to pursue terrorists. In fact, I believe it is essential. But we need safeguards to assure it will not be used to track 'ordinary' criminals."

———

For more background information, go to www.twocrows.com and click on "About Data Mining."

———

(c) 2006, Knight Ridder/Tribune Information Services.
