ARLINGTON, Virginia — Long before generative AI’s boom, a Silicon Valley firm contracted to collect and analyze non-classified data on illicit Chinese fentanyl trafficking made a compelling case for its embrace by U.S. intelligence agencies.
The operation’s results far exceeded human-only analysis, finding twice as many companies and 400% more people engaged in illegal or suspicious commerce in the deadly opioid.
Excited U.S. intelligence officials touted the results publicly — the AI made connections based mostly on internet and dark-web data — and shared them with Beijing authorities, urging a crackdown.
One important aspect of the 2019 operation, called Sable Spear, that has not previously been reported: The firm used generative AI to provide U.S. agencies — three years ahead of the release of OpenAI’s groundbreaking ChatGPT product — with evidence summaries for potential criminal cases, saving countless work hours.
“You wouldn’t be able to do that without artificial intelligence,” said Brian Drake, the Defense Intelligence Agency’s then-director of AI and the project coordinator.
The contractor, Rhombus Power, would later use generative AI to predict Russia’s full-scale invasion of Ukraine with 80% certainty four months in advance, for a different U.S. government client. Rhombus says it also alerts government customers, whom it declines to name, to imminent North Korean missile launches and Chinese space operations.
U.S. intelligence agencies are scrambling to embrace the AI revolution, believing they’ll otherwise be smothered by exponential data growth as sensor-generated surveillance tech further blankets the planet.
But officials are acutely aware that the tech is young and brittle, and that generative AI — prediction models trained on vast datasets to generate on-demand text, images, video and human-like conversation — is anything but tailor-made for a dangerous trade steeped in deception.
Analysts require “sophisticated artificial intelligence models that can digest mammoth amounts of open-source and clandestinely acquired information,” CIA director William Burns recently wrote in Foreign Affairs. But that won’t be simple.
The CIA’s inaugural chief technology officer, Nand Mulchandani, thinks that because gen AI models “hallucinate” they are best treated as a “crazy, drunk friend” — capable of great insight and creativity but also bias-prone fibbers. There are also security and privacy issues: adversaries could steal and poison them, and they may contain sensitive personal data that officers aren’t authorized to see.
That’s not stopping the experimentation, though, which is mostly happening in secret.
An exception: Thousands of analysts across the 18 U.S. intelligence agencies now use a CIA-developed gen AI called Osiris. It runs on unclassified and publicly or commercially available data — what’s known as open-source. It writes annotated summaries and its chatbot function lets analysts go deeper with queries.
Mulchandani said it employs multiple AI models from various commercial providers he would not name. Nor would he say whether the CIA is using gen AI for anything major on classified networks.
“It’s still early days,” said Mulchandani, “and our analysts need to be able to mark out with absolute certainty where the information comes from.” The CIA is trying out all major gen AI models, without committing to any one of them, in part because AIs keep leapfrogging each other in ability, he said.
Mulchandani says gen AI is mostly good as a virtual assistant looking for “the needle in the needle stack.” What it won’t ever do, officials insist, is replace human analysts.
Linda Weissgold, who retired as deputy CIA director of analysis last year, thinks war-gaming will be a “killer app.”
During her tenure, the agency was already using regular AI — algorithms and natural-language processing — for translation and tasks including alerting analysts during off hours to potentially important developments. The AI wouldn’t be able to describe what happened — that would be classified — but could say “here’s something you need to come in and look at.”
Gen AI is expected to enhance such processes.
Its most potent intelligence use will be in predictive analysis, believes Rhombus Power’s CEO, Anshu Roy. “This is probably going to be one of the biggest paradigm shifts in the entire national security realm — the ability to predict what your adversaries are likely to do.”
Rhombus’ AI machine draws on 5,000-plus datastreams in 250 languages gathered over 10-plus years, including global news sources, satellite images and data on cyberspace. All of it is open-source. “We can track people, we can track objects,” said Roy.
AI bigshots vying for U.S. intelligence agency business include Microsoft, which announced on May 7 that it was offering OpenAI’s GPT-4 for top-secret networks, though the product must still be accredited for work on classified networks.
A competitor, Primer AI, lists two unnamed intelligence agencies among its customers, which include military services, documents posted online for recent military AI workshops show. It offers AI-powered search in 100 languages to “detect emerging signals of breaking events” from sources including Twitter, Telegram, Reddit and Discord and to help identify “key people, organizations, locations.” Primer lists targeting among its technology’s advertised uses. In a demo at an Army conference just days after the Oct. 7 Hamas attack on Israel, company executives described how their tech separates fact from fiction in the flood of online information from the Middle East.
Primer executives declined to be interviewed.
In the near term, how U.S. intelligence officials wield gen AI may be less important than counteracting how adversaries use it: to pierce U.S. defenses, spread disinformation and attempt to undermine Washington’s ability to read their intent and capabilities.
And because Silicon Valley drives this technology, the White House is also concerned that any gen AI models adopted by U.S. agencies could be infiltrated and poisoned, something research indicates is very much a threat.
Another worry: Ensuring the privacy of “U.S. persons” whose data may be embedded in a large-language model.
“If you speak to any researcher or developer that is training a large-language model, and ask them if it is possible to basically kind of delete one individual piece of information from an LLM and make it forget that — and have a robust empirical guarantee of that forgetting — that is not a thing that is possible,” John Beieler, AI lead at the Office of the Director of National Intelligence, said in an interview.
It’s one reason the intelligence community is not in “move-fast-and-break-things” mode on gen AI adoption.
“We don’t want to be in a world where we move quickly and deploy one of these things, and then two or three years from now realize that they have some information or some effect or some emergent behavior that we did not anticipate,” Beieler said.
It’s a concern, for instance, if government agencies decide to use AIs to explore bio- and cyber-weapons tech.
William Hartung, a senior researcher at the Quincy Institute for Responsible Statecraft, says intelligence agencies must carefully assess AIs for potential abuse lest they lead to unintended consequences such as unlawful surveillance or a rise in civilian casualties in conflicts.
“All of this comes in the context of repeated instances where the military and intelligence sectors have touted ‘miracle weapons’ and revolutionary approaches, from the electronic battlefield in Vietnam to the Star Wars program of the 1980s to the ‘revolution in military affairs’ in the 1990s and 2000s, only to find them fall short,” he said.
Government officials insist they are sensitive to such concerns. Besides, they say, AI missions will vary widely depending on the agency involved. There’s no one-size-fits-all.
Take the National Security Agency. It intercepts communications. Or the National Geospatial-Intelligence Agency (NGA). Its job includes seeing and understanding every inch of the planet. Then there is measurement and signature intel, which multiple agencies use to track threats using physical sensors.
Supercharging such missions with AI is a clear priority.
In December, the NGA issued a request for proposals for a completely new type of generative AI model. The aim is to use imagery it collects, from satellites and at ground level, to harvest precise geospatial intel with simple voice or text prompts. Gen AI models don’t map roads and railways and “don’t understand the basics of geography,” the NGA’s director of innovation, Mark Munsell, said in an interview.
Munsell said at an April conference in Arlington, Virginia, that the U.S. government has so far modeled and labeled only about 3% of the planet.
Gen AI applications also make a lot of sense for cyberconflict, where attackers and defenders are in constant combat and automation is already in play.
But lots of vital intelligence work has nothing to do with data science, says Zachery Tyson Brown, a former defense intelligence officer. He believes intel agencies will invite disaster if they adopt gen AI too swiftly or completely. The models don’t reason. They merely predict. And their designers can’t entirely explain how they work.
Not the best tool, then, for matching wits with rival masters of deception.
“Intelligence analysis is usually more like the old trope about putting together a jigsaw puzzle, only with someone else constantly trying to steal your pieces while also placing pieces of an entirely different puzzle into the pile you’re working with,” Brown recently wrote in an in-house CIA journal. Analysts work with “incomplete, ambiguous, often contradictory snippets of partial, unreliable information.”
They place considerable trust in instinct, colleagues and institutional memories.
“I don’t see AI replacing analysts anytime soon,” said Weissgold, the former CIA deputy director of analysis.
Quick life-and-death decisions sometimes must be made based on incomplete data, and current gen AI models are still too opaque.
“I don’t think it will ever be acceptable to some president,” Weissgold said, “for the intelligence community to come in and say, ‘I don’t know, the black box just told me so.’”