Every analyst has a story, likely many stories, of the painful uncertainty they faced upon receiving reports about a developing high-risk situation. It takes time to verify the reliability of a source and trace the origin of a key datapoint. Once a report is confirmed, further time and resources are still required to distill it into a concise, actionable product.
Reports that are false negatives can lead to severe consequences, whereas false positives breed alert fatigue that may encourage stakeholders to dismiss warnings about real threats. This persistent problem degrades intelligence cycle outcomes, and it is exacerbated when AI is incorrectly implemented as the primary tool for intelligence collection and analysis. When directed toward dubious sources or flooded with extraneous data, AI will produce a report that is, at best, noisy and unfocused. Unverified data published by unverified sources enters the information ecosystem, serves as an input for AI-generated content, skews the resulting analysis, and ultimately delays or even misleads human action.
The “Trust-Gap” In AI-Generated Intelligence
The prevalence of this trust-gap was on full display in the hours that immediately followed the September 10th assassination at Utah Valley University. CBS reported that multiple AI-driven tools and platforms had published uncritical content on social media, wrongly dismissing the ongoing incident as a hoax or falsely claiming that authorities had already apprehended the shooter. A fabricated, AI-generated video of the suspected shooter was shared on X shortly after the incident, and one AI tool then described it as verified footage, which may have encouraged many to believe it was authentic. Among them was a local law enforcement entity that used it as an image of the still-at-large suspect, only later adding a disclaimer that the image was not real.
Fixing the Root Cause
Seerist's Research and Collections Team anticipates and addresses this issue in our AI-generated intelligence products. The solution is to closely guard and assess the sources that feed into those products: vetting information before it becomes a datapoint for our models keeps the resulting analysis timely, focused, and accurate.
Identifying High-Value Information
The Research and Collections team is constantly evaluating the list of sources that we pull in for users. We work quickly to adjust or expand coverage in response to major developments, such as the recent Nepalese protests, and we also diligently work with clients to execute bespoke collection plans around their needs. The team scours the internet for countless sources to execute a collection strategy that facilitates comprehensive coverage across continents, languages, social media platforms, and issue areas. This thoroughness enables long-term monitoring and trend analysis as well as effective situational awareness and responses to dynamic news cycles.
We frequently revisit our technical collection capabilities on social media platforms to brainstorm with Seerist's innovative data scientists and developers about new ways to fully utilize sources. Above all, the team's guiding principle is to maximize the signal-to-noise ratio for users; we bring in all the valuable data we can get our hands on, and only that. Maximizing this signal-to-noise ratio also plays an important role in protecting Seerist's AI-generated intelligence products from falling victim to the trust-gap problem.
The massive quantity of sources brought in by our team is the first step in maximizing this ratio. Once they identify a plausibly high-value source, our analysts carry out a series of research tasks to develop a deeper understanding of the source. The resulting findings enable us to enrich the source with additional data and context, both in our products and for user awareness.
Enhancing Source Context
The methodology that guides how Seerist identifies and prioritizes high-reliability sources also shapes our practices for enhancing a given source's context. Beyond informing AI-generated intelligence content, the overarching goal of enhancing source context is to give users a holistic understanding of a source through verified facts that may affect how that source behaves and how best to utilize it.
These contextual enhancements are thoroughly researched assessments covering areas such as a source's potential susceptibility to misinformation, the legal and institutional media environment it operates within, its ownership and physical location, and its structure and motivations for publishing content. The resulting findings become analyst-augmented metadata that is attached to each source for users to see and take into consideration.
The categories shown below make up some of this augmented metadata; a simplified sketch of how they might be modeled follows the list. A complete overview of Seerist's methodology and source metadata categories is available on the Seerist Platform.
- Source Name
- Source Type
- Publishing Platform
- Language
- Reliability
- Susceptibility to Misinformation
- Region
- Country of Origin
- Country of Focus
- Administrative Division
- City
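As a rough illustration only, the sketch below models those categories as a simple record. The field names, types, and the SourceRecord structure are hypothetical assumptions for this example, not Seerist's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Reliability(Enum):
    """Reliability tiers, as described later in this piece."""
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class SourceRecord:
    """Hypothetical analyst-augmented metadata for a single source.

    Field names mirror the category list above; none of this is
    the actual Seerist data model.
    """
    source_name: str
    source_type: str                        # e.g. "news outlet", "official account"
    publishing_platform: str                # e.g. "X", "Telegram", "web"
    language: str
    reliability: Reliability
    susceptibility_to_misinformation: str   # e.g. "low", "medium", "high"
    region: str
    country_of_origin: str
    country_of_focus: str
    administrative_division: str | None = None
    city: str | None = None
```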
Once one of our analysts finalizes their research and assessments for a given source, everything is put through a systematic review. Other members of the Research and Collections team review every source, scrutinizing each assessment and researched detail. Disagreements and contradictory findings are brought before the whole team before a final determination is reached. In addition to guaranteeing that each source on the platform conforms to our established methodology and analytical standards, this system encourages a high degree of consistency in the evaluation process and uniformity in assessment criteria, especially for source reliability, regardless of who is working on a source.
Reliability as a Key Differentiator
In conjunction with the sophisticated models that Seerist constantly develops and improves, our rigorous source vetting methodology ensures that critical information, and only critical information, is put in front of our clients. The cornerstone of this success is our source reliability assessments.
The reliability score for a given source comprises several carefully researched evaluations of its capacity and willingness to publish accurate, truthful content. More specifically, we determine reliability by looking at:
- The proportion of factual reporting, as opposed to opinion or analytical content
- The use of emotive language
- The degree of factual accuracy
- The resilience of the source and its ability to maintain its standards
Source reliability is then assigned as High, Medium, or Low based on the descriptions below; a simplified scoring sketch follows the list.
- High: Large proportion of factual content with a high degree of factual accuracy. The use of emotive language is rare. The source is highly resilient.
- Medium: Mixed factual and opinion content and/or a medium-to-high degree of factual accuracy. The use of emotive language may be common. The source may have a medium-to-high level of resilience.
- Low: Low proportion of factual content and/or a low degree of factual accuracy. The use of emotive language may be common. The source may have a low-to-medium level of resilience.
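To make the rubric concrete, here is a minimal sketch of how the three tiers could be assigned programmatically. The numeric inputs and thresholds are hypothetical assumptions; in practice these assessments are qualitative judgments made by analysts, not a formula.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityInputs:
    """Hypothetical analyst-estimated measures on a 0.0-1.0 scale."""
    factual_proportion: float   # share of factual vs. opinion/analytical content
    emotive_language: float     # how often emotive language appears
    factual_accuracy: float     # how often checked claims hold up
    resilience: float           # ability to maintain standards over time


def assign_reliability(m: ReliabilityInputs) -> str:
    """Map the four rubric criteria to High/Medium/Low.

    Thresholds are illustrative; the published rubric is qualitative.
    """
    if (m.factual_proportion >= 0.8 and m.factual_accuracy >= 0.8
            and m.emotive_language <= 0.2 and m.resilience >= 0.8):
        return "High"
    if m.factual_proportion >= 0.5 or m.factual_accuracy >= 0.5:
        return "Medium"
    return "Low"
```

For example, a source scored at 0.9 factual proportion, 0.1 emotive language, 0.95 factual accuracy, and 0.85 resilience would come out High under these illustrative thresholds.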
Preventing “Garbage In, Garbage Out”
By restricting a model's data inputs to meticulously researched, high-reliability sourcing, Seerist ensures that only accurate and actionable AI-generated intelligence products reach users. This shortens time-to-decision without slowing how quickly users receive alerts: Seerist's AI-generated intelligence can provide rapid analysis of ongoing incidents while avoiding trust-gap issues. The system prioritizes information positively confirmed by high-reliability sources and, only when necessary, supplements it with content from less reliable sources; any such supplementation is reflected in the resulting analysis and always clearly marked for the user on the platform.
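A simplified sketch of that prioritization logic might look like the following: rank high-reliability items first, fall back to lower-reliability content only when coverage is thin, and flag the fallback so the analysis and the user can see the distinction. The function, threshold, and flagging scheme are illustrative assumptions, not Seerist's pipeline.

```python
def select_inputs(items, min_items=3):
    """Prefer items from High-reliability sources; supplement with
    lower-reliability items only when coverage is thin, and flag them.

    `items` is an iterable of (text, reliability) pairs. Everything
    here is an illustrative assumption.
    """
    items = list(items)
    # High-reliability content is never flagged.
    high = [(text, rel, False) for text, rel in items if rel == "High"]
    if len(high) >= min_items:
        return high
    # Supplement with Medium/Low content, explicitly flagged for the user.
    lower = [(text, rel, True) for text, rel in items if rel != "High"]
    return high + lower[: min_items - len(high)]
```

Under this toy scheme, a thinly covered incident with one High-reliability report and two Medium ones would surface all three, with the two Medium items carrying a visible flag.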
After the Seerist platform alerts a user to a developing situation, it can present a curated selection of high-reliability sourcing and swiftly synthesize it into an AI-generated intelligence product tailored to the user's information requirements. Importantly, the Research and Collections Team's detailed research, assessment, and review process ensures that the final product consistently earns our users' trust.