
Campbell Brown, the former television journalist and onetime head of news partnerships at Facebook, is building a new company that evaluates the accuracy and reliability of AI models in areas such as geopolitics, mental health, finance, and hiring. She argues that current systems still produce unreliable and incomplete information in high-stakes situations.
Brown discussed her company, Forum AI, during a recent appearance with TechCrunch editor Tim Fernholz at a StrictlyVC event in San Francisco.
Founded 17 months ago in New York City, Forum AI evaluates how foundation AI models perform in situations where answers are complex and subjective rather than factual or binary.
Brown said the company focuses on “high-stakes topics” where responses require nuance and context instead of straightforward right-or-wrong answers.
Forum AI Uses Experts To Build Evaluation Benchmarks
Forum AI’s approach involves recruiting specialists to create evaluation benchmarks and then training AI judges to assess model responses at scale.
For the company’s geopolitics benchmark work, Brown said Forum AI recruited experts including historian Niall Ferguson, journalist Fareed Zakaria, former U.S. Secretary of State Tony Blinken, former U.S. House Speaker Kevin McCarthy, and cybersecurity official Anne Neuberger.
The company’s goal is to train AI evaluation systems until they reach roughly 90% agreement with the human experts. Brown said Forum AI has been able to achieve that level of consensus.
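The 90% figure describes agreement between AI judges and human experts on the same benchmark items. A minimal sketch of how such an agreement check might work (the function name, labels, and data below are illustrative assumptions, not Forum AI's actual methodology):

```python
# Hypothetical sketch: compare an AI judge's verdicts against expert verdicts
# on the same benchmark prompts and report the fraction on which they agree.
# The "pass"/"fail" labels are assumed; real rubrics would be more granular.

def agreement_rate(expert_labels, judge_labels):
    """Fraction of benchmark items where the AI judge matches the expert."""
    if len(expert_labels) != len(judge_labels):
        raise ValueError("label lists must be the same length")
    matches = sum(e == j for e, j in zip(expert_labels, judge_labels))
    return matches / len(expert_labels)

# Illustrative verdicts on ten benchmark responses.
expert = ["pass", "fail", "pass", "pass", "fail",
          "pass", "pass", "fail", "pass", "pass"]
judge  = ["pass", "fail", "pass", "fail", "fail",
          "pass", "pass", "fail", "pass", "pass"]

rate = agreement_rate(expert, judge)
print(f"judge/expert agreement: {rate:.0%}")  # 9 of 10 match -> 90%
```

In practice, evaluation work of this kind often uses chance-corrected measures such as Cohen's kappa rather than raw agreement, since raw agreement can look high on imbalanced label sets.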
Brown said the company emerged from concerns she developed while working at Meta around the time ChatGPT became publicly available.
“I remember really shortly after realizing this is going to be the funnel through which all information flows,” Brown said. “And it’s not very good.”
She added that the issue felt personally important because of concerns about how younger users would consume information through AI systems.
Brown Says Accuracy Is Not The Industry’s Main Focus
According to Brown, many foundation model developers remain primarily focused on coding and mathematics benchmarks instead of improving the quality of information-related outputs.
She said evaluating news, political topics, and nuanced discussions is more difficult than measuring coding performance, but argued that these areas remain critical.
Forum AI’s testing of leading AI models identified several recurring problems, according to Brown. She cited cases where Google Gemini referenced Chinese Communist Party websites in unrelated stories and said most major models display left-leaning political tendencies.
Brown also described subtler issues, including missing context, omitted viewpoints, and arguments that misrepresent opposing positions without acknowledging those omissions.
“There’s a long way to go,” Brown said. “But I also think that there are some very easy fixes that would vastly improve the outcomes.”
Brown Connects AI Concerns To Her Facebook Experience
Brown said her experience overseeing Facebook’s news initiatives influenced her thinking about AI systems and information quality.
During her time at Facebook, Brown led the platform’s fact-checking efforts, which no longer operate in the same form today.
“We failed at a lot of the things we tried,” Brown said during the discussion.
She argued that social media platforms often optimized for engagement metrics instead of informational quality, contributing to poorer public understanding.
According to Brown, AI companies now face a similar choice between maximizing user engagement and prioritizing truthful and accurate responses.
She acknowledged that building AI systems optimized for truthful information may appear idealistic, but said enterprise customers could push the industry in that direction because of legal and financial risks.
Businesses using AI systems in lending, insurance, hiring, and credit decisions are likely to prioritize accuracy due to liability concerns, she said.
Forum AI Targets Enterprise Compliance Market
Forum AI is positioning itself around that enterprise demand for AI oversight and compliance testing.
Brown said many existing compliance and auditing systems remain inadequate. As an example, she pointed to New York City’s hiring bias law, saying a later comptroller review found that more than half of audited systems contained undetected violations.
According to Brown, meaningful AI evaluation requires subject-matter experts capable of identifying edge cases and subtle failures rather than relying solely on general-purpose audits.
“Smart generalists aren’t going to cut it,” she said.
Forum AI raised $3 million last fall in a funding round led by Lerer Hippeau.
Brown also described what she sees as a gap between the AI industry’s public messaging and the actual experience of many users interacting with chatbots.
“You hear from the leaders of the Big Tech companies, ‘This technology is going to change the world,’ ‘it’s going to put you out of work,’ ‘it’s going to cure cancer,’” Brown said. “But then to a normal person who’s just using a chatbot to ask basic questions, they’re still getting a lot of slop and wrong answers.”
Featured image credits: Magnific.com
