
Sony Launches New Benchmark Dataset to Measure Fairness and Bias in AI Models

By Jolyen

Nov 6, 2025


Sony AI has introduced a new benchmark designed to evaluate how fairly artificial intelligence systems treat people, setting what it calls a new standard for ethical AI development. The dataset, known as the Fair Human-Centric Image Benchmark (FHIBE) — pronounced “Phoebe” — is described as the first publicly available, globally diverse, consent-based human image dataset for testing bias across a broad range of computer vision tasks.

Announced alongside a paper published in Nature on Wednesday, FHIBE is built from images of nearly 2,000 paid participants representing more than 80 countries. Every individual provided consent for the use of their likeness, and participants retain the right to withdraw their images at any time — a notable departure from the AI industry’s common practice of scraping large amounts of web data without permission. Each image includes annotations describing demographic and physical characteristics, environmental context, and even camera settings, enabling fine-grained bias analysis.
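To illustrate how per-image annotations like these can support fine-grained bias analysis, here is a minimal sketch of disaggregating a model's accuracy by an annotated attribute. The field names ("pronouns", "skin_tone", "correct") are hypothetical placeholders, not Sony's actual FHIBE schema.

```python
# Illustrative sketch only: field names are assumptions, not the FHIBE schema.
from collections import defaultdict

def accuracy_by_group(records, group_key):
    """Compute per-group accuracy from a list of annotated prediction records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        group = rec[group_key]                 # e.g. "she/her/hers"
        totals[group] += 1
        correct[group] += int(rec["correct"])  # 1 if the model's prediction was right
    return {g: correct[g] / totals[g] for g in totals}

# Example: disaggregate a vision model's hit rate by pronoun annotation.
records = [
    {"pronouns": "she/her/hers", "skin_tone": "dark",  "correct": 1},
    {"pronouns": "she/her/hers", "skin_tone": "light", "correct": 0},
    {"pronouns": "he/him/his",   "skin_tone": "dark",  "correct": 1},
]
print(accuracy_by_group(records, "pronouns"))
```

Accuracy gaps between groups flag potential bias, and richer annotations (hairstyle, environment, camera settings) let analysts probe what is driving those gaps, which is the kind of analysis FHIBE is designed to enable.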

Sony said the benchmark aims to address ongoing concerns about algorithmic bias and fairness in AI. During testing, FHIBE “affirmed previously documented biases” found in major AI models while offering deeper insight into their causes. For instance, the benchmark revealed reduced accuracy for individuals using “she/her/hers” pronouns, with hairstyle variability emerging as a contributing factor — a nuance previously overlooked in most evaluations.

The dataset also identified stereotype reinforcement when AI models were prompted with neutral questions about occupations. In several cases, models described individuals from certain pronoun and ancestry groups as sex workers, drug dealers, or thieves. When asked to infer what crimes a person might have committed, the systems were more likely to produce toxic or harmful outputs for individuals of African or Asian ancestry, those with darker skin tones, and those identifying as “he/him/his.”

Sony AI said FHIBE demonstrates that ethical, diverse, and fair data collection is achievable, providing researchers and developers with a transparent foundation for improving AI behavior. The benchmark is freely available to the public and will be updated over time as new data and findings emerge.


Featured image credits: xsix via Flickr


Jolyen

As a news editor, I bring stories to life through clear, impactful, and authentic writing. I believe every brand has something worth sharing. My job is to make sure it’s heard. With an eye for detail and a heart for storytelling, I shape messages that truly connect.
