This is the first publicly available multimodal dataset of image-text pairs that contains both real-world and synthetically generated claims, each either checkworthy or non-checkworthy. Real claims are sourced from existing datasets, including 5Pils, Multiclaim, Flickr30K, and SentiCap. Synthetic images are generated with Flux and Stable Diffusion 3.5, and synthetic text with Llava and BLIP. The dataset can be used as a benchmark for checkworthiness detection models.
Photo of a flooded Ahmedabad International Airport.
Llava-generated claim: The image shows a flooded airport with several airplanes parked in the water. There are five airplanes in total, with one of them being a large jetliner. The airplanes are parked in a row, with some of them partially submerged in the water. The scene appears to be a mix of a flooded airport and a beach, with the airplanes serving as a unique and unexpected sight.
BLIP-generated claim: Arafly parked airplanes are lined up in a row at an airport
A woman in shorts and sandals is being pulled by a small child as a subway train goes by.
Llava-generated claim: The woman and the girl are walking down the subway platform.
BLIP-generated claim: Woman and child walking on platform next to train with american flag
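
To illustrate the benchmark use mentioned above, here is a minimal sketch of how a checkworthiness detector could be scored on image-text pairs like these. Everything specific in it is an assumption made for illustration: the `ClaimRecord` fields, the image paths, and the labels attached to the two sample claims are hypothetical and do not reflect the dataset's actual schema or annotations. The `always_checkworthy` baseline is a placeholder for a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class ClaimRecord:
    # Hypothetical record layout; the real dataset schema may differ.
    image_path: str    # path to the real or synthetic image
    text: str          # the claim paired with the image
    text_source: str   # e.g. "real", "llava", or "blip"
    checkworthy: bool  # gold label


def evaluate(
    records: Iterable[ClaimRecord],
    predict: Callable[[ClaimRecord], bool],
) -> Dict[str, float]:
    """Score a binary checkworthiness predictor against gold labels."""
    tp = fp = fn = tn = 0
    for record in records:
        predicted = predict(record)
        if predicted and record.checkworthy:
            tp += 1
        elif predicted and not record.checkworthy:
            fp += 1
        elif not predicted and record.checkworthy:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


def always_checkworthy(record: ClaimRecord) -> bool:
    # Trivial baseline standing in for a real multimodal model.
    return True


# Two sample records built from the example claims above.
# The labels and paths are assumed for illustration only.
sample = [
    ClaimRecord(
        image_path="images/example_airport.jpg",
        text="Photo of a flooded Ahmedabad International Airport.",
        text_source="real",
        checkworthy=True,
    ),
    ClaimRecord(
        image_path="images/example_subway.jpg",
        text="The woman and the girl are walking down the subway platform.",
        text_source="llava",
        checkworthy=False,
    ),
]

metrics = evaluate(sample, always_checkworthy)
```

Reporting precision, recall, and F1 alongside accuracy is a design choice here: a predictor that flags every claim can still reach high recall, so accuracy alone would hide how it behaves on non-checkworthy pairs.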