JSON - 4.4 KB - MD5: de58c46c974a33f1dc5ab73e1467134a
MPEG-4 Video - 74.9 MB - MD5: 50961763f42526b1965c26e0d348c935
JSON - 2.8 KB - MD5: 2adbe9e899547d74b487bd0e5f7ca07c
JSON - 10.2 KB - MD5: 29ed798532d61610200c8cd64be37b08
JSON - 3.5 KB - MD5: 06e1c5da875329c6ce528423121b8cac
MPEG-4 Video - 49.5 MB - MD5: 65096843dec4210fc6e0992a45816194
Dec 10, 2025
Chinchure, Aditya; Ravi, Sahithya; Ng, Raymond; Shwartz, Vered; Li, Boyang; Sigal, Leonid, 2025, "Replication Data for: Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events", https://doi.org/10.21979/N9/HOAFUL, DR-NTU (Data), V1
BlackSwanSuite is a benchmark for evaluating VLMs’ ability to reason about unexpected events through abductive and defeasible tasks. The tasks either artificially limit the amount of visual information provided to models while questioning them about hidden unexpected events, or p...
Dec 10, 2025
Zhang, Wenyu; Ng, Wei En; Ma, Lixin; Wang, Yuwen; Zhao, Junqi; Koenecke, Allison; Li, Boyang; Wang, Lu, 2025, "Replication Data for: SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models", https://doi.org/10.21979/N9/HI9OFD, DR-NTU (Data), V2
SPHERE (Spatial Perception and Hierarchical Evaluation of Reasoning) is a hierarchical evaluation framework built on a new human-annotated dataset of 2,285 question–answer pairs. It systematically probes models across increasing levels of complexity, from fundamental skills to mu...
Dec 10, 2025 - Replication Data for: SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models
Unknown - 5.7 MB - MD5: 3277c7f7f1bcf236b6a0bd41c97c8739
Dec 10, 2025 - Replication Data for: SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models
Unknown - 5.4 MB - MD5: 1fffcaa0a5e9f1a85dc45180af7cab7f
