glitchbench.github.io - GlitchBench

Description: GlitchBench: Can large multimodal models detect video game glitches?

Example domain paragraphs

Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs. This integration augments the capacity of LLMs in tasks requiring visual comprehension and reasoning. However, the extent and limitations of their enhanced abilities are not fully understood. To address this gap, we introduce GlitchBench, a novel benchmark designed to test and evaluate the common-sense reasoning and visual recognition capabilities of large multimodal

A person stuck in a piece of furniture

Two people driving an invisible car
