dreamllm.github.io - DreamLLM: Synergistic Multimodal Comprehension and Creation

Example domain paragraphs

Abstract: This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with the frequently overlooked synergy between multimodal comprehension and creation. DreamLLM operates on two fundamental principles. The first focuses on the generative modeling of both language and image posteriors by direct sampling in the raw multimodal space. This approach circumvents the limitations and information loss inherent to external feature

Oil-on-canvas painting of a blue night sky with roiling energy. A fuzzy and bright yellow crescent moon shining at the top. Below the exploding yellow stars and radiating swirls of blue, a distant village sits quietly on the right. Connecting earth and sky is a flame-like cypress tree with curling and swaying branches on the left. A church spire rises as a beacon over rolling blue hills.

The webpage template is borrowed from DreamFusion. We thank the authors for their codebase.

Links to dreamllm.github.io (4)

ericyi.github.io Li Yi
qizekun.github.io Zekun Qi
yuangpeng.com Yuang Peng
varybase.github.io Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models