zixuanlan@uchicago.edu, lanzixuan521@gmail.com
Hello! I'm an independent researcher. I received my Master's degree from the University of Chicago, where I worked with Professor Joe Zhou of Stony Brook University (a lifelong collaborator and advisor). My research focuses on improving model efficiency through algorithmic innovations. I am currently working on Large Language Models and Vision-Language Models, with interests spanning information compression (including token compression), architectural design, model interpretability, and Triton/CUDA optimization. I am actively learning GPU programming and am most familiar with the Ampere architecture.
Yanhong Li*, Zixuan Lan*, Joe Zhou · EMNLP 2025
*Equal contribution
Large language models (LLMs) and their multimodal variants can now process visual inputs, including images of text. This raises an intriguing question: can we compress textual inputs by feeding them as images to reduce token usage while preserving performance? In this paper, we show that visual text representations are a practical and surprisingly effective form of input compression for decoder LLMs. We exploit the idea of rendering a long text input as a single image and providing it directly to the model. This dramatically reduces the number of decoder tokens required, offering a new form of input compression.
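To make the token accounting behind this idea concrete, here is a minimal back-of-the-envelope sketch, not the paper's pipeline. It lays text out on a fixed monospace grid and counts how many image cells the rendered page would occupy. Everything numeric here is an assumption for illustration: the function names, the 8×16 pixel glyph size, the 80-character line width, the 32-pixel cell that stands in for one visual token, and the whitespace word count used as a crude proxy for text tokens. Real vision encoders and tokenizers differ, and some produce a fixed number of visual tokens per image.

```python
import math
import textwrap


def layout_text(text, chars_per_line=80, char_w=8, char_h=16):
    """Wrap text onto a monospace grid; return (lines, width_px, height_px).

    The glyph size and line width are illustrative assumptions.
    """
    lines = textwrap.wrap(text, width=chars_per_line) or [""]
    width_px = chars_per_line * char_w
    height_px = len(lines) * char_h
    return lines, width_px, height_px


def visual_token_count(width_px, height_px, cell_px=32):
    """Hypothetical count: one visual token per cell_px x cell_px image cell."""
    cols = math.ceil(width_px / cell_px)
    rows = math.ceil(height_px / cell_px)
    return cols * rows


def rough_text_token_count(text):
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


if __name__ == "__main__":
    sample = "word " * 400  # placeholder input, not real data
    lines, w, h = layout_text(sample)
    print("rendered size (px):", w, "x", h)
    print("text tokens (rough proxy):", rough_text_token_count(sample))
    print("visual tokens (assumed cell size):", visual_token_count(w, h))
```

The comparison depends entirely on the assumed glyph and cell sizes, so it illustrates the bookkeeping only and does not reproduce any result from the paper.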