8B-parameter text-to-image model trained on structured JSON captions that can exceed 1,000 words. [non-commercial license] [arxiv] [model] [code]