ByteDance has open-sourced the large language model Seed-OSS-36B
On August 21st, ByteDance's Seed team released its latest open-source large language model, Seed-OSS-36B, on the AI model sharing platform Hugging Face. The architecture of Seed-OSS-36B combines several widely used design choices: causal language modeling, Grouped Query Attention (GQA), SwiGLU activation functions, RMSNorm, and RoPE positional encoding. Each model contains 36 billion parameters distributed across a 64-layer network and uses a vocabulary of 155,000 tokens. The model supports a native context length of up to 512K tokens, enabling it to handle extremely long documents and reasoning chains without sacrificing performance.
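Two of the components named above, RMSNorm and SwiGLU, are simple to illustrate. The sketch below is a minimal, educational Python version operating on plain lists; it is not ByteDance's implementation (the real model applies these as tensor operations with learned weight matrices), and the function names are ours.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: scale the input by the reciprocal of its root-mean-square.

    Unlike LayerNorm, no mean is subtracted; `weight` is a learned
    per-dimension gain (here passed in explicitly for illustration).
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU: a gated feed-forward unit, silu(gate) * up elementwise.

    In a transformer block, `gate` and `up` would be two separate linear
    projections of the same hidden state; here they are given directly.
    """
    return [silu(g) * u for g, u in zip(gate, up)]
```

For a vector of ones with unit gain, `rms_norm` returns values very close to 1, and `swiglu` zeroes out any position whose gate is strongly negative, which is what makes the gating useful.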