The Doubao Large Model team proposes UltraMem, a sparse model architecture
On February 12, the ByteDance Doubao Large Model Foundation team announced UltraMem, a sparse model architecture that decouples computation from parameters and resolves the memory-access bottleneck of inference while preserving model quality. According to the team, the architecture effectively addresses the high memory-access cost of MoE (Mixture of Experts) inference: inference is 2-6x faster than with the MoE architecture, and inference cost can be reduced by up to 83%.
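The core idea of "decoupling computation from parameters" can be illustrated with a sparse memory-layer sketch: the model holds a very large table of value vectors (where most parameters live), but each token reads only a handful of them, so per-token compute and memory traffic stay small as the parameter count grows. The names, sizes, and retrieval scheme below are illustrative assumptions, not the actual UltraMem design.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 64          # hidden size (assumed for illustration)
num_values = 100_000  # large value table: most parameters live here
top_k = 8             # value rows actually read per token

# Hypothetical parameter tables; a real design would structure the keys
# (e.g. product keys) so that scoring does not touch every entry.
keys = rng.standard_normal((num_values, d_model)).astype(np.float32)
values = rng.standard_normal((num_values, d_model)).astype(np.float32)

def memory_layer(x: np.ndarray) -> np.ndarray:
    """Sparse lookup: parameter count scales with num_values, but the
    value-table memory access per token scales only with top_k."""
    scores = keys @ x                               # score every key (simplified)
    idx = np.argpartition(scores, -top_k)[-top_k:]  # indices of top_k keys
    w = np.exp(scores[idx] - scores[idx].max())     # softmax over top_k scores
    w /= w.sum()
    return w @ values[idx]                          # read only top_k value rows

x = rng.standard_normal(d_model).astype(np.float32)
out = memory_layer(x)
print(out.shape)  # (64,)
```

The sketch shows why inference memory access stays low: however large `num_values` becomes, only `top_k` rows of the value table are fetched per token, in contrast to MoE, where each activated expert pulls in a full feed-forward block.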