- NVIDIA has released the Nemotron-3-Super-120B-A12B-NVFP4 model, with 120 billion total parameters and 12 billion active parameters, supporting a million-token context window and multilingual processing. The model is built on the LatentMoE architecture and is the first open model to use NVFP4 precision during pre-training. NVIDIA has also published a detailed technical report along with the pre-training and post-training datasets, most of which are openly available. This release pushes the performance frontier for mid-sized models and advances efficient, low-cost model development.
NVIDIA releases a 120B-parameter open model
First use of NVFP4 precision in pre-training
Million-token context and multilingual support
- CohereLabs has released cohere-transcribe-03-2026, a speech-to-text model based on the conformer architecture, similar to NVIDIA's Parakeet. The model offers 14 different configurations or supported languages (details are not fully disclosed) and targets high-accuracy audio transcription. As a new addition to open speech models, it enriches the multimodal open ecosystem, provides a fresh option for speech processing, and helps broaden the adoption of speech technology in research and industry.
Cohere releases a conformer-based transcription model
14 configurations or languages supported
Strengthens open speech-processing capabilities
- This batch of open model releases spans OCR, RAG search, audio transcription, computer use, code editing, mathematical theorem proving, and other application areas, with models coming from a wide range of sources rather than being concentrated among a few large vendors. Compared with the earlier pattern dominated by Qwen, DeepSeek, and similar labs, more small and mid-sized organizations are now participating, reflecting the growing diversity and specialization of the open ecosystem. While flagship models still draw the most attention, the emergence of models for niche domains is a long-term boon for the industry.
Open models cover many application scenarios
Greater participation from small and mid-sized organizations
Drives development of domain-specific models
- NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 by NVIDIA
NVIDIA has released Nemotron-3-Super-120B-A12B-NVFP4, a mid-sized open model with 120 billion total parameters and 12 billion active parameters, featuring a 1-million-token context window and support for multiple languages. The model is based on the LatentMoE architecture and utilizes NVFP4 precision during pre-training—a first among open models. NVIDIA has provided a detailed technical report and released most of the pre-training and post-training datasets publicly. This release marks a significant advancement in efficient, high-capacity open models, particularly in leveraging novel quantization techniques and mixture-of-experts design. The model aims to balance performance and accessibility, offering strong capabilities without requiring the full parameter count to be active during inference. Its open data release supports reproducibility and further research. This development signals NVIDIA’s growing role in the open model ecosystem, complementing its hardware leadership with accessible AI software.
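The "active parameter" figure comes from mixture-of-experts routing: a router selects only a few experts per token, so most weights sit idle in any single forward pass. A minimal sketch of top-k expert routing in plain Python (expert count, top-k, and dimensions are hypothetical placeholders, not NVIDIA's actual LatentMoE design):

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # hypothetical; the real model's expert count is not stated here
TOP_K = 2         # only TOP_K experts run per token -> the "active" parameters
DIM = 4

# Each expert is a tiny linear map; random weights stand in for trained ones.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
router = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token):
    # The router scores every expert, but only the top-k are actually executed.
    scores = [dot(w, token) for w in router]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    gates = softmax([scores[i] for i in top])  # renormalize over selected experts
    out = [0.0] * DIM
    for gate, idx in zip(gates, top):
        expert_out = [dot(row, token) for row in experts[idx]]
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out, top

token = [0.5, -1.0, 0.3, 0.8]
output, used = moe_forward(token)
print(f"experts run: {sorted(used)} of {NUM_EXPERTS}")
```

Here only 2 of 8 toy experts execute per token; the same routing principle is what lets a 120B-total-parameter MoE serve inference with roughly 12B parameters active.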
Key Takeaways:
NVIDIA introduces a 120B-parameter open model with 12B active parameters
Model uses LatentMoE and NVFP4, a first in open models
1M context window and multilingual support enhance usability
Most training data and tech report are openly released
Source: Original Article
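NVFP4 stores weights as 4-bit floats with a shared scale per small block of values. A simplified sketch of that block-scaling idea, assuming the commonly described FP4 E2M1 value grid and 16-element blocks (the production format also carries FP8 block scales and a tensor-level scale, which this toy version omits):

```python
import math

# Representable magnitudes of a 4-bit E2M1 float (sign handled separately).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 16  # assumed block size for the shared scale

def quantize_block(values):
    """Scale a block so its max maps to 6.0, then snap each value to the grid."""
    amax = max(abs(v) for v in values)
    scale = amax / 6.0 if amax > 0 else 1.0
    out = []
    for v in values:
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(math.copysign(mag, v) * scale)
    return out, scale

def quantize(values):
    """Quantize a flat list block by block, as a block-scaled format would."""
    result = []
    for i in range(0, len(values), BLOCK):
        q, _ = quantize_block(values[i:i + BLOCK])
        result.extend(q)
    return result

weights = [0.013, -0.042, 0.250, -0.003, 0.120, 0.088, -0.199, 0.301,
           0.0, 0.057, -0.145, 0.222, -0.076, 0.031, 0.180, -0.264]
q = quantize(weights)
max_err = max(abs(a - b) for a, b in zip(weights, q))
print(f"max abs error: {max_err:.4f}")
```

The shared per-block scale is what lets a 4-bit grid with only 8 magnitudes track weight distributions that vary widely across a tensor.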
- cohere-transcribe-03-2026 by CohereLabs
CohereLabs has launched cohere-transcribe-03-2026, a speech-to-text model based on the conformer architecture, similar to NVIDIA’s Parakeet. The model is listed with 14 different configurations or supported languages (details are not fully disclosed) and is optimized for accuracy and efficiency in audio transcription tasks. While detailed performance metrics and training data specifics are limited, the release reflects Cohere’s expansion into multimodal open models beyond text generation. This model contributes to the growing ecosystem of open, domain-specific tools for audio processing, particularly in multilingual environments. Its release aligns with broader industry trends toward accessible, specialized models that complement larger, general-purpose systems.
Key Takeaways:
CohereLabs releases multilingual speech-to-text model with 14 configurations or languages
Model uses conformer architecture, similar to NVIDIA’s Parakeet
Supports efficient audio transcription for diverse linguistic contexts
Detailed information limited but adds to open audio model diversity
Source: Original Article