DeepSeek Model Series: Analysis and Fine-Tuning in Practice (码士集团)

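Chapter 1: DeepSeek V1 Paper Walkthrough (5 lessons)

Lesson 01

01_DeepSeekV1 Abstract and Introduction

Updated: 2025-12-02

26 min 12 s

Lesson 02

02_DeepSeekV1 Architecture and Pre-Training

Updated: 2025-12-02

29 min 30 s

Lesson 03

03_New Findings on Scaling Laws

Updated: 2025-12-02

37 min 25 s

Lesson 04

04_DeepSeekV1 Post-Training and Evaluation

Updated: 2025-12-02

31 min 0 s

Lesson 05

05_Discussion and Summary of DeepSeekV1

Updated: 2025-12-02

22 min 3 s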

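Chapter 2: DeepSeek V2 Paper Analysis (11 lessons)

Lesson 06

06_DeepSeekV2 Overview

Updated: 2025-12-02

38 min 41 s

Lesson 07

07_DeepSeek's Multi-Head Latent Attention (MLA)

Updated: 2025-12-02

34 min 36 s

Lesson 08

08_Matrix Fusion and Decoupled RoPE in Multi-Head Latent Attention

Updated: 2025-12-02

29 min 58 s

Lesson 09

09_The Overall MLA Workflow Explained with Diagrams

Updated: 2025-12-02

7 min 19 s

Lesson 10

10_Comparing KV Cache Usage Across Attention Mechanisms

Updated: 2025-12-02

8 min 18 s

Lesson 11

11_DeepSeek Architecture: Mixture of Experts (MoE)

Updated: 2025-12-02

30 min 11 s

Lesson 12

12_Device-Limited Routing in DeepSeekMoE

Updated: 2025-12-02

12 min 38 s

Lesson 13

13_Auxiliary Losses and the Token-Dropping Strategy Added in DeepSeekV2

Updated: 2025-12-02

39 min 51 s

Lesson 14

14_DeepSeekV2 Pre-Training and Long-Context Extension

Updated: 2025-12-02

35 min 2 s

Lesson 15

15_DeepSeekV2 Post-Training and the GRPO Algorithm It Uses

Updated: 2025-12-02

39 min 37 s

Lesson 16

16_Discussion and Summary of DeepSeekV2

Updated: 2025-12-02

11 min 5 s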

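Chapter 3: DeepSeek V3 Paper in Depth (7 lessons)

Lesson 17

17_A Detailed Introduction to the DeepSeekV3 Technical Report

Updated: 2025-12-02

41 min 27 s

Lesson 18

18_Changes to the V2 Model's MoE: Loss-Free Load Balancing, Auxiliary Loss, and Node-Limited Routing

Updated: 2025-12-02

31 min 54 s

Lesson 19

19_Multi-Token Prediction (MTP) Introduced in DeepSeekV3

Updated: 2025-12-02

22 min 45 s

Lesson 20

20_FP8 Mixed-Precision Training in DeepSeekV3

Updated: 2025-12-02

59 min 41 s

Lesson 21

21_LLM Inference Deployment: Prefill and Decoding

Updated: 2025-12-02

30 min 29 s

Lesson 22

22_DeepSeekV3 Pre-Training Parameter Settings and Long-Context Extension

Updated: 2025-12-02

18 min 9 s

Lesson 23

23_DeepSeekV3 Post-Training Steps, Discussion, and Summary

Updated: 2025-12-02

43 min 26 s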

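Chapter 4: DeepSeek R1 Paper Analysis (4 lessons)

Lesson 24

24_DeepSeekR1 Paper Abstract and Overview

Updated: 2025-12-02

28 min 25 s

Lesson 25

25_The DeepSeekR1-Zero Model and the Interesting "Aha Moment"

Updated: 2025-12-02

20 min 29 s

Lesson 26

26_Four-Stage Training of DeepSeekR1 and Knowledge Distillation Based on It

Updated: 2025-12-02

38 min 19 s

Lesson 27

27_Discussion and Summary of DeepSeekR1

Updated: 2025-12-02

14 min 9 s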

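Chapter 5: Supervised Fine-Tuning of DeepSeek for Medical Q&A in Practice (6 lessons)

Lesson 28

28_Creating a Node and Downloading the Model

Updated: 2025-12-03

6 min 36 s

Lesson 29

29_Installing the Runtime Environment and Loading the Model

Updated: 2025-12-03

10 min 55 s

Lesson 30

30_Running Inference with the Model Before Fine-Tuning

Updated: 2025-12-03

19 min 44 s

Lesson 31

31_Processing and Loading the Training Data

Updated: 2025-12-03

16 min 15 s

Lesson 32

32_Supervised Fine-Tuning of the Model

Updated: 2025-12-03

22 min 16 s

Lesson 33

33_Running Inference with the Trained Model

Updated: 2025-12-03

3 min 54 s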

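Chapter 6: GRPO Preference-Alignment Fine-Tuning for Smart Healthcare in Practice (7 lessons)

Lesson 34

34_Preparing the Model for Training and Dynamically Adding GRPO Components to It

Updated: 2025-12-02

16 min 1 s

Lesson 35

35_Using HF Datasets Locally and Preprocessing the Training Dataset

Updated: 2025-12-02

36 min 57 s

Lesson 36

36_Defining Multiple Custom Reward Functions

Updated: 2025-12-02

33 min 19 s

Lesson 37

37_GRPO Hyperparameter Settings and Model Training

Updated: 2025-12-02

17 min 37 s

Lesson 38

38_GRPO Fine-Tuning Also Works for Non-DeepSeek Models

Updated: 2025-12-02

6 min 52 s

Lesson 39

39_Integrating the vLLM Inference Engine into the Training Stage

Updated: 2025-12-02

39 min 47 s

Lesson 40

40_Post-Training Model Prediction, Model Saving, and Parameter Merging

Updated: 2025-12-02

19 min 32 s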

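Chapter 7: DeepSeek mHC in Depth (4 lessons)

Lesson 41

Residual Connection and Its Advantages

Updated: 2026-01-31

26 min 23 s

Lesson 42

Hyper Connection Network Structure

Updated: 2026-01-31

16 min 34 s

Lesson 43

mHC: Improvements Built on Hyper Connection

Updated: 2026-01-31

27 min 38 s

Lesson 44

Implementing the mHC Network Structure in PyTorch

Updated: 2026-01-31

15 min 9 s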