**PyTorch beginner pitfalls: on the system freezing during PyTorch training**
The first time I ran PyTorch I hit all sorts of problems and wasted several days stuck on the same one. Without further ado, here is the point.
- torch model, machine-learning run, abnormal symptom: as soon as the program starts, the system locks up, the mouse stops responding, and memory usage skyrockets.
- torch model aborts before reaching the configured number of epochs with the error: `Process finished with exit code 137 (interrupted by signal 9: SIGKILL)`. With CPU torch this is basically a machine memory problem (exit code 137 = 128 + 9, i.e. the process was killed with SIGKILL, typically by the Linux OOM killer).
Summary: the program creates too many Variables as it runs and they take up too much memory. Don't use `append` casually and irresponsibly; you must understand the data types involved and how they behave.
Buggy code:

```python
loss_records.append(G_loss_D)
loss_records.append(D_loss)
```

Fixed code:

```python
loss_records.append(G_loss_D.item())
loss_records.append(D_loss.item())
```
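To make the difference concrete, here is a minimal sketch contrasting the two ways of recording a loss. The tensor `w` and the loop are made up for illustration; they stand in for the post's GAN losses `G_loss_D` / `D_loss`:

```python
import torch

# A parameter standing in for the model's weights (made up for illustration).
w = torch.ones(1, requires_grad=True)

bad_records = []
good_records = []
for step in range(3):
    loss = (w * step).sum()          # scalar tensor attached to the autograd graph
    bad_records.append(loss)         # keeps the entire graph for this step alive
    good_records.append(loss.item()) # plain Python float; the graph can be freed

# Every tensor in bad_records still carries a grad_fn, i.e. a reference
# to its computation graph; the floats in good_records carry nothing.
assert all(t.grad_fn is not None for t in bad_records)
assert all(isinstance(x, float) for x in good_records)
```

Over thousands of iterations, each retained graph in `bad_records` adds up, which is exactly what drives memory through the roof.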
`loss` is a Variable, which imposes a huge memory burden. Quoting the original explanation (from the PyTorch forums):

> I think I see the problem. You have to remember that loss is a Variable, and indexing Variables, always returns a Variable, even if they’re 1D! So when you do total_loss += loss[0] you’re actually making total_loss a Variable, and adding more and more subgraphs to its history, making it impossible to free them, because you’re still holding a reference. Just replace total_loss += loss[0] with total_loss += loss.data[0] and it should be back to normal.
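Note that the quote uses the old `Variable` / `loss.data[0]` API from early PyTorch. In current PyTorch the same fix is spelled with `.item()`, or `.detach()` when you still need a tensor. A small sketch, with a made-up scalar loss:

```python
import torch

# A made-up scalar loss standing in for `loss` in the quote above.
loss = (torch.ones(1, requires_grad=True) * 2).sum()

total_loss = 0.0
total_loss += loss.item()   # modern equivalent of `total_loss += loss.data[0]`

detached = loss.detach()    # drops the graph history but stays a tensor
assert detached.grad_fn is None
```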
Copyright notice: this is an original article by weixin_42410103, licensed under CC 4.0 BY-SA; when reposting, please include a link to the original source and this notice.