
Conversation

@pengcheng888
Collaborator

@pengcheng888 pengcheng888 commented Jan 7, 2026

Target version
main
Feature description

Add a `to` method to the Python nn.Module:
it moves the model's weights onto the GPU, e.g.
model.to("cuda")

Test results
The test script for this feature passes.
Screenshot from 2026-01-07 15-14-27

With this change, the llama model in infinilm also passes its tests.
Screenshot from 2026-01-07 15-03-52

@pengcheng888 pengcheng888 linked an issue Jan 7, 2026 that may be closed by this pull request
if (param.shape == input_param.shape) and (
    param.dtype == input_param.dtype
):
    param.copy_(input_param)
Collaborator Author

This works because the copy operation between two infinicore tensors supports copying directly from CPU to GPU.

The current weight-loading check is: when the model weight and the tensor in the weight file have the same shape and dtype, the data is copied; otherwise an error is raised.
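A small sketch of the loading check described above, assuming a load loop that pairs each registered parameter with the tensor of the same name from the weight file. The shape/dtype condition, the `copy_` call, and the `KeyError` follow the quoted diff; the method name `load_state_dict` and the loop structure are assumptions for illustration.

```python
def load_state_dict(self, state_dict):
    # Illustrative only: pair each registered parameter with the tensor
    # of the same name from the checkpoint.
    for key, param in self._parameters.items():
        input_param = state_dict.get(key)
        if input_param is None:
            continue
        if (param.shape == input_param.shape) and (
            param.dtype == input_param.dtype
        ):
            # copy_ works across devices, so a CPU tensor from the weight
            # file can be copied straight into a parameter on the GPU.
            param.copy_(input_param)
        else:
            raise KeyError("not support")
```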

pass
raise KeyError("not support")

def _apply(self, fn, recurse=True):
Collaborator Author

The `to` method partially follows torch's implementation.
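For context, a minimal torch-inspired sketch of the pattern being referenced: `to` builds a per-tensor conversion function and hands it to `_apply`, which recurses into child modules and rewrites each registered parameter. The `_parameters`/`_modules` dictionaries and the tensor-level `to(device)` call are assumptions for illustration, not the exact infinicore implementation.

```python
class Module:
    def __init__(self):
        self._parameters = {}  # name -> tensor
        self._modules = {}     # name -> child Module

    def _apply(self, fn, recurse=True):
        # Convert child modules first, then this module's own parameters.
        if recurse:
            for child in self._modules.values():
                child._apply(fn)
        for key, param in self._parameters.items():
            if param is not None:
                converted = fn(param)
                # The PR writes the result back with setattr(self, key, ...);
                # this standalone sketch also updates _parameters directly so
                # the example stays consistent without a custom __setattr__.
                self._parameters[key] = converted
                setattr(self, key, converted)
        return self

    def to(self, device):
        # torch-style: to() is just _apply with a per-tensor conversion.
        return self._apply(lambda t: t.to(device))
```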

@pengcheng888 pengcheng888 marked this pull request as ready for review January 7, 2026 10:07
@pengcheng888 pengcheng888 requested a review from a team January 7, 2026 10:07
raise KeyError("not support")
for key, param in self._parameters.items():
    if param is not None:
        setattr(self, key, fn(param))
Collaborator Author

@pengcheng888 pengcheng888 Jan 7, 2026


Assigning with the `=` operator did not take effect, and I'm not sure why.

Collaborator Author

In the end I used setattr(self, key, fn(param)).
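A likely explanation (the exact failing assignment is not shown in the thread, so this is an assumption): rebinding the loop variable with `param = fn(param)` only changes the local name and never touches the module, whereas writing back through `setattr` (or into the `_parameters` dict) actually replaces the stored tensor. A plain-Python illustration:

```python
params = {"weight": "cpu_tensor"}

# Rebinding the loop variable does not modify the container.
for key, param in params.items():
    param = "cuda_tensor"
print(params)  # {'weight': 'cpu_tensor'} -- unchanged

# Writing back through the key does.
for key, param in params.items():
    params[key] = "cuda_tensor"
print(params)  # {'weight': 'cuda_tensor'}
```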


Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DEV] Add a `to` method to the Python nn.Module

2 participants