To save a word embedding model you trained yourself in the same text formats used by Google's word2vec and by GloVe, you can follow these steps:
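from gensim.models import KeyedVectors

# Assume your trained model was saved as "my_word2vec_model.bin"
my_model = KeyedVectors.load_word2vec_format('my_word2vec_model.bin', binary=True)

# Get the vocabulary (in index order) and the matching vector matrix.
# gensim 4.x exposes the word list as index_to_key (gensim 3.x: index2word);
# the old my_model.vocab attribute no longer works in gensim 4.x.
words = my_model.index_to_key
vectors = my_model.vectors

# Save in the word2vec text format: a "<vocab size> <dimension>" header line,
# then one "<word> <v1> <v2> ..." line per word
with open('my_word2vec_model.txt', 'w', encoding='utf-8') as f:
    f.write(f"{len(words)} {vectors.shape[1]}\n")
    for word, vector in zip(words, vectors):
        vector_str = ' '.join(str(num) for num in vector)
        f.write(f"{word} {vector_str}\n")

# Save in the GloVe text format: the same per-word lines, but no header line
with open('my_glove_model.txt', 'w', encoding='utf-8') as f:
    for word, vector in zip(words, vectors):
        vector_str = ' '.join(str(num) for num in vector)
        f.write(f"{word} {vector_str}\n")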
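The code above saves your trained word embedding model to files in the same text formats as Google's word2vec and GloVe. The only difference between the two files is the header line: the word2vec text format starts with the vocabulary size and vector dimension, while the GloVe format does not. Note that you need to adjust the file names and paths to match your own setup.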
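To check that the two layouts really differ only by the header line, here is a minimal round-trip sketch using only the Python standard library. The helper names (write_text_embeddings, read_text_embeddings, round_trip_demo) and the toy words and vectors are hypothetical, made up for illustration; they are not part of gensim or GloVe.

```python
import os
import tempfile


def write_text_embeddings(path, words, vectors, header):
    """Write embeddings as text: header=True gives the word2vec layout,
    header=False gives the GloVe layout (hypothetical helper)."""
    with open(path, 'w', encoding='utf-8') as f:
        if header:
            f.write(f"{len(words)} {len(vectors[0])}\n")
        for word, vector in zip(words, vectors):
            f.write(word + ' ' + ' '.join(repr(float(x)) for x in vector) + '\n')


def read_text_embeddings(path, header):
    """Read a file written by write_text_embeddings into a {word: [floats]} dict."""
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        if header:
            # First line of the word2vec layout: "<vocab size> <dimension>"
            count, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip('\n').split(' ')
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    if header:
        # Verify that the body matches what the header promised
        if len(embeddings) != count:
            raise ValueError("vocabulary size does not match header")
        if any(len(v) != dim for v in embeddings.values()):
            raise ValueError("vector dimension does not match header")
    return embeddings


def round_trip_demo():
    # Toy data, purely illustrative
    words = ['king', 'queen']
    vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    with tempfile.TemporaryDirectory() as tmp:
        w2v_path = os.path.join(tmp, 'toy_word2vec.txt')
        glove_path = os.path.join(tmp, 'toy_glove.txt')
        write_text_embeddings(w2v_path, words, vectors, header=True)
        write_text_embeddings(glove_path, words, vectors, header=False)
        return (read_text_embeddings(w2v_path, header=True),
                read_text_embeddings(glove_path, header=False))
```

Calling round_trip_demo() returns two dictionaries, one read from each file; if they compare equal, both layouts carry the same embeddings. This sketch assumes that words contain no spaces, which is also what the text formats themselves rely on.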