torch.tensor([row[2] for row in [x_bounds, y_bounds, z_bounds]]): reproduce it in Python
Date: 2023-10-18 09:04:21 Views: 41
The following code reproduces it:
```
import torch
x_bounds = [1, 2, 3]
y_bounds = [4, 5, 6]
z_bounds = [7, 8, 9]
bounds_tensor = torch.tensor([row[2] for row in [x_bounds, y_bounds, z_bounds]])
print(bounds_tensor)
```
Output:
```
tensor([3, 6, 9])
```
Related questions
What does the following code do?

    def calculate_birds_eye_view_parameters(x_bounds, y_bounds, z_bounds):
        """
        Parameters
        ----------
            x_bounds: Forward direction in the ego-car.
            y_bounds: Sides
            z_bounds: Height

        Returns
        -------
            bev_resolution: Bird's-eye view bev_resolution
            bev_start_position Bird's-eye view first element
            bev_dimension Bird's-eye view tensor spatial dimension
        """
        bev_resolution = torch.tensor([row[2] for row in [x_bounds, y_bounds, z_bounds]])
        bev_start_position = torch.tensor([row[0] + row[2] / 2.0 for row in [x_bounds, y_bounds, z_bounds]])
        bev_dimension = torch.tensor([(row[1] - row[0]) / row[2] for row in [x_bounds, y_bounds, z_bounds]],
                                     dtype=torch.long)

        return bev_resolution, bev_start_position, bev_dimension

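This code defines a function, calculate_birds_eye_view_parameters, that computes the parameters of a bird's-eye-view (BEV) grid. Each of x_bounds, y_bounds, and z_bounds describes one axis (forward, sideways, and height, according to the docstring) and is indexed as row[0], row[1], row[2]. Judging from how the code uses them, these are the lower bound, the upper bound, and the cell size along that axis. The function returns three tensors with three elements each, ordered x, y, z. bev_resolution holds the cell size row[2] for each axis. bev_start_position holds row[0] + row[2] / 2, which is the center of the first cell. bev_dimension holds (row[1] - row[0]) / row[2] converted to an integer tensor (torch.long), which is the number of cells along each axis.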
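To make the arithmetic concrete, here is a minimal torch-free sketch of the same three computations. The function name `bev_parameters` and the bounds are hypothetical values chosen for illustration, plain lists stand in for tensors, and `int()` stands in for the `dtype=torch.long` conversion in the original.

```python
def bev_parameters(x_bounds, y_bounds, z_bounds):
    """Torch-free sketch; each bound is assumed to be [lower, upper, step]."""
    rows = [x_bounds, y_bounds, z_bounds]
    # Cell size along each axis.
    resolution = [row[2] for row in rows]
    # Center of the first cell: lower bound plus half a cell.
    start_position = [row[0] + row[2] / 2.0 for row in rows]
    # Number of cells that fit between the lower and upper bound.
    dimension = [int((row[1] - row[0]) / row[2]) for row in rows]
    return resolution, start_position, dimension


# Hypothetical bounds, for illustration only.
resolution, start_position, dimension = bev_parameters(
    [-50.0, 50.0, 0.5],
    [-50.0, 50.0, 0.5],
    [-10.0, 10.0, 20.0],
)
print(resolution)      # [0.5, 0.5, 20.0]
print(start_position)  # [-49.75, -49.75, 0.0]
print(dimension)       # [200, 200, 1]
```

With these bounds the grid would have 200 x 200 cells in the ground plane and a single cell in height.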
The following code fails at runtime:

    import jieba
    import torch
    from sklearn.metrics.pairwise import cosine_similarity
    from transformers import BertTokenizer, BertModel

    seed_words = ['姓名']

    # Load the Weibo text data
    text_data = []
    with open("output/weibo1.txt", "r", encoding="utf-8") as f:
        for line in f:
            text_data.append(line.strip())

    # Load the BERT model and tokenizer
    tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
    model = BertModel.from_pretrained('bert-base-chinese')

    seed_tokens = ["[CLS]"] + seed_words + ["[SEP]"]
    seed_token_ids = tokenizer.convert_tokens_to_ids(seed_tokens)
    seed_segment_ids = [0] * len(seed_token_ids)

    # Convert to tensors and encode with the BERT model
    seed_token_tensor = torch.tensor([seed_token_ids])
    seed_segment_tensor = torch.tensor([seed_segment_ids])
    with torch.no_grad():
        seed_outputs = model(seed_token_tensor, seed_segment_tensor)
        seed_encoded_layers = seed_outputs[0]

    jieba.load_userdict('data/userdict.txt')

    # Build the privacy word list
    privacy_words = set()
    for text in text_data:
        words = jieba.lcut(text.strip())
        tokens = ["[CLS]"] + words + ["[SEP]"]
        token_ids = tokenizer.convert_tokens_to_ids(tokens)
        segment_ids = [0] * len(token_ids)

        # Convert to tensors and encode with the BERT model
        token_tensor = torch.tensor([token_ids])
        segment_tensor = torch.tensor([segment_ids])
        with torch.no_grad():
            outputs = model(token_tensor, segment_tensor)
            encoded_layers = outputs[0]

        # For each word, compute its similarity to the seed word
        for i in range(1, len(tokens)-1):
            word = tokens[i]
            if word in seed_words:
                continue
            word_tensor = encoded_layers[0][i].reshape(1, -1)
            seed_tensors =seed_encoded_layers[0][i].reshape(1, -1)
            # Compute the similarity between the current Weibo word and the seed word
            sim = cosine_similarity(word_tensor, seed_tensors, dense_output=False)[0].max()
            print(sim, word)
            if sim > 0.5 and len(word) > 1:
                privacy_words.add(word)

    print(privacy_words)

The error message is:

    Traceback (most recent call last):
      File "E:/PyCharm Community Edition 2020.2.2/Project/WordDict/newsim.py", line 397, in <module>
        seed_tensors =seed_encoded_layers[0][i].reshape(1, -1)
    IndexError: index 3 is out of bounds for dimension 0 with size 3.

Please help me fix it.

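According to the error message, the failure occurs when seed_encoded_layers[0] is indexed with i = 3, although that dimension has only 3 entries (valid indices 0 to 2). The seed sequence consists of just [CLS], the seed word, and [SEP], but i is the loop index over the tokens of the Weibo text, which is usually longer. Printing seed_encoded_layers.shape confirms whether the dimensions match what you expect.
It is also worth checking the values of seed_tokens and seed_token_ids. convert_tokens_to_ids maps any token that is not in the vocabulary to the unknown-token id, so multi-character words may not be encoded as intended.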
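The mismatch can be reproduced without loading BERT. In this minimal sketch, plain lists of hypothetical toy tokens stand in for the rows of the two model outputs; only the lengths matter.

```python
# Plain lists stand in for the rows of the two BERT outputs (hypothetical toy data).
seed_rows = ["[CLS]", "seed", "[SEP]"]                   # 3 positions, like seed_encoded_layers[0]
text_rows = ["[CLS]", "w1", "w2", "w3", "w4", "[SEP]"]   # 6 positions, like encoded_layers[0]

failed_at = None
for i in range(1, len(text_rows) - 1):
    try:
        seed_rows[i]  # the original pattern: index the seed with the text's loop index
    except IndexError:
        failed_at = i
        break

print(failed_at)  # 3: indices 1 and 2 exist in seed_rows, index 3 does not

# A fixed seed index does not depend on the length of the text.
seed_vector = seed_rows[0]
pairs = [(text_rows[i], seed_vector) for i in range(1, len(text_rows) - 1)]
print(len(pairs))  # 4
```

The loop fails at the same index as in the traceback, and the version with a fixed seed index pairs every inner token of the text with the same seed entry.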
Here is the revised code, which indexes the seed output with a fixed position instead of i:
```
import jieba
import torch
from sklearn.metrics.pairwise import cosine_similarity
from transformers import BertTokenizer, BertModel

seed_words = ['姓名']

# Load the Weibo text data
text_data = []
with open("output/weibo1.txt", "r", encoding="utf-8") as f:
    for line in f:
        text_data.append(line.strip())

# Load the BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
model = BertModel.from_pretrained('bert-base-chinese')

seed_tokens = ["[CLS]"] + seed_words + ["[SEP]"]
seed_token_ids = tokenizer.convert_tokens_to_ids(seed_tokens)
seed_segment_ids = [0] * len(seed_token_ids)

# Convert to tensors and encode with the BERT model.
# Segment ids are passed by keyword: the second positional argument
# of BertModel is attention_mask, not token_type_ids.
seed_token_tensor = torch.tensor([seed_token_ids])
seed_segment_tensor = torch.tensor([seed_segment_ids])
with torch.no_grad():
    seed_outputs = model(seed_token_tensor, token_type_ids=seed_segment_tensor)
    seed_encoded_layers = seed_outputs[0]

jieba.load_userdict('data/userdict.txt')

# Build the privacy word list
privacy_words = set()
for text in text_data:
    words = jieba.lcut(text.strip())
    tokens = ["[CLS]"] + words + ["[SEP]"]
    token_ids = tokenizer.convert_tokens_to_ids(tokens)
    segment_ids = [0] * len(token_ids)

    # Convert to tensors and encode with the BERT model
    token_tensor = torch.tensor([token_ids])
    segment_tensor = torch.tensor([segment_ids])
    with torch.no_grad():
        outputs = model(token_tensor, token_type_ids=segment_tensor)
        encoded_layers = outputs[0]

    # For each word, compute its similarity to the seed word
    for i in range(1, len(tokens) - 1):
        word = tokens[i]
        if word in seed_words:
            continue
        word_tensor = encoded_layers[0][i].reshape(1, -1)
        # Changed: use a fixed index (0, the [CLS] position of the seed
        # sequence) instead of the loop index i; index 1 would be the
        # position of the seed word itself.
        seed_tensors = seed_encoded_layers[0][0].reshape(1, -1)
        # Compute the similarity between the current Weibo word and the seed word
        sim = cosine_similarity(word_tensor, seed_tensors, dense_output=False)[0].max()
        print(sim, word)
        if sim > 0.5 and len(word) > 1:
            privacy_words.add(word)

print(privacy_words)
```
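For reference, the score that `cosine_similarity` returns for a pair of vectors is their dot product divided by the product of their lengths. The dependency-free sketch below illustrates that formula on small hypothetical vectors; the helper name `cosine` is made up for this example and is not part of scikit-learn.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


print(cosine([1.0, 0.0], [2.0, 0.0]))  # 1.0 (same direction)
print(cosine([1.0, 0.0], [0.0, 3.0]))  # 0.0 (orthogonal)
print(cosine([3.0, 4.0], [4.0, 3.0]))  # 0.96
```

Only the direction of the vectors matters, not their length, so the 0.5 threshold in the script selects words whose vectors point in a broadly similar direction to the seed vector.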