The following code collects grant information for a list of tables:

```python
table_dict = []
for table in targetTables:
    grants = spark.sql(f'show GRANT on {table}').collect()
    for grant in grants:
        table_dict.append(f"{json_dict['orgId']}|{''}|{table}|{grant['Principal']}|{grant['ActionType']}|{grant['ObjectType']}|{grant['ObjectKey']}")
```

Please modify it to print the total number of tables in `targetTables` and which table the loop is currently on.
Posted: 2023-12-20 14:05:27
You can use `enumerate` to track the loop index and print the progress inside the loop. The modified code:
```python
table_dict = []
table_count = len(targetTables)
for i, table in enumerate(targetTables):
    grants = spark.sql(f'show GRANT on {table}').collect()
    for grant in grants:
        table_dict.append(f"{json_dict['orgId']}|{''}|{table}|{grant['Principal']}|{grant['ActionType']}|{grant['ObjectType']}|{grant['ObjectKey']}")
    print(f"Processed {i + 1}/{table_count} tables")
```
This way, on every iteration the loop prints how many tables have been processed out of the total.
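As a minimal, Spark-free sketch of the same `enumerate`-based progress pattern (the table names below are made up for illustration and are not part of the original code):

```python
# Hypothetical list of table names standing in for targetTables
targetTables = ["sales.orders", "sales.customers", "hr.employees"]

table_count = len(targetTables)
progress = []
for i, table in enumerate(targetTables):
    # ... per-table work (e.g. the SHOW GRANT query) would go here ...
    progress.append(f"Processed {i + 1}/{table_count} tables")

print("\n".join(progress))
```

Because `enumerate` yields a zero-based index alongside each element, `i + 1` gives the human-friendly one-based count without a separate counter variable.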
Related questions
Add comments to the following code:

```python
def merge_accumulate_client_update(self, list_num_proc, list_state_dict, lr):
    total_num_proc = sum(list_num_proc)
    # merged_state_dict = dict()
    dict_keys = list_state_dict[0].keys()
    for state_dict in list_state_dict[1:]:
        assert state_dict.keys() == dict_keys
    # accumulate extra sgrad and remove from state_dict
    if self.use_adaptive and self.is_adj_round():
        prefix = "extra."
        for state_dict in list_state_dict:
            del_list = []
            for key, param in state_dict.items():
                if key[:len(prefix)] == prefix:
                    sgrad_key = key[len(prefix):]
                    mask_0 = self.model.get_mask_by_name(sgrad_key) == 0.
                    dense_sgrad = torch.zeros_like(mask_0, dtype=torch.float)
                    dense_sgrad.masked_scatter_(mask_0, param)
                    # no need to divide by lr
                    self.control.accumulate(sgrad_key, dense_sgrad)
                    del_list.append(key)
            for del_key in del_list:
                del state_dict[del_key]
```
```python
def merge_accumulate_client_update(self, list_num_proc, list_state_dict, lr):
    # Total number of processes across all clients
    total_num_proc = sum(list_num_proc)
    # merged_state_dict = dict()
    dict_keys = list_state_dict[0].keys()
    # Check that all state dicts have the same keys
    for state_dict in list_state_dict[1:]:
        assert state_dict.keys() == dict_keys
    # Accumulate extra sgrad entries and remove them from the state dicts
    if self.use_adaptive and self.is_adj_round():
        prefix = "extra."
        for state_dict in list_state_dict:
            del_list = []
            for key, param in state_dict.items():
                # Check whether the key starts with 'extra.'
                if key[:len(prefix)] == prefix:
                    # Strip the prefix to recover the corresponding sgrad key
                    sgrad_key = key[len(prefix):]
                    # Boolean mask marking positions where the named mask is zero
                    mask_0 = self.model.get_mask_by_name(sgrad_key) == 0.
                    # Build a dense zero tensor and scatter param's values
                    # into the positions selected by the mask
                    dense_sgrad = torch.zeros_like(mask_0, dtype=torch.float)
                    dense_sgrad.masked_scatter_(mask_0, param)
                    # Accumulate the dense sgrad (no need to divide by lr)
                    self.control.accumulate(sgrad_key, dense_sgrad)
                    # Mark the key for deletion
                    del_list.append(key)
            # Remove the marked keys from the state dict
            for del_key in del_list:
                del state_dict[del_key]
```
This code implements a `merge_accumulate_client_update` method whose main job is to merge and accumulate the state dicts in `list_state_dict`. Notes on the code:
- `total_num_proc`: the sum of the process counts of all clients.
- `dict_keys`: the keys of the first state dict, used as the reference set.
- All remaining state dicts are asserted to have the same keys.
- If adaptive mode is on and this is an adjustment round, the extra `sgrad` entries are accumulated and removed from each state dict.
- `prefix`: the prefix (`"extra."`) that marks extra `sgrad` entries.
- For each state dict, iterate over its keys and parameters.
- If a key starts with `prefix`, strip the prefix to recover the corresponding `sgrad` key.
- Build a boolean mask of the positions where the model's named mask is zero.
- Create a dense zero tensor and scatter the parameter's values into the masked positions.
- Accumulate the dense `sgrad` without dividing by `lr`.
- Add the key to the deletion list.
- Finally, delete the collected keys from the state dict.
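The masked-scatter step above can be illustrated without PyTorch. `masked_scatter_` fills the positions where the mask is `True` with consecutive values taken from the source tensor, leaving the rest untouched; a plain-Python sketch of that 1-D semantics (the function and variable names here are illustrative, not part of the original code):

```python
def masked_scatter(mask, source):
    """Pure-Python sketch of torch.Tensor.masked_scatter_ on 1-D data:
    positions where mask is True are filled with consecutive values
    drawn from source; all other positions stay zero."""
    result = [0.0] * len(mask)
    src = iter(source)
    for i, m in enumerate(mask):
        if m:
            result[i] = next(src)
    return result

# mask_0 marks the pruned (zeroed-out) positions of a sparse parameter
mask_0 = [True, False, True, True, False]
# param holds one gradient value per pruned position, in order
param = [0.5, -1.0, 2.0]
dense_sgrad = masked_scatter(mask_0, param)
```

This is why the original code first allocates `dense_sgrad` as zeros: the scatter only writes the masked positions, so the unmasked ones keep their zero value.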
How do I make the number of training samples and the number of labels consistent in the following code?

```python
# Tokenize the fine-tuning data
train_seg = []
for line in train:
    seg_list = seg.cut(line.strip())
    train_seg.append(' '.join(seg_list))
#print(train_seg)
# Load the label data
with open("D:\用来微调的模型\分词后贵港市港南区:高质量推动农业机械化.txt", 'r', encoding='utf-8') as f:
    label = f.readlines()
# Convert the labels to numeric ids
label_dict = {}
num_labels = 0
for line in label:
    if line.strip() not in label_dict:
        label_dict[line.strip()] = num_labels
        num_labels += 1
y_train = [label_dict[line.strip()] for line in label]
```
You can use the following code to make the training data and label data the same length:
```python
# Tokenize the training data
train_seg = []
for line in train:
    seg_list = seg.cut(line.strip())
    train_seg.append(' '.join(seg_list))

# Load the label data (a raw string avoids backslash-escape issues in the path)
with open(r"D:\用来微调的模型\分词后贵港市港南区:高质量推动农业机械化.txt", 'r', encoding='utf-8') as f:
    label = f.readlines()

# Convert the labels to numeric ids
label_dict = {}
num_labels = 0
for line in label:
    if line.strip() not in label_dict:
        label_dict[line.strip()] = num_labels
        num_labels += 1
y_train = [label_dict[line.strip()] for line in label]

# Truncate the longer list so the two have the same length
if len(train_seg) > len(y_train):
    train_seg = train_seg[:len(y_train)]
else:
    y_train = y_train[:len(train_seg)]
```
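An equivalent and arguably simpler approach is to pair the two lists with `zip`, which stops at the shorter sequence automatically. A minimal sketch with made-up data (the tokenized texts and labels below are illustrative only):

```python
# Hypothetical tokenized texts and numeric labels of unequal length
train_seg = ["文本 一", "文本 二", "文本 三"]
y_train = [0, 1]

# zip truncates to the shorter sequence, so the rebuilt lists always match
pairs = list(zip(train_seg, y_train))
train_seg = [text for text, _ in pairs]
y_train = [lab for _, lab in pairs]
```

Note that truncation only hides a length mismatch; if the counts differ because the text and label files are out of sync, it is worth checking the source files rather than silently dropping samples.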