```
for start_city, end_cities in around.items():
    if start_city in grouped.groups:
        if start_city not in add_cities:
            orders_start = grouped.get_group(start_city)
            add_cities.append(start_city)
            if end_cities:
                for end_city in end_cities:
                    if end_city in grouped.groups and end_city not in add_cities:
                        orders_end = grouped.get_group(end_city)
                        add_cities.append(end_city)
                        orders_around = pd.concat([orders_start, orders_end])
                        result = pd.concat([result, orders_around])
    else:
        if end_cities:
            for end_city in end_cities:
                if end_city in grouped.groups and end_city not in add_cities:
                    orders_end = grouped.get_group(end_city)
                    add_cities.append(end_city)
                    result = pd.concat([result, orders_end])
```
This code leaves duplicate rows in `result`. Where is the error?
Posted: 2024-02-10 14:08:26 Views: 63
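A likely cause is that `orders_start` is concatenated into `result` more than once. `orders_around = pd.concat([orders_start, orders_end])` is built for each neighboring city, so if it runs inside the `for end_city` loop, the start city's orders are appended again for every neighbor that passes the check. Orders that are genuinely identical in the source data would also show up as duplicate rows. A quick workaround is to de-duplicate while concatenating, using `pandas.DataFrame.drop_duplicates()`, for example: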
```
orders_around = pd.concat([orders_start, orders_end]).drop_duplicates()
result = pd.concat([result, orders_around]).drop_duplicates()
```
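With this change, `result` no longer contains rows that are identical in every column. Note that `drop_duplicates()` compares all columns by default, so it also drops two distinct orders that happen to share every value; pass `subset=` with a unique key column if that matters.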
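As an alternative to de-duplicating after the fact, each city's group can be collected at most once and concatenated in a single call. The following is only a minimal sketch: the helper name `collect_orders` and the sample `orders` and `around` data are assumptions for illustration, and `around` is assumed to map each start city to a list of neighboring cities.

```python
import pandas as pd


def collect_orders(orders, around, key="起始城市"):
    """Gather each city's orders exactly once (hypothetical helper)."""
    grouped = orders.groupby(key)
    seen = set()   # cities whose orders were already collected
    frames = []    # one DataFrame per collected city
    for start_city, end_cities in around.items():
        # Visit the start city first, then its neighbors (if any).
        for city in [start_city, *(end_cities or [])]:
            if city in grouped.groups and city not in seen:
                seen.add(city)
                frames.append(grouped.get_group(city))
    if not frames:
        return orders.iloc[0:0]  # empty frame with the same columns
    return pd.concat(frames, ignore_index=True)


# Made-up sample data, for illustration only.
orders = pd.DataFrame({
    "订单编号": [1, 2, 3, 4],
    "起始城市": ["A", "A", "B", "C"],
    "目的城市": ["X", "Y", "Z", "X"],
})
around = {"A": ["B", "C"], "B": ["A"], "D": ["C"]}
result = collect_orders(orders, around)
```

Because every group is appended once and `pd.concat` is called once, no row can be added twice, and the repeated copying caused by growing `result` inside the loop is avoided.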
Related questions
```
import pandas as pd

# Assume orders is a known orders DataFrame and around gives the city near each start city
orders = pd.DataFrame({'订单编号': order_code, '起始城市': s, '目的城市': e})
grouped = orders.groupby('起始城市')
result = pd.DataFrame(columns=['订单编号', '起始城市', '目的城市'])
for start_city, end_city in around.items():
    # Find the orders for this city and its nearby city
    orders_around = pd.concat([grouped.get_group(start_city), grouped.get_group(end_city)])
    # Merge the orders into a new DataFrame
    result = pd.concat([result, orders_around])
# Save the merged data to a file
result.to_csv('new_orders.csv', index=False)
```
Add a check to this code that the order data contains orders whose start city is the given starting location.
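To check whether any orders start from the current start city, filter `orders` on the `起始城市` (start city) column using the loop variable `start_city`, not `s`, which is the whole input column:
```
orders_starting_from_s = orders[orders['起始城市'] == start_city]
```
Unlike `grouped.get_group(start_city)`, which raises `KeyError` for a city with no orders, this filter simply returns an empty DataFrame. Use it in place of the `get_group(start_city)` call when merging, so that the start city's orders are not added twice:
```
orders_around = pd.concat([orders_starting_from_s, grouped.get_group(end_city)])
```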
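If the goal is to skip cities that have no orders at all, one option is to test membership in `grouped.groups` before calling `get_group`. The sketch below assumes, as in the question, that `around` maps each start city to a single nearby city; the sample data is made up for illustration.

```python
import pandas as pd

# Made-up sample data, for illustration only.
orders = pd.DataFrame({
    "订单编号": [101, 102, 103],
    "起始城市": ["A", "B", "A"],
    "目的城市": ["B", "C", "C"],
})
around = {"A": "B", "C": "A", "D": "E"}  # start city -> one nearby city (assumed shape)

grouped = orders.groupby("起始城市")
frames = []
for start_city, end_city in around.items():
    # Only proceed when at least one order starts from start_city.
    if start_city not in grouped.groups:
        continue
    frames.append(grouped.get_group(start_city))
    # The nearby city may have no orders either, so check it too.
    if end_city in grouped.groups:
        frames.append(grouped.get_group(end_city))

columns = ["订单编号", "起始城市", "目的城市"]
result = pd.concat(frames) if frames else pd.DataFrame(columns=columns)
```

Here `"C"` and `"D"` are skipped because no order starts from them, so `get_group` is never called with a missing key.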
```
import copy


def parse_constellation_from_lla():
    # data_folder_path, constellation_name, sp_utils, sp_lla_trace and the
    # write_* helpers are defined elsewhere in the original project.
    lla_data_filename = data_folder_path + constellation_name + '-Current-Constellation-LLA.txt'
    satellite_trace_grouped_by_time = {}
    months = sp_utils.sp_month_map()
    id = 0
    with open(lla_data_filename, errors='ignore') as file:
        lla_data_list = []
        lla_data_per_satellite_list = []
        for line in file:
            # LLA location data of each satellite starts with a line with "Time (UTCG)"
            if "Time (UTCG)" in line:
                # save LLA data already parsed, and start a new list for next satellite
                if len(lla_data_per_satellite_list):
                    print("Save %s samples for satellite %s" % (str(len(lla_data_per_satellite_list)), str(id)))
                    lla_data_list.append(copy.deepcopy(lla_data_per_satellite_list))
                    write_satellite_lla_to_csv(lla_data_per_satellite_list, id)
                    lla_data_per_satellite_list.clear()
                    id = id + 1
                continue
            # Time (UTCG)  Lat (deg)  Lon (deg)  Alt (km)  Lat Rate (deg/sec)  Lon Rate (deg/sec)  Alt Rate (km/sec)
            # 7 Jul 2020 19:00:00.000  -52.162  166.811  570.070856  -0.013114  0.095196  0.005696
            line = line.split()
            if len(line) == 10:
                sample = sp_lla_trace()
                sample.time = line[2] + "-" + str(months[line[1]]) + "-" + line[0] + "-" + line[3]
                sample.time = sample.time.replace(":", "-")
                sample.time = sample.time.replace(".000", "")
                sample.latitude = line[4]
                sample.longitude = line[5]
                sample.attitude = line[6]  # holds the altitude (km); attribute name kept as in the original
                sample.id = id
                lla_data_per_satellite_list.append(copy.deepcopy(sample))
                # append satellite LLA location to a certain time slot.
                if sample.time not in satellite_trace_grouped_by_time.keys():
                    satellite_trace_grouped_by_time[sample.time] = []
                satellite_trace_grouped_by_time[sample.time].append(copy.deepcopy(sample))
        # save the last satellite.
        if len(lla_data_per_satellite_list):
            print("Save %s samples for satellite %s" % (str(len(lla_data_per_satellite_list)), str(id)))
            lla_data_list.append(copy.deepcopy(lla_data_per_satellite_list))
            write_satellite_lla_to_csv(lla_data_per_satellite_list, id)
            lla_data_per_satellite_list.clear()
            id = id + 1  # count the last satellite so the total below is correct
    print("Extract LLA location of %s satellites in total." % str(id))
    # save LLA location trace grouped by time slots
    all_time_slots = satellite_trace_grouped_by_time.keys()
    print("Save LLA location by time slot.")
    for time_slot in all_time_slots:
        write_satellite_lla_by_time(time_slot, satellite_trace_grouped_by_time[time_slot])
        print("Saving LLA location in %s." % time_slot)
    print("LLA location saved to files.")
```
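This code parses latitude, longitude, and altitude (LLA) trace data for the satellites of a constellation. It first builds the input filename from the data folder path and the constellation name, then sets up a dictionary that groups samples by timestamp, a month-name lookup table, and a satellite `id` counter starting at 0. It opens the LLA file with `errors='ignore'`, so undecodable bytes are skipped, and reads it line by line. Each line containing "Time (UTCG)" marks the start of a new satellite, and each 10-field data line becomes a sample that is appended both to the current satellite's list and to the list for its time slot.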
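The per-line parsing described above can be sketched as follows. This is a simplified stand-in rather than the original code: `parse_lla_line`, the `LlaSample` named tuple, and the `MONTHS` table are hypothetical replacements for `sp_lla_trace` and `sp_utils.sp_month_map()`, whose definitions are not shown. The month map is assumed to yield plain integers, and the values are converted to floats here although the original keeps them as strings.

```python
from collections import defaultdict
from typing import NamedTuple, Optional

# Assumed month table; the original uses sp_utils.sp_month_map(), not shown.
MONTHS = {name: i + 1 for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}


class LlaSample(NamedTuple):
    sat_id: int
    time: str
    latitude: float
    longitude: float
    altitude: float


def parse_lla_line(line: str, sat_id: int) -> Optional[LlaSample]:
    """Parse one data row; return None for headers and other lines."""
    fields = line.split()
    if len(fields) != 10 or fields[1] not in MONTHS:
        return None
    day, month, year, clock = fields[:4]
    # Build the same "year-month-day-hh-mm-ss" key as the original code.
    stamp = f"{year}-{MONTHS[month]}-{day}-{clock}"
    stamp = stamp.replace(":", "-").replace(".000", "")
    try:
        lat, lon, alt = (float(x) for x in fields[4:7])
    except ValueError:
        return None
    return LlaSample(sat_id, stamp, lat, lon, alt)


row = "7 Jul 2020 19:00:00.000 -52.162 166.811 570.070856 -0.013114 0.095196 0.005696"
sample = parse_lla_line(row, sat_id=0)

# Group samples by time slot, as satellite_trace_grouped_by_time does.
by_time = defaultdict(list)
if sample is not None:
    by_time[sample.time].append(sample)
```

Using `defaultdict(list)` removes the need for the explicit "key not in dict" check, and an immutable named tuple makes the `copy.deepcopy` calls unnecessary in this sketch.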