基于PyTorch的MADDPG多智能体强化学习复现

版权申诉
5星 · 超过95%的资源 14 下载量 160 浏览量 更新于2024-11-07 1 收藏 1.8MB RAR 举报
资源摘要信息:"can_work_MADDPG.rar" 标题解析: 文件标题"can_work_MADDPG.rar"表明这是一个可以通过解压缩进行使用的资源包。"MADDPG"指的是多智能体深度确定性策略梯度(Multi-Agent Deterministic Policy Gradient),这是一类用于解决多智能体环境下的强化学习算法。"rar"是一种文件压缩格式,通常用于减少文件大小以便于存储和传输。在这个上下文中,它包含了实现多智能体强化学习算法的代码和相关文件。 描述解析: 描述说明这个资源包是论文《Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments》(多智能体在混合合作与竞争环境下的Actor-Critic算法)的PyTorch实现。这篇论文由OpenAI发表,并提出了MADDPG算法,这是一个结合了Actor-Critic架构的深度强化学习算法。MADDPG算法的目标是在具有多个智能体的环境中工作,这些智能体可以同时进行合作与竞争。描述中提到的“开源环境Multi-Agent Particle Environment”是一个由作者提供的用于训练和测试多智能体学习算法的模拟环境。运行描述中提到的"main.py"文件,是启动该环境并开始训练或测试模型的入口脚本。 标签解析: - MADDPG(多智能体深度确定性策略梯度):指的是一种用于多智能体系统学习的算法,它结合了深度学习和策略梯度方法,适用于复杂的多智能体环境。 - 多智能体深度强化学习:是深度强化学习的一个子领域,专注于拥有多个智能体的系统,这些智能体需要在环境交互中学习有效的策略。 - 强化学习:是机器学习的一个分支,涉及到智能体通过试错的方式在环境中学习如何取得最大奖励。 - Actor-Critic:一种强化学习算法架构,其中“Actor”负责决策制定,“Critic”负责评估当前策略的价值。 - 深度学习:一种通过人工神经网络来学习和改进任务的机器学习方法,通常用于非结构化数据。 文件名称列表: MADDPG-master:这一项表明资源包包含了一个名为"MADDPG-master"的文件夹,这可能是源代码的根目录。在版本控制系统如Git中,"master"通常表示主分支,意味着这个文件夹包含的是主版本的代码。通过查看这个文件夹,用户可以获取到实现MADDPG算法的完整代码库,包括初始化环境、训练模型、评估性能等相关模块。 综合上述信息,MADDPG算法是一个强大的工具,它通过深度学习和策略梯度方法,使得多个智能体能在复杂的多智能体环境中学会合作与竞争。这个算法尤其适用于那些需要多个代理共同完成任务的情况,例如在自动驾驶、机器人协作、智能电网等领域。MADDPG算法的PyTorch实现,为研究者和开发者提供了一个实验和创新的平台,他们可以在这个平台上进一步探索和改进多智能体学习方法。

优化这段代码: IF VR(v_alarm1).0 <> ax_alarm.ax_dial THEN VR(v_alarm1).0 = ax_alarm.ax_dial IF VR(v_alarm1).1 <> ax_alarm.ax_scr1_updown THEN VR(v_alarm1).1 = ax_alarm.ax_scr1_updown IF VR(v_alarm1).2 <> ax_alarm.ax_scr1_halftone THEN VR(v_alarm1).2 = ax_alarm.ax_scr1_halftone IF VR(v_alarm1).3 <> ax_alarm.ax_scr1_scraper THEN VR(v_alarm1).3 = ax_alarm.ax_scr1_scraper IF VR(v_alarm1).4 <> ax_alarm.ax_scr2_updown THEN VR(v_alarm1).4 = ax_alarm.ax_scr2_updown IF VR(v_alarm1).5 <> ax_alarm.ax_scr2_halftone THEN VR(v_alarm1).5 = ax_alarm.ax_scr2_halftone IF VR(v_alarm1).6 <> ax_alarm.ax_scr2_scraper THEN VR(v_alarm1).6 = ax_alarm.ax_scr2_scraper IF VR(v_alarm1).7 <> ax_alarm.ax_scr3_updown THEN VR(v_alarm1).7 = ax_alarm.ax_scr3_updown IF VR(v_alarm1).8 <> ax_alarm.ax_scr3_halftone THEN VR(v_alarm1).8 = ax_alarm.ax_scr3_halftone IF VR(v_alarm1).9 <> ax_alarm.ax_scr3_scraper THEN VR(v_alarm1).9 = ax_alarm.ax_scr3_scraper IF VR(v_alarm1).10 <> ax_alarm.ax_goin_spin THEN VR(v_alarm1).10 = ax_alarm.ax_goin_spin IF VR(v_alarm1).11 <> ax_alarm.ax_output_spin THEN VR(v_alarm1).11 = ax_alarm.ax_output_spin IF VR(v_alarm1).12 <> ax_alarm.ax_tl THEN VR(v_alarm1).12 = ax_alarm.ax_tl IF VR(v_alarm1).13 <> ax_alarm.ax_work1 THEN VR(v_alarm1).13 = ax_alarm.ax_work1 IF VR(v_alarm1).14 <> ax_alarm.ax_work2 THEN VR(v_alarm1).14 = ax_alarm.ax_work2 IF VR(v_alarm1).15 <> ax_alarm.ax_work3 THEN VR(v_alarm1).15 = ax_alarm.ax_work3 IF VR(v_alarm2).0 <> ax_alarm.ax_work4 THEN VR(v_alarm2).0 = ax_alarm.ax_work4 IF VR(v_alarm2).1 <> ax_alarm.ax_work5 THEN VR(v_alarm2).1 = ax_alarm.ax_work5 IF VR(v_alarm2).2 <> ax_alarm.ax_work6 THEN VR(v_alarm2).2 = ax_alarm.ax_work6 IF VR(v_alarm2).3 <> ax_alarm.ax_work7 THEN VR(v_alarm2).3 = ax_alarm.ax_work7 IF VR(v_alarm2).4 <> ax_alarm.ax_work8 THEN VR(v_alarm2).4 = ax_alarm.ax_work8 IF VR(v_alarm2).5 <> ax_alarm.ax_work9 THEN VR(v_alarm2).5 = ax_alarm.ax_work9 IF VR(v_alarm2).6 <> ax_alarm.ax_work10 THEN VR(v_alarm2).6 = ax_alarm.ax_work10 IF VR(v_warn1).0 <> ax_warn.ax_dial THEN VR(v_warn1).0 = ax_warn.ax_dial IF VR(v_warn1).1 <> ax_warn.ax_scr1_updown THEN VR(v_warn1).1 = ax_warn.ax_scr1_updown IF VR(v_warn1).2 <> ax_warn.ax_scr1_halftone THEN VR(v_warn1).2 = ax_warn.ax_scr1_halftone IF VR(v_warn1).3 <> ax_warn.ax_scr1_scraper THEN VR(v_warn1).3 = ax_warn.ax_scr1_scraper IF VR(v_warn1).4 <> ax_warn.ax_scr2_updown THEN VR(v_warn1).4 = ax_warn.ax_scr2_updown IF VR(v_warn1).5 <> ax_warn.ax_scr2_halftone THEN VR(v_warn1).5 = ax_warn.ax_scr2_halftone IF VR(v_warn1).6 <> ax_warn.ax_scr2_scraper THEN VR(v_warn1).6 = ax_warn.ax_scr2_scraper IF VR(v_warn1).7 <> ax_warn.ax_scr3_updown THEN VR(v_warn1).7 = ax_warn.ax_scr3_updown IF VR(v_warn1).8 <> ax_warn.ax_scr3_halftone THEN VR(v_warn1).8 = ax_warn.ax_scr3_halftone IF VR(v_warn1).9 <> ax_warn.ax_scr3_scraper THEN VR(v_warn1).9 = ax_warn.ax_scr3_scraper IF VR(v_warn1).10 <> ax_warn.ax_goin_spin THEN VR(v_warn1).10 = ax_warn.ax_goin_spin IF VR(v_warn1).11 <> ax_warn.ax_output_spin THEN VR(v_warn1).11 = ax_warn.ax_output_spin IF VR(v_warn1).12 <> ax_warn.ax_tl THEN VR(v_warn1).12 = ax_warn.ax_tl IF VR(v_warn1).13 <> ax_warn.ax_work1 THEN VR(v_warn1).13 = ax_warn.ax_work1 IF VR(v_warn1).14 <> ax_warn.ax_work2 THEN VR(v_warn1).14 = ax_warn.ax_work2 IF VR(v_warn1).15 <> ax_warn.ax_work3 THEN VR(v_warn1).15 = ax_warn.ax_work3 IF VR(v_warn2).0 <> ax_warn.ax_work4 THEN VR(v_warn2).0 = ax_warn.ax_work4 IF VR(v_warn2).1 <> ax_warn.ax_work5 THEN VR(v_warn2).1 = ax_warn.ax_work5 IF VR(v_warn2).2 <> ax_warn.ax_work6 THEN VR(v_warn2).2 = ax_warn.ax_work6 IF VR(v_warn2).3 <> ax_warn.ax_work7 THEN VR(v_warn2).3 = ax_warn.ax_work7 IF VR(v_warn2).4 <> ax_warn.ax_work8 THEN VR(v_warn2).4 = ax_warn.ax_work8 IF VR(v_warn2).5 <> ax_warn.ax_work9 THEN VR(v_warn2).5 = ax_warn.ax_work9 IF VR(v_warn2).6 <> ax_warn.ax_work10 THEN VR(v_warn2).6 = ax_warn.ax_work10

2023-03-08 上传

为什么下面的sql语句会输出重复的结果:SELECT tp.parent_production_orders AS parent_production_orders, tp.production_orders AS production_orders, tp.work_order AS work_order, tp.contract AS contract, tp.sbbh AS sbbh, tp.batch_num AS batch_num, tp.product_code AS product_code, tp.product_number AS product_number, tp.product_name AS product_name, to_char( middle.create_date, 'yyyy-mm-dd' ) AS issued_date, to_char( to_timestamp( tp.delivery_time / 1000 ), 'yyyy-mm-dd' ) AS delivery_time, middle.line_code AS work_area_code, middle.line_name AS work_area_name, tp.workorder_number AS workorder_number, tp.complete_number AS complete_number, tp.part_unit AS part_unit, middle.work_time_type AS work_time_type, middle.process_time AS process_time, CASE WHEN sc.totalSubmitHours IS NULL THEN 0 ELSE sc.totalSubmitHours END AS submit_work_hours, CASE WHEN middle.process_time > 0 AND sc.totalSubmitHours IS NOT NULL THEN round( ( sc.totalSubmitHours / middle.process_time ), 2 ) * 100 ELSE 0 END plan_achievement_rate, CASE WHEN sc.totalSubmitHours IS NULL THEN 0 ELSE round( CAST ( sc.totalSubmitHours AS NUMERIC ) / CAST ( 60 AS NUMERIC ), 1 ) END AS submit_work_hours_h, round( CAST ( middle.process_time AS NUMERIC ) / CAST ( 60 AS NUMERIC ), 1 ) AS process_time_h, pinfo.material_channel AS material_channel FROM hm_model_work_order_report_middle middle LEFT JOIN hm_model_trc_plan tp ON middle.work_order = tp.work_order LEFT JOIN ( SELECT oro.work_order AS orderNo, oro.work_area_code AS lineCode, SUM ( submit_work_hours ) AS totalSubmitHours, '自制' AS workHourType FROM hm_model_trc_order_report_operation_u orou LEFT JOIN hm_model_trc_order_report_operation oro ON orou.work_order_process_id = oro.ID WHERE orou.work_order_process_id IS NOT NULL AND oro.work_area_code IS NOT NULL GROUP BY oro.work_order, oro.work_area_code UNION all SELECT ohs.work_order_no AS orderNo, ohs.line_code AS lineCode, SUM ( receiving_hour ) AS totalSubmitHours, '外委' AS workHourType FROM hm_model_outsourcing_hour_statistics ohs GROUP BY ohs.work_order_no, ohs.line_code ) sc ON middle.work_order = sc.orderNo AND middle.line_code = sc.lineCode AND middle.work_time_type = sc.workHourType LEFT JOIN hm_model_part_info AS pinfo ON tp.product_number = pinfo.part_code WHERE middle.process_time > 0 AND tp.delivery_time IS NOT NULL AND tp.production_orders LIKE'FJ2023051100286' ORDER BY to_char( to_timestamp( tp.delivery_time / 1000 ), 'yyyy-mm-dd' ) DESC, tp.parent_production_orders DESC, tp.node_level ASC

2023-06-06 上传