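Intelligent Power Electronics Design: A Collaborative Framework of Deep Reinforcement Learning and Large Language Models with Applications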

Chen Yu (陈宇); Shang Yi (商毅); Mo Yuanhao (莫远豪); Jin Xiang (金翔)

doi:10.11985/2025.05.003


Abstract

Abstract: Power electronics design is the starting point of power electronic equipment development, and making it intelligent can significantly improve research and development efficiency while reducing costs. However, power electronics design tasks are inherently multi-step decision problems, which places higher demands on algorithm design. This paper proposes a collaborative design method that integrates deep reinforcement learning (DRL) with large language models (LLMs). The method first uses an LLM to parse design requirements written in natural language and to generate initial design states for DRL training. DRL's reward-driven mechanism then performs trial-and-error iterations on these initial states until it obtains a design that maximizes the reward, that is, one that fully satisfies the design requirements and rules. The framework lets the two techniques complement each other: it addresses the LLM's insufficient vertical-domain knowledge and not fully accurate outputs, as well as DRL's large search space, low training efficiency, and difficult convergence. The framework is applied to two typical design problems, power electronics topology generation and circuit layout and routing, and the design results provide preliminary validation of its effectiveness. How to optimize the AI algorithms within this framework, how to match algorithms to tasks, and what the algorithms' limitations are merit further research.
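The two-stage flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a single integer design parameter, replaces the LLM with a hypothetical regex-based stub (`parse_requirements`), and substitutes tabular Q-learning for deep reinforcement learning. All function names, the reward shape, and the hyperparameters are illustrative assumptions.

```python
import random
import re


def parse_requirements(text):
    """Hypothetical stand-in for the LLM step: read a target value from the
    requirement text and propose a rough initial design state near it."""
    target = int(re.search(r"(\d+)", text).group(1))
    initial = max(0, target - 3)  # deliberately imprecise starting point
    return target, initial


def reward(state, target):
    """Assumed reward: 1.0 when the requirement is met, else a distance penalty."""
    return 1.0 if state == target else -0.1 * abs(state - target)


def train(target, initial, n_states=21, episodes=300, steps=20,
          alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning (a simple substitute for DRL) over actions -1 / +1,
    with every episode starting from the proposed initial state."""
    rng = random.Random(seed)
    moves = (-1, +1)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = initial
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(2)  # explore
            else:
                a = max((0, 1), key=lambda i: q[s][i])  # exploit
            s2 = min(n_states - 1, max(0, s + moves[a]))
            r = reward(s2, target)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if s == target:
                break
    return q


def greedy_design(q, initial, target, max_steps=20):
    """Follow the learned greedy policy from the initial state."""
    s = initial
    for _ in range(max_steps):
        if s == target:
            break
        a = max((0, 1), key=lambda i: q[s][i])
        s = min(len(q) - 1, max(0, s + (-1, +1)[a]))
    return s


target, initial = parse_requirements("design parameter should equal 12")
q_table = train(target, initial)
final_state = greedy_design(q_table, initial, target)
```

Starting each episode from the state proposed in the first stage, rather than from a random point, is what keeps the search short in this toy. That mirrors, in a much simplified form, the abstract's claim that the LLM-generated initial state eases DRL's search and convergence burden.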
