中国农业科技导报 (Journal of Agricultural Science and Technology) ›› 2023, Vol. 25 ›› Issue (4): 100-109. DOI: 10.13304/j.nykjdb.2023.0121

• Smart Agriculture and Agricultural Machinery Equipment •

Operational Optimization of CCHP Systems on Deep Reinforcement Learning Under Influence of Demand Charge

Wenzhong GAO (高文忠), Yi ZHANG (张毅)

  1. Merchant Marine College, Shanghai Maritime University (上海海事大学商船学院), Shanghai 201306, China
  • Received: 2023-02-23 Accepted: 2023-04-06 Online: 2023-04-01 Published: 2023-06-26
  • Contact: Yi ZHANG
  • About the author: Wenzhong GAO, E-mail: Wzgao@shmtu.edu.cn
  • Funding:
    Science and Technology Commission of Shanghai Municipality project (18040501800)

Abstract:

Amid tightening global energy supply, combined cooling, heating and power (CCHP) systems are attracting growing attention because they use energy in cascade and achieve high primary energy utilization. However, the factors that affect operation are complex and variable, and the demand charge in particular makes it difficult for a CCHP system to run economically under existing control methods while still meeting user-side energy demand in real time. To minimize operating cost with the demand charge taken into account, a control strategy optimization method based on the twin delayed deep deterministic policy gradient (TD3) algorithm was proposed. Each device in the system was modelled, the operation optimization problem was formulated as a Markov decision process and solved with the TD3 algorithm, and the method was verified on a case study. The results showed that the TD3 agent that considered the demand charge balanced the demand charge against real-time operating cost well and generalized. Compared with the historical operating strategy and with a TD3 agent that ignored the demand charge, it reduced total operating cost by 41.5% and 8.6%, respectively. These results offer a new approach to reducing the cost and improving the economics of agricultural energy supply.

Key words: combined cooling, heating and power system; demand charge; deep reinforcement learning; twin delayed deep deterministic policy gradient
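
To make the trade-off between the demand charge and real-time operating cost concrete, the following is a minimal Python sketch of a two-part electricity bill. It is not the paper's cost model: the tariff structure (an energy charge on purchased kWh plus a demand charge on the peak grid power in the billing period), the function name `electricity_bill`, and all prices and load values are assumptions chosen for illustration only.

```python
from typing import Dict, Sequence


def electricity_bill(grid_power_kw: Sequence[float],
                     energy_price_per_kwh: Sequence[float],
                     demand_price_per_kw: float,
                     step_hours: float = 1.0) -> Dict[str, float]:
    """Split a billing-period bill into an energy part and a demand part.

    Assumed tariff (hypothetical, not taken from the paper):
      energy cost = sum over time steps of power * price * step length
      demand cost = peak grid power in the period * demand price
    """
    if len(grid_power_kw) != len(energy_price_per_kwh):
        raise ValueError("power and price series must have the same length")

    # Energy charge: pay for every kWh bought from the grid.
    energy_cost = sum(p * c * step_hours
                      for p, c in zip(grid_power_kw, energy_price_per_kwh))

    # Demand charge: pay once for the highest power drawn in the period.
    peak_kw = max(grid_power_kw, default=0.0)
    demand_cost = peak_kw * demand_price_per_kw

    return {
        "energy": energy_cost,
        "demand": demand_cost,
        "peak_kw": peak_kw,
        "total": energy_cost + demand_cost,
    }


# Two hypothetical grid-purchase profiles with the same energy (700 kWh).
price = [0.8, 0.8, 0.8, 0.8]               # assumed flat energy price per kWh
spiky = [100.0, 100.0, 400.0, 100.0]       # one short peak
smooth = [175.0, 175.0, 175.0, 175.0]      # same energy, peak shaved

bill_spiky = electricity_bill(spiky, price, demand_price_per_kw=40.0)
bill_smooth = electricity_bill(smooth, price, demand_price_per_kw=40.0)
print(bill_spiky["total"], bill_smooth["total"])
```

Under these assumed numbers both profiles pay the same energy charge, but the spiky profile pays a much larger demand charge because of its single peak. A controller that only minimizes per-step energy cost has no incentive to avoid such a peak, which is one way to read the abstract's comparison between the TD3 agents with and without the demand charge.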

CLC number: