Self-Learning Controllers in the Oil and Gas Industry
DOI: https://doi.org/10.52716/jprs.v11i1.427
Abstract
Recently, artificial intelligence has been widely applied to optimization-control problems in
petroleum exploration and production. This paper presents a state-of-the-art
reinforcement-learning algorithm for petroleum optimization-control problems, called direct
heuristic dynamic programming (DHDP). DHDP has two interacting artificial neural networks: the
critic network, which provides a critique/evaluation signal, and the actor network, which
provides a control signal. This paper focuses on a generic on-line learning control system
based on Markov decision process principles.
Furthermore, DHDP is a model-free learning design that does not require prior knowledge
of a dynamic model; therefore, DHDP can be applied directly to any petroleum equipment or
device without the need to derive a mathematical model. Moreover, DHDP learns by
itself (self-learning), without human intervention, through repeated interaction between the
equipment and the environment/process. The equipment receives the states of the
environment/process via sensors, and the algorithm maximizes the reward by selecting the
optimal action (control signal). A quadruple tank system (QTS) is taken as a benchmark
test problem, with a nonlinear model whose response is close to that of the real system, for
three reasons. First, the QTS is widely used in most petroleum exploration/production fields
(as an entire system or in part); it consists of four tanks and two electrical pumps with two
pressure control valves. Second, the QTS is difficult to control because it is stable only
within a limited range of operating parameters; therefore, if DHDP can control the QTS by
itself, DHDP can control other equipment in a fast and optimal manner. Third, the QTS is a
multi-input multi-output (MIMO) model designed for the analysis of real-time nonlinear dynamic
systems; therefore, the QTS model is similar to most MIMO devices in oil and gas fields. The
overall learning control system performance is tested and compared with a
proportional-integral-derivative (PID) controller in MATLAB. DHDP outperforms the PID
approach, with a 99.2466% improvement.
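
The actor-critic interaction described in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so the error terms below follow the commonly published DHDP formulation (critic error αJ(t) − [J(t−1) − r(t)], actor error J(t) − U_c) and are an assumption about what the paper uses. The scalar `plant`, the quadratic `reward`, the network sizes, and the values of `ALPHA` and `LR` are all hypothetical; the plant is a toy, not the QTS, and the sketch is written in Python rather than the MATLAB used in the paper.

```python
import math
import random

ALPHA = 0.95   # discount factor (assumed value)
LR = 0.01      # learning rate for both networks (assumed value)
U_C = 0.0      # desired ultimate objective: best attainable cost-to-go
N_HIDDEN = 6   # hidden units per network (assumed value)

def init_weights(seed=0):
    """Random initial weights for a one-hidden-layer actor and critic."""
    rng = random.Random(seed)
    layer = lambda: [rng.uniform(-1.0, 1.0) for _ in range(N_HIDDEN)]
    return {"a1": layer(),   # actor: state -> hidden
            "a2": layer(),   # actor: hidden -> action
            "cx": layer(),   # critic: state -> hidden
            "cu": layer(),   # critic: action -> hidden
            "c2": layer()}   # critic: hidden -> J

def actor(w, x):
    """Map the state x to a bounded control signal u in [-1, 1]."""
    h = [math.tanh(a * x) for a in w["a1"]]
    u = math.tanh(sum(a * hi for a, hi in zip(w["a2"], h)))
    return u, h

def critic(w, x, u):
    """Estimate the discounted future reward J from state and action."""
    h = [math.tanh(cx * x + cu * u) for cx, cu in zip(w["cx"], w["cu"])]
    return sum(c * hi for c, hi in zip(w["c2"], h)), h

def plant(x, u):
    """Hypothetical scalar plant, used only to close the loop."""
    return 0.8 * x + 0.2 * u

def reward(x):
    """Hypothetical reward: zero at the set point, negative elsewhere."""
    return -x * x

def critic_error(J, J_prev, r):
    """Temporal-difference style error that the critic minimizes."""
    return ALPHA * J - (J_prev - r)

def train(steps=100, x0=1.0, seed=0):
    w = init_weights(seed)
    x = x0
    u, _ = actor(w, x)
    J_prev, _ = critic(w, x, u)
    history = []
    for _ in range(steps):
        x = plant(x, u)              # environment responds to the last action
        r = reward(x)                # reinforcement signal observed via "sensors"
        u, ha = actor(w, x)          # actor proposes the next control signal
        J, hc = critic(w, x, u)      # critic evaluates state and action
        e_c = critic_error(J, J_prev, r)
        e_a = J - U_C                # actor drives J toward the objective
        # Derivative of J w.r.t. each critic hidden pre-activation, then w.r.t. u.
        dJ_dpre = [c * (1 - h * h) for c, h in zip(w["c2"], hc)]
        dJ_du = sum(d * cu for d, cu in zip(dJ_dpre, w["cu"]))
        k = e_a * dJ_du * (1 - u * u)    # actor error backpropagated through the critic
        for i in range(N_HIDDEN):
            # Gradients are computed from pre-update weights, then applied.
            g_c2 = ALPHA * e_c * hc[i]
            g_cx = ALPHA * e_c * dJ_dpre[i] * x
            g_cu = ALPHA * e_c * dJ_dpre[i] * u
            g_a2 = k * ha[i]
            g_a1 = k * w["a2"][i] * (1 - ha[i] ** 2) * x
            w["c2"][i] -= LR * g_c2
            w["cx"][i] -= LR * g_cx
            w["cu"][i] -= LR * g_cu
            w["a2"][i] -= LR * g_a2
            w["a1"][i] -= LR * g_a1
        J_prev = J
        history.append((x, u, J))
    return w, history
```

Calling `train(steps=100)` returns the final weights and a per-step history of `(x, u, J)`. The sketch only shows how the critic and actor updates are chained without a plant model inside the learner; it makes no claim about convergence or about the performance figures reported in the paper.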
License
Copyright (c) 2021 Journal of Petroleum Research and Studies
This work is licensed under a Creative Commons Attribution 4.0 International License.