A model-based machine learning to develop a PLC control system for Rumaila degassing stations

Abstract: Degassing station breakdowns can be dangerous to operator health and the environment. Programmable logic controllers (PLCs) are key modules of manufacturing control systems and are applied in complex oil and gas units to reduce manpower and unnecessary faults. However, feeding a PLC with data is difficult because it requires system log files, which record all events that occur in the oil fields and provide visibility into a given environment. Moreover, most critical chemical processing plants and oil distribution networks are visualized and inspected by Supervisory Control and Data Acquisition (SCADA) systems. These systems have focused on safety, and there are concerns that they could be targeted by terrorists worldwide. With internet-related attacks rising steadily, there are indications that our degassing stations may be similarly susceptible; for that reason, it is essential to secure PLC and SCADA systems against undesired incidents. Recently, machine learning (ML) has attracted increasing interest in industrial systems for detecting, identifying, and storing information. Therefore, we propose applying an advanced ML approach based on deep neural networks to the PLC system with the purpose of: 1) detecting anomalous or irregular PLC actions; 2) optimizing the operation of the systems and their facilities; 3) allowing the equipment to respond to changing and novel scenarios; 4)


Introduction
Industry generally uses automatic machines and automated operations for both discrete production and continuous processes. Machines were at first mechanically controlled; later they were electromechanically controlled (for example, by a relay or contactor), and nowadays they are controlled electrically or electronically by computers or control systems. A control system carries out a series of actions, keeps several variables constant, or tracks a number of specified changes [1]. The main function of the controller is to keep the performance of the field at the required value. For example, a conveyor belt controller may count objects and forward them into a packing case. Control inputs may come from switches being closed or opened, or from sensors that measure, for example, temperature or flow rate. The controller may be required to drive a motor that moves a load to a particular position, or to open or close a valve. Originally, the rules governing the control system were defined by the wiring. When these rules changed, the wiring had to be changed, which is time-consuming [2].
Choosing the controller and identifying its parameters are essential tasks in typical control approaches [3]. The controller parameters obtained cannot always provide the desired degree of system stability because of issues such as system modeling errors, disturbances, and variations in the controlled parameters. Most systems used in industry today exhibit nonlinear, time-delay behavior. Such systems show excessive overshoot and abnormal settling times, and are not stable. It is therefore very hard to model a controller using simple approaches. However, it can be modeled using different schemes if mathematical functions of the controller can represent the system behavior thoroughly [4] [5]. Nevertheless, it is quite challenging to obtain a practical mathematical model.

No.29-(12) 2020 Journal of Petroleum Research & Studies (JPRS)
To overcome this problem, we need a high-performance control system in which field wiring terminates on input/output (I/O) terminals and which workers can re-program, so that no wiring changes are required each time the machine process is changed. A Programmable Logic Controller (PLC) is a computer-type device that applies certain functions (for instance arithmetic, timing, or sequencing) to control different kinds of processes through digital or analog terminals [6] [7]. Initially, the PLC was used to replace relay logic, but it is now found in more complex applications that require high-reliability operation, particularly in oil and gas systems. The PLC is designed to handle numerous inputs and output sequences, with immunity to electrical noise and resistance to vibration [8]. The large number of I/O terminals allows a single PLC to control a large number of pumps or motors and to monitor any failure [9] [10].
Since processes are becoming more complex [11] and the functionality of machines is growing, the operator needs a powerful tool to control and monitor production systems. A Human Machine Interface (HMI) system is the interface between the human being (operator) and the process (machine) [12]. It lets operators quickly understand current operating conditions so that they can monitor and control the process successfully. In addition, an HMI can receive views from other applications and offers graphing and trending capabilities. Recently, PLCs have shown their ability to achieve significant improvements in real-time processes [13] [14], ranging from basic process control to complicated maintenance and data management applications. In [6], a new control module based on a PLC and fuzzy approaches was proposed for a hybrid wind power generation system to enhance its full-load power factor. Moreover, a production line controlled by a PLC-based solenoid valve is shown in [15] [16].
Alongside these standard control techniques, Artificial Neural Networks (ANNs) have begun to be applied in control because they are able to learn, to generalize, and to form arithmetic expressions [17] [18]. By biological analogy, a neural network tries to imitate the human brain's ability to learn from data and to simplify models. Neural networks have been designed to solve complex non-linear applications [19]. Engineering control frameworks, which are widely used in critical infrastructure, add safety and convenience to modern society. These frameworks have operated reliably for a long time; however, changing technological conditions are exposing them to hazards that they were not designed to handle [20]. In particular, their reliance on networking technologies that support remote access and control over the Internet greatly increases the possibility of threats. A typical approach when examining attacks on an industrial control system is to focus on the central server or SCADA system [21]. These servers normally run standard operating systems, which allows the use of standard forensic tools. However, physical field devices, for example remote terminal units (RTUs), usually depend on protected embedded hardware systems and consequently need specialized forensic methods [22].
Unfortunately, these methods are very limited in their functionality. Besides, PLCs, which interact with sensors and actuators, are significant elements of industrial systems.
Moreover, the current interconnectivity of PLCs and SCADA systems with shared networks and the Internet has notably increased the threats to critical infrastructure [23]. Accordingly, they are appealing targets for attackers. An important example is the Stuxnet malware, which affected the Siemens PLCs operating Iran's uranium hexafluoride centrifuges [24]. The malware reprogrammed the PLC data to produce failures and destruction while delivering faked data to the operators to hide the attacks [25]. Unlike conventional digital forensics, no standard rules, techniques, or tools are available for PLC forensics. A crucial challenge is the need for system logs for forensic investigations.
In this paper, we propose a novel model that allows the modelling of more complex oil and gas processes. We apply machine learning approaches to the logged data to detect strange or anomalous PLC processes and network activity. The proposed system is applied to the modern Siemens SIMATIC S7-1214 CPU. The main goal of this work is to design a controller capable of making the best possible decisions for the Rumaila degassing stations.
It is likewise feasible to observe breakdowns of sensors and actuators and failures in measurement, hence preventing the field from being compromised.
The rest of this paper is organized as follows: Section 2 explains the novel machine learning based methods we used. The proposed philosophy and experimental conditions are then shown in Section 3. Evaluation and discussion are presented in Section 4. Finally, Section 5 concludes the contributions of this paper.

The main task of any control system is to automatically adjust the output and keep it at the desired value. If the input is altered, the output must react and adjust to the new set value. If something disturbs the output without any change to the input, the output must return to the correct value. To control this output, the error between output and input must be computed. This paper proposes a new approach for controlling industrial processes by using a neural network. It aims to build a new control system that helps to reduce the error between the desired and actual signals, while reacting quickly to any change in the input and handling disturbances in the system. An artificial neural network (ANN) can supervise the process and handle its data, which makes it possible to combine a neural network with a PLC.
The ANN learns the industrial process and tests it whenever the PLC program is running, and it can act as a controller for some tasks inside the process. The ANN may stand between the PLC and the process and, when necessary, run the process from its learned knowledge [26].
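The feedback behaviour described above (track a new set value, recover after a disturbance, act on the error between desired and actual output) can be illustrated with a minimal sketch. It uses a plain proportional correction on a hypothetical first-order process, not the neural controller proposed in this paper; the function names, the gain, and the disturbance values are assumptions chosen only for illustration.

```python
def control_error(setpoint, measured):
    """Error between the desired value (input) and the actual output."""
    return setpoint - measured


def simulate(setpoint, steps=50, gain=0.5, disturbance_at=25, disturbance=-2.0):
    """Simulate a hypothetical process whose output moves by gain * error per step."""
    output = 0.0
    history = []
    for k in range(steps):
        if k == disturbance_at:
            # Something interrupts the output without any change to the input.
            output += disturbance
        error = control_error(setpoint, output)
        output += gain * error  # proportional correction toward the setpoint
        history.append(output)
    return history


history = simulate(10.0)
print(round(history[-1], 3))
```

With these assumed values the output settles at the set value, drops when the disturbance is applied, and then returns to the set value again.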
The algorithm is applied in a PLC for real-time operation. The system performance is evaluated on each of the input training and test sets. The system is a multi-layer recurrent neural network (RNN), which consists of input, hidden, and output layers. The hidden layer uses a nonlinear sigmoid function, whereas the output layer uses a linear function. The suggested recurrent model uses fully connected nodes with a feedback path from the hidden layer to the input layer.

Bidirectional long short-term memory (biLSTM) is investigated in this study as the main topology for the RNN. The principal idea of the biLSTM was first proposed in [27]. For a given input vector sequence $x = (x_1, \dots, x_T)$, a regular RNN based biLSTM calculates the hidden state vector sequence $h = (h_1, \dots, h_T)$ and the output vector sequence $y = (y_1, \dots, y_T)$.

More specifically, the biLSTM splits the state neurons into a forward state sequence $\overrightarrow{h}$ and a backward state sequence $\overleftarrow{h}$, which means that the outputs of the forward and backward states are not connected to each other. This can be viewed in Figure (2). Alternatively, the process of the biLSTM can be expressed here as:

$$\overrightarrow{h}_t = \mathcal{H}\left(W_{x\overrightarrow{h}}\, x_t + W_{\overrightarrow{h}\overrightarrow{h}}\, \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}}\right)$$

$$\overleftarrow{h}_t = \mathcal{H}\left(W_{x\overleftarrow{h}}\, x_t + W_{\overleftarrow{h}\overleftarrow{h}}\, \overleftarrow{h}_{t+1} + b_{\overleftarrow{h}}\right)$$

$$y_t = W_{\overrightarrow{h}y}\, \overrightarrow{h}_t + W_{\overleftarrow{h}y}\, \overleftarrow{h}_t + b_y$$

where $W$ is the weight matrix between two layers, $b$ is the bias vector, and $\mathcal{H}$ represents the activation function [28].

[...] writes violence [29]. In order to discover these types of anomalous actions, we proceed as follows. We first capture the relevant values of the memory addresses used by the PLC system under normal conditions. Then, the captured values are used to train a model of the desired PLC behavior using RNN based machine learning. After that, the trained model can be used to verify whether PLC events belong to regular operation or are anomalous.
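To make the forward and backward state sequences described in this section concrete, the following is a minimal sketch of a bidirectional recurrent pass. It is a simplification under stated assumptions: each direction is a single sigmoid unit with scalar weights rather than a full LSTM cell with gates, and the function names and parameter tuples are hypothetical. It keeps the two properties stated in the text: the forward and backward states are computed independently, and the output layer is linear.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recurrent_pass(xs, w_in, w_rec, bias):
    """Run a one-unit sigmoid recurrent layer over xs in the order given."""
    h = 0.0
    states = []
    for x in xs:
        h = sigmoid(w_in * x + w_rec * h + bias)
        states.append(h)
    return states


def bidirectional_outputs(xs, fwd, bwd, out):
    """fwd and bwd are (w_in, w_rec, bias); out is (w_fwd, w_bwd, bias)."""
    h_fwd = recurrent_pass(xs, *fwd)
    # The backward direction reads the sequence from the last step to the first;
    # its states are reversed again so that index t matches the forward states.
    h_bwd = recurrent_pass(xs[::-1], *bwd)[::-1]
    w_f, w_b, b_y = out
    # Linear output layer combining both directions at each time step.
    return [w_f * hf + w_b * hb + b_y for hf, hb in zip(h_fwd, h_bwd)]


ys = bidirectional_outputs([0.1, 0.5, 0.9], (1.0, 0.5, 0.0), (1.0, 0.5, 0.0), (1.0, 1.0, 0.0))
print(len(ys))
```

Because the two directions share no state, each output sees context from both earlier and later samples of the logged sequence.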

HMI monitoring system
The Human Machine Interface (HMI) in this paper is a SIMATIC KTP600 Basic color panel with a 5.7-inch TFT-LCD (thin-film-transistor liquid-crystal display) offering 256 colors, one Ethernet interface (TCP/IP) or one RS 485/422 interface, and a touch screen with six physical function keys. It gives a straightforward and accessible illustration of process values [12]. The KTP600 provides the main HMI functionality (alarms, trend curves, recipes) with 500 tags.
The HMI structure in this work is implemented with the industrial software SIMATIC WinCC, which is an element of STEP 7 v10.5. WinCC is the process visualization software we use to handle all necessary configuration tasks. If a critical event happens in the process, an alarm is activated automatically; for instance, when a measured value exceeds its predefined trip point. Machine learning on any system requires the ability to record the measured data from the inputs and to execute complex computations for modeling purposes. The HMI allows a single operator to detect and control otherwise complicated processes. In this work, the HMI has been extended with an event logging capability and a visualization of the process flow, as shown in Figure (4). This logging capability is used for RNN creation. Thus, the HMI allows a single operator to monitor thousands of control systems.
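The alarm and event logging behaviour described above can be sketched in a few lines. This is not WinCC configuration and does not use any Siemens API; the tag names, trip points, and log record layout are hypothetical and serve only to show the idea of raising an alarm when a trip point is exceeded while logging every reading for later model training.

```python
from datetime import datetime, timezone

# Hypothetical tag names and trip points, chosen only for illustration.
TRIP_POINTS = {"separator_pressure_bar": 12.0, "pump_temperature_c": 85.0}


def check_alarms(readings, trip_points=TRIP_POINTS, log=None):
    """Append one event per reading to the log and return the tags in alarm."""
    log = [] if log is None else log
    alarms = []
    for tag, value in readings.items():
        limit = trip_points.get(tag)
        in_alarm = limit is not None and value > limit
        log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "tag": tag,
            "value": value,
            "alarm": in_alarm,
        })
        if in_alarm:
            alarms.append(tag)
    return alarms, log


alarms, log = check_alarms({"separator_pressure_bar": 13.5, "pump_temperature_c": 60.0})
print(alarms)
```

Logging normal readings as well as alarms matters here, because the recorded normal behaviour is what the recurrent model is later trained on.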

Experimental settings:
Since deep learning is defined as a sequence of layers, we build a sequential model and add layers one at a time until we are satisfied with our network topology. In our paper, the number of inputs equals the number of instruments in the field, plus one final input that indicates whether the packet is normal or malicious.
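As a minimal sketch of this input layout, the helper below builds one input row from a set of instrument readings and appends the normal/malicious flag as the last element. The instrument names are hypothetical, since the paper does not list the station's instruments, and the 0/1 encoding of the flag is an assumption.

```python
# Hypothetical instrument tags; the real station's instrument list is not given in the paper.
INSTRUMENTS = ["inlet_pressure", "separator_level", "outlet_flow", "pump_temperature"]


def build_input_row(readings, is_malicious):
    """Return instrument values in a fixed order, followed by the packet flag."""
    missing = [tag for tag in INSTRUMENTS if tag not in readings]
    if missing:
        raise ValueError(f"missing instrument readings: {missing}")
    row = [float(readings[tag]) for tag in INSTRUMENTS]
    # Last input: 1.0 marks a malicious packet, 0.0 a normal one (assumed encoding).
    row.append(1.0 if is_malicious else 0.0)
    return row


row = build_input_row(
    {"inlet_pressure": 7.2, "separator_level": 0.55, "outlet_flow": 130.0, "pump_temperature": 61.0},
    is_malicious=False,
)
print(len(row))
```

Keeping the instruments in a fixed order ensures that each network input always corresponds to the same field device.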
The circuit, programmed as a PLC function block diagram (FBD), is designed to provide a switching selection for running, for example, some mechanical pumps, as shown in Figure (4). An HMI configuration is designed to match the FBD program so that it can be clearly recognized by the operator sitting in the control room, who administers and manages all actions remotely. In the SCADA/HMI, a visualization is prepared to model the process so that the ongoing activity is easy to understand when monitored on the screen. With the development of touch screen technology, input data can be fed to the PLC from the screen, and any modifications to the data can likewise be made online. Correspondingly, they are shown in the HMI. The experiment was executed on an NVidia Titan X GPU. The total time for learning (training + testing) was 2 hours.

Once the proposed system has been trained, it can be used for control and estimation. We can make predictions on test or validation data to approximate the system behavior. The RNN topology and the final set of weights are all that we need to keep from the network. Predictions are made by feeding the input to the system in a sequence-to-sequence manner and letting it produce an output that we can use as a prediction. The neural network training settings are shown in Figure (5).

Results and discussion:
When the model is ready, we can estimate its performance on the training and test datasets. Then, we can generate predictions to get a visual indication of the ability of the model when processing samples. Once the training finishes, we can compare the generated reference model output with the reference model input, as seen in Figure (6). Here, we examine the effect of the number of sampled data points on the performance of our method. As expected, the constructed model shows excellent predictability at the end of the simulation. Therefore, the training data is acceptable.
To measure the PLC accuracy on the newly accepted training data, four different connectivity metrics were used to discover modeling errors and to better understand a [...] Figure (8) shows the output from the PLC neural network for the training, validation, and testing data sets.
It can be seen that the output of our proposed model matches the optimal output of the plant with an accuracy of almost 0.99987 and disregards all the fake data. Similarly, we tested our model with real data, and the result was also very close to reality, as viewed in [...]. To sum up, the performance of the modeling technique is assessed with training and test sets.
Useful results have been achieved in Figures (7-9), in which the proposed system demonstrates its ability to work accurately. Most importantly, to test our proposed system on malicious packets, we trained on some anomalous data with different IP addresses, ports, and values, and injected it into the PLC system. The machine learning model detects all normal data with a high accuracy of 99.9% (please see Figure (8)). Hence, this indicates that our biLSTM based PLC model is beneficial in the Rumaila oil field.
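The detection test above can be illustrated with a small sketch. The paper does not state the decision rule used to separate normal from anomalous data, so the rule below is an assumption: a sample is flagged when the observed value deviates from the model's prediction by more than a chosen tolerance. The function names, sample values, and tolerance are hypothetical and are not the paper's data.

```python
def flag_anomalies(predicted, observed, tolerance):
    """Flag a sample as anomalous when it deviates from the model's prediction."""
    return [abs(p - o) > tolerance for p, o in zip(predicted, observed)]


def accuracy(flags, labels):
    """Fraction of samples whose flag matches the true label (True = malicious)."""
    if len(flags) != len(labels):
        raise ValueError("flags and labels must have the same length")
    correct = sum(f == l for f, l in zip(flags, labels))
    return correct / len(labels)


# Illustrative values only.
predicted = [1.0, 1.0, 1.0, 1.0]
observed = [1.02, 0.99, 3.5, 1.01]
labels = [False, False, True, False]

flags = flag_anomalies(predicted, observed, tolerance=0.1)
print(flags, accuracy(flags, labels))
```

The tolerance trades missed attacks against false alarms, so in practice it would be chosen from the spread of prediction errors on normal data.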

Conclusions:
The programmable logic controller is a manufacturing platform for developing systems and applying advanced control in real time. This paper proposed a method for using a machine learning based recurrent neural network in a PLC at the Rumaila degassing stations. Our proposed biLSTM based neural network is a successful tool for implementing, modeling, and optimizing a discrete event system controller. The proposed model confirms its ability to perform correctly even in the case of some sensor faults. This technique might be extremely beneficial in reducing the difficulty of PLC diagnosis as a consequence of the RNN model. The results have shown that the neuro-controller (PLC-RNN) model is an attractive option. Thus, by means of the proposed system, the station becomes more secure, more reliable, and more efficient, with lower production costs and higher quality.
For future work, the authors plan to test this model on larger systems with numerous input transmitters, including pressure, temperature, and vibration sensors.