Adaptive Cruise Control for Heavy Vehicles - Hybrid Control and MPC
Swedish title: Adaptiv farthållning för tunga fordon - hybrid reglering och MPC
Thesis work (examensarbete) carried out in Automatic Control (Reglerteknik) at Linköping Institute of Technology (Tekniska Högskolan i Linköping)
Authors: Daniel Axehill and Johan Sjöberg
Reg nr: LiTH-ISY-EX-3416-2003
ISRN: LITH-ISY-EX-3416-2003
Department of Electrical Engineering (Institutionen för systemteknik), Linköping University, S-581 83 Linköping, Sweden
Supervisors: PhD Michael Blackenfelt (Scania), Lic. Eng. Kristian Lindqvist (Scania), Lic. Eng. Johan Löfberg (LiTH)
Examiner: Prof. Torkel Glad
Linköping, 2003-02-24 (library registration date 2003-02-13)
Language: English
URL for electronic version: http://www.ep.liu.se/exjobb/isy/2003/3416/

Abstract
An Adaptive Cruise Controller (ACC) is an extension of an ordinary cruise controller. In addition to maintaining a desired set velocity, an ACC can also maintain a desired time gap to the vehicle ahead. To this end, both the engine and the brakes are controlled. The purpose of this thesis has been to develop control strategies for an ACC used in heavy vehicles. The focus of the work has been on the methods used for switching between the use of engine and brake.
Two different methods have been studied, a hybrid controller and an MPC-controller. For the hybrid controller, the main contribution has been to use the influence of the surroundings on the acceleration of the truck. This influence consists of several parts such as wind drag, road slope and rolling resistance. The estimated influence of the surroundings is used as a switch point between the use of engine and brakes. Ideally, these switch points give bumpless actuator switches. The interest in the MPC-controller as an alternative solution was to achieve automatic actuator switching, that is, with no explicitly defined switch points. The MPC-controller is based on a model of the system including bounds on the control signals. Using this knowledge, the MPC-controller will choose the correct actuator for the current driving situation.
Results from simulations show that both methods solve the actuator switch problem. The advantages of the hybrid controller are that it is implementable in a truck with the hardware used today and that it is relatively simple to parameterise. A drawback is that explicit switch points between the uses of the different actuators have to be included. The advantages of the MPC-controller are that no explicit switch points have to be introduced and that constraints and time delays on signals in the system can be handled in a simple way. Among the drawbacks is that the variant of MPC used in this thesis is too complex to implement in the control system currently used in trucks. A further important drawback is that MPC requires a mathematical model of the system.
Keywords: hybrid control, MPC, ACC, adaptive cruise control, truck, actuator switching, switch strategies, state machine

Sammanfattning (Swedish summary, in English translation)
An adaptive cruise controller (ACC) is an extension of an ordinary cruise controller. Besides holding a set speed, an ACC can also keep a desired time gap to the vehicle ahead. To meet these requirements, both the engine and the brakes are controlled. The purpose of this thesis has been to develop control strategies for an ACC used in heavy vehicles. The focus of the work has been to develop methods that handle the switch between throttle and brake.
Two different methods have been studied, a hybrid controller and an MPC-controller. The main contribution of the hybrid controller has been to use the influence of the surroundings on the acceleration of the truck. This influence consists of several components such as air drag, road slope and rolling resistance. The estimated influence of the surroundings is used as the switch point between the use of throttle and brake. Ideally, this switch point gives bumpless actuator switches. The interest in the MPC-controller as an alternative solution was that the switch between throttle and brake could be handled automatically, without any explicitly defined switch points. This is possible because the MPC-controller has access to a model of the vehicle and to the intervals within which the control signals must lie.
Results from simulations show that both methods solve the actuator switching problem. Advantages of the hybrid controller are that it can be implemented in a truck with today's hardware and that it is relatively simple to parameterise. A drawback is that explicit switch points between the uses of the different actuators have to be introduced. Advantages of MPC are that no explicit switch points have to be introduced and that constraints and time delays on signals in the system can be handled in a simple way. Among the drawbacks is that the variant of MPC used in this thesis is too computationally demanding to be implemented in the control system used in trucks today. A further important drawback is that MPC requires a mathematical model of the system.
Acknowledgements
We would like to express our gratitude to our supervisors at Scania in Södertälje, Kristian Lindqvist and Michael Blackenfelt, for excellent guidance and many inspiring discussions. We would also like to thank our supervisor Johan Löfberg from Linköping University for his good advice and help throughout the work. Furthermore, the group RESC at Scania deserves our best thanks for making our time at Scania a great finale to our undergraduate studies. During our time at the company, many people not explicitly mentioned above have also supported us. A thousand thanks to you all!
Last, but not least, warm thanks go to our families for their love and support throughout the university studies.
Linköping, February 2003
Daniel Axehill and Johan Sjöberg

Contents
1 Introduction
1.1 Background
1.2 Purpose
1.3 Method
1.4 Introduction to the ACC Problem
1.4.1 Problem Description
1.4.2 ISO Standard Performance Limits
2 System Description
2.1 Vehicle
2.2 Sensors
2.2.1 Distance and Velocity Sensor
2.2.2 Yaw-rate Sensor
2.3 Engine
2.4 Wheel Brakes
2.5 Retarder
[...]
3.3.1 Linearisation
3.3.2 Conversion of Time Continuous Models to Time Discrete Models
3.3.3 Extended Kalman Estimator
3.4 Model Predictive Control – MPC
3.4.1 MPC Basics
3.4.2 Extending the Framework to Handle Measurable Disturbances
3.4.3 More about Constraints
3.4.4 Using a Model Linearised around a Non-Equilibrium Point
3.4.5 Choosing the Type of Control Signal in the Prediction
3.4.6 Soft Constraints
3.4.7 Handling of Time Delays
3.4.8 Dynamic State Weights
3.4.9 Terminal State Weight
3.4.10 Using MPC through an Already Existing Controller
3.4.11 Avoiding Simultaneous Use of Control Signals
4 System Modelling
4.1 Truck
4.1.1 Vehicle
4.1.2 Engine
4.1.3 Wheel Brakes
4.1.4 Retarder
4.2 Driving Environment
4.2.1 Road Description
4.2.2 Traffic Situation
5 State Machine Combined with Traditional Control
5.1 Introduction
5.2 Estimations and Calculations
5.2.1 The Desired Distance to the Lead Vehicle
5.2.2 The Acceleration From the Surroundings, $a_{environment}$
5.2.3 Curve Detection
[...]
5.5.1 No Car Ahead
5.5.2 Follow Mode
5.5.3 Urgent Operation
5.5.4 Lost Target
6 State Machine Combined with MPC
6.1 Introduction
6.2 Linearisation of the System Model
6.3 Conversion of the Time Continuous System Model to a Time Discrete Model
6.4 Introducing Integral Action
6.5 State Estimation
6.6 State Machine
6.6.1 Introduction
6.6.2 States
6.6.3 Other Logic
[...]
8.1.1 Comments to simulation results
8.2 Description of Scenario 2
8.2.1 Comments to simulation results
9 Conclusions
10 Propositions to Future Work
10.1 Explicit MPC
10.2 Interior Point Solver
10.3 More Advanced Reference Signal Pre-filtering
11 Bibliography
Appendix A
A.1 Hybrid Controller Plot
A.2 MPC-controller Plot
A.3 Robustness Test
A.4 Robustness Test
A.5 Robustness Test
A.6 Robustness Test
A.7 Hybrid Controller Plot: Cut-in Situation
A.8 MPC-controller Plot: Cut-in Situation
Appendix B
B.1 Relinearisation
B.2 Number of Slack Variables and Their Respective Penalty
B.3 Prediction Horizon and Control Horizon

1 Introduction
1.1 Background
The background to this thesis is the modern driver's growing demand for comfort and for increased driving assistance. With the introduction of the ordinary cruise controller, a new aid for relaxed and comfortable driving became available to the driver. As long as the road is of a high standard and the traffic flows freely, it does its job quite well. However, anybody who has driven a car equipped with a cruise controller knows that it also has its limitations. The most important one is that it does not know anything about the surroundings, which makes it impossible to adjust the speed of the vehicle to the actual driving situation.
As sensors have become smaller, cheaper and more reliable, it is now possible to provide the ordinary cruise controller with the necessary information about the surroundings. The most desirable variable to control is the distance to the vehicle ahead. The extended controller with this property is called an Adaptive Cruise Controller (ACC). To control the distance, both the engine and the brakes are used. In the truck application, not only the engine brake and the wheel brakes are used, but also a so-called retarder.
The retarder is a hydraulic auxiliary brake, which is used to relieve the wheel brakes during gentler braking.
1.2 Purpose
The purpose of this thesis is to design control strategies that make it possible to use ACC in heavy trucks. The focus of the work is to develop strategies for handling the switching between the use of the engine, the wheel brakes and the retarder. Two methods are studied. The first method is to use a so-called hybrid controller, and the objective is to develop a state machine and the associated control strategies. Since this has already partly been done at Scania, our focus is to find and test new ideas. The second method is to use an advanced control strategy called Model Predictive Control (MPC). The purpose is to find out whether an MPC-controller can switch between engine and brakes without any explicitly defined switch points. Furthermore, it is interesting to find out whether the state machine can be reduced compared to the hybrid solution.
1.3 Method
First, old reports, mainly from Scania, were examined for ideas and important basic knowledge. Then the existing Simulink truck model was refined and extended to fulfil our needs. To be able to simulate realistic driving situations, road and traffic scenarios have been modelled. All sensors used by the ACC-system have also been included in the vehicle model. The next part of the work was to draw up the guidelines for the ACC logic. These were implemented in the Simulink toolbox Stateflow. In the MPC-part of the thesis, the controller was written as a Matlab S-function. In this part, the state machine was also written as an S-function, and Stateflow was thus not used at all. All results in this thesis originate from simulations in Matlab and Simulink, i.e. no practical tests have been performed.
1.4 Introduction to the ACC Problem
1.4.1 Problem Description
To introduce the reader to the problem, the main variables in the problem first need to be defined.
These variables, and their explanation, are given in table 1.1. The problem is also visualised in figure 1.1.
Figure 1.1: Definition of variables in the ACC-problem. The figure shows the ACC-equipped truck (velocity $v$), the lead vehicle (velocity $v_{lead}$), the distance $dist$ and the reference distance $dist_{ref}$.
Table 1.1: Variable description
Variable        Physical description
$v$             ACC-equipped truck velocity.
$dist$          Distance to the lead vehicle.
$v_{rel}$       Relative velocity between the truck and the lead vehicle.
$a_{rel}$       Relative acceleration between the truck and the lead vehicle.
$\dot{\psi}$    Yaw-rate.
From these measurements the lead vehicle velocity can be calculated as
$$v_{lead} = v + v_{rel} \qquad (1.1)$$
The control objective in the ACC-problem can be described as controlling $v$ in such a way that $dist$ equals $dist_{ref}$. This also implies that $v$ should equal $v_{lead}$ under static conditions. The reference distance $dist_{ref}$ is calculated from the desired time gap $t_{hw}$, which is chosen by the truck driver. The control objective is supposed to be met in the presence of external disturbances such as lead vehicles, road slope, wind drag, rolling resistance and curves. In addition, some constraints need to be met. First, the truck velocity is limited to the one set by the driver. This upper limit is in this thesis often referred to as the ACC set speed, $v_{ref,ACC}$. Second, there are some formal limits defined by the ISO-standard [20].
1.4.2 ISO Standard Performance Limits
Some limits on the system performance are given by the ISO-standard for an ACC-system [20]. In this thesis, only the constraints regarding the acceleration are of interest. The important constraints regarding the acceleration when the ACC-system is enabled are
• The maximum mean retardation shall be limited to 3.0 m/s² (average over 2 s).
• The maximum mean retardation rate shall not exceed 2.5 m/s³ (average over 1 s).
• The maximum acceleration shall not exceed 2.0 m/s².

2 System Description
2.1 Vehicle
The adaptive cruise controller is designed for a Scania 164 truck (see figure 2.1).
Unless otherwise stated, the truck mass is assumed to be 25000 kg in this thesis. In addition to the standard equipment, the truck is assumed to have an electronically controlled brake system (EBS) and a so-called retarder. The retarder is an auxiliary brake system intended for use in non-critical situations. More information about the retarder can be found in section 2.5. Other non-standard equipment assumed to be present on the truck is the combined distance and velocity sensor and the yaw-rate sensor. These are further described in sections 2.2.1 and 2.2.2 respectively.
Figure 2.1: The Scania 164 truck. In the number 164, 16 stands for a 16-litre engine and 4 stands for the truck generation (4 is the latest generation).
From the perspective of an adaptive cruise controller, a truck differs from an ordinary car in two important ways: the mass of the vehicle and the number of braking systems. The large mass affects the control design in two ways. First, it is more important to keep a longer distance to a vehicle in front of the truck. This gives extra time for smooth braking and allows the control system to handle more situations without involving the driver. It makes it possible to brake the truck in time to avoid a collision, and it also reduces excessive braking that can fade the brakes and waste fuel. Second, the large mass stores a great amount of kinetic energy. This energy originates, in most cases, from the fuel, which means that it is important to use it in the best possible way. For example, instead of keeping the set cruise speed when driving downhill, it could be desirable to increase the speed a little. The energy stored in this extra velocity can then be used when the road flattens out and the wind drag and the rolling resistance are again the dominating external forces in steady state.
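To give a feel for the amount of energy involved, the following minimal sketch evaluates the kinetic energy $E = \frac{1}{2} m v^2$ for the 25000 kg truck. The two speeds (85 km/h and 90 km/h) and the function name are illustrative assumptions and do not come from the thesis.

```python
def kinetic_energy_j(mass_kg: float, speed_kmh: float) -> float:
    """Kinetic energy E = 0.5 * m * v^2 in joules, with the speed given in km/h."""
    speed_ms = speed_kmh / 3.6  # convert km/h to m/s
    return 0.5 * mass_kg * speed_ms ** 2


TRUCK_MASS_KG = 25000.0  # truck mass assumed in the thesis

# Hypothetical speeds, chosen only for illustration.
energy_set_speed = kinetic_energy_j(TRUCK_MASS_KG, 85.0)
energy_downhill = kinetic_energy_j(TRUCK_MASS_KG, 90.0)
extra_energy = energy_downhill - energy_set_speed

print(f"Energy at 85 km/h: {energy_set_speed / 1e6:.2f} MJ")
print(f"Energy at 90 km/h: {energy_downhill / 1e6:.2f} MJ")
print(f"Extra energy stored by the 5 km/h increase: {extra_energy / 1e6:.2f} MJ")
```

Because the energy grows with the square of the velocity, even a small speed increase at highway speed stores a noticeable amount of energy that can later be used instead of fuel.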
The other property that makes the control design harder for a truck is that it has several kinds of braking systems. The truck used in this thesis has four. The first way to apply a negative force to the truck is to stop the injection of fuel to the cylinders. This kind of braking is called engine braking, because the engine friction and pump losses are used. On a gasoline engine, the engine brake is normally quite effective. This is highly dependent on the vacuum that builds up in the intake manifold when the throttle is at idle position. When this is the case, the pump losses are significant and the engine is braking. On a diesel engine there is no throttle plate that controls the airflow in the intake manifold. This makes the braking effect when the throttle is released very small compared to a spark-ignited engine. To compensate for this drawback, it is common to add a second kind of brake, a so-called exhaust brake, in the exhaust pipe. This device is a throttle valve that can create a high exhaust pressure that brakes the engine. The third brake system is the so-called retarder. This is a hydraulic brake and is further described in section 2.5. Finally, a truck is equipped with a pneumatic so-called foundation brake. This brake system is described more thoroughly in section 2.4.
2.2 Sensors
The truck contains several different types of sensors already in the standard configuration. These are used, for example, in the engine control system, the electronic stability system (ESP) and the anti-lock braking system (ABS). The ACC system needs some extra sensors to measure quantities describing the surrounding driving situation, such as the distance to the vehicle ahead and the curvature of the road. One possibility is to measure the curvature of the road using the already existing yaw-rate sensor included in the ESP.
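How a yaw-rate measurement translates into road curvature is not spelled out at this point. A minimal sketch, assuming steady cornering without side slip so that the yaw-rate equals the velocity divided by the curve radius, could look as follows. The function name, the low-speed cut-off and the example values are hypothetical.

```python
import math


def road_curvature(yaw_rate_rad_s: float, speed_m_s: float,
                   min_speed_m_s: float = 1.0) -> float:
    """Estimate the road curvature 1/R [1/m] as yaw-rate divided by speed.

    Assumes steady cornering without side slip. Below min_speed_m_s the
    ratio becomes unreliable, so 0.0 is returned (an assumption of this sketch).
    """
    if speed_m_s < min_speed_m_s:
        return 0.0
    return yaw_rate_rad_s / speed_m_s


# Hypothetical example: 2 deg/s yaw-rate at 20 m/s (72 km/h).
curvature = road_curvature(math.radians(2.0), 20.0)
radius_m = 1.0 / curvature
print(f"Curvature: {curvature:.5f} 1/m, radius: {radius_m:.0f} m")
```

In practice the raw yaw-rate signal would probably need filtering before the division, since sensor noise is amplified at low speed.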
2.2.1 Distance and Velocity Sensor
When an ACC system is implemented, there are several alternative sensors available to measure the distance between the ACC vehicle and the vehicle ahead. The two most relevant sensor types are based on laser or radar technology. Differences in performance between these two types occur when the weather is bad. The performance of the laser sensor is highly degraded by dirt, snow or mud on the back of the target vehicle. The same problem also arises if the lens of the laser sensor is dirty [14]. The radar sensor is not affected by bad weather or by plastic components that hide the sensor at the front. This makes it possible to mount a radar sensor behind a protective plastic cover integrated in the bumper [14]. Finally, the laser can run into problems in intense sunlight [14].
The radar sensor and the laser sensor share some basic physical limitations. The two most important are that they may lose the target when the curvature of the road is too high or when the slope changes too fast. These problems occur because of the limited field of view of the sensors and the fact that they cannot see through the ground.
In this thesis, the distance sensor is assumed to report the target closest to the truck in its predicted driving lane. Further, all signal processing of the raw data is assumed to be performed in the distance sensor. The output signals from the unit are the distance to the lead vehicle, the relative velocity and the relative acceleration. All these signals are quantised.
2.2.2 Yaw-rate Sensor
Yaw-rate is a measure of how fast the truck rotates around its vertical axis. Normally, it is given in degrees per second or radians per second. A yaw-rate sensor usually measures the yaw-rate by using a small double-ended tuning fork. The tuning fork is made of quartz, which makes it possible to make it oscillate electronically.
When the oscillating tuning fork is rotated around its axis of symmetry, the Coriolis force generates a small torque that is proportional to the yaw-rate. The torque is then converted to a DC-voltage that can be measured.
The yaw-rate sensor is, for example, used to provide the radar unit with truck yaw-rate data. The radar uses this data when it tries to decide whether vehicles ahead are in the same driving lane as the ACC-equipped truck or not. The sensor data can also be used in the ACC control unit for other purposes. For example, it can be used to adjust the control strategy when a curve is detected (see section 5.2). Another possible use of this information is to lower the set speed of the vehicle if the curve is too sharp in relation to the velocity of the truck. This feature makes the ride safer and more comfortable.
2.3 Engine
The engine used in this thesis is Scania's second strongest, which is a 16-litre turbocharged V8 diesel engine (see figure 2.2). It delivers a maximum torque of 2300 Nm at 1100-1300 rpm and a maximum power of 480 hp at 1900 rpm [15].
Figure 2.2: Scania's second strongest engine, a 16-litre turbocharged V8 diesel engine.
In this thesis, the engine is controlled either through the cruise controller or by injecting a desired amount of fuel per stroke, $m_{fi,desired}$. The main reason not to control $m_{fi,desired}$ directly is that it is laborious to find the parameters for each engine and truck configuration. By using the cruise controller, the parameters already found are reused. However, in the MPC-part of this thesis the engine is controlled directly by sending the amount of fuel to be injected. This choice is made mainly because it is very difficult to use an MPC-controller through another controller if that controller contains non-linearities. The cruise controller has several non-linear features and some logic that are hard to transfer to the model in the MPC-controller.
Even if a proper model could be found, it would be a hard job to adjust the model whenever the cruise controller logic is changed.
2.4 Wheel Brakes
The wheel brakes are the main brakes and are optimised to handle hard braking. They are designed as disc brakes, which can be seen in figure 2.3.
Figure 2.3: Disc brake
The brakes are driven by compressed air that pushes the brake linings towards the discs. The applied air pressure is calculated by the Electronic Brake System (EBS). The previously used all-pneumatic system, where the foot pedal directly affects a valve which gives a brake pressure, is retained as a backup because of legal requirements.
The EBS consists of several units. One unit is the foot brake module (FBM). It senses how hard the driver pushes the brake pedal and sends the request to the main Electronic Control Unit for the brakes (ECUbrake). In the ECUbrake, the requested brake pressure for each wheel is calculated based on the axle load, the wheel speeds etc. The calculated pressures are sent to the Pressure Control Modules (PCM) that are located near each wheel. These units are electro-pneumatic actuators where each unit has an integrated ECU and some sensors. One of these sensors is the pressure sensor, which is used to control the brake pressure in a closed loop. Another sensor is the wheel speed sensor, which sends data to the ECUbrake. The reason is that the Anti-lock Braking System (ABS) is integrated in the EBS. Hence, it is the ECUbrake that checks whether the angular velocity of the wheels, $\omega$, corresponds to the velocity of the truck, $v$. This is done by calculating the wheel slip, $\sigma$, for each wheel according to the expression below
$$\sigma = \frac{v - \omega \cdot r}{v} \qquad (2.1)$$
where
$v$ – velocity of the truck
$\omega$ – angular velocity of each wheel, measured by the respective PCM
$r$ – wheel radius
A large slip for a wheel, i.e. a slip close to 1, indicates that the wheel has lost its grip.
The brake pressure for that wheel must then be decreased to regain the grip. In that case the ECUbrake tells the PCM to perform the pressure decrease.

In this thesis, an external controller will control the EBS. The control signal sent to the EBS is a retardation request. This request is then carried out if possible. The appropriate brake pressure for each wheel is calculated in the ECUbrake by a conversion from the requested retardation using a conversion factor, κ. This factor depends on, for example, the axle load, and is retrieved from a table. The conversion can be expressed as

$$
P_{brake} = \kappa(\text{axle load}, \text{slip}, \text{etc.}) \cdot r_{desired} \tag{2.2}
$$

where
$P_{brake}$ – air pressure to the brakes
κ – conversion factor
$r_{desired}$ – desired retardation

After the braking manoeuvre has finished, the measured retardation is compared to the desired one, and if a difference is detected the κ value is corrected [12]. A small exception from this rule is made if the difference becomes large during braking. In that case, κ is corrected instantly.

When controlling the EBS, a phenomenon called glazing has to be taken into consideration. Glazing means that a coating is formed on the discs because of too much gentle braking. This will decrease the braking capacity permanently. To avoid gentle braking, the retarder should be used instead of the EBS when a small retardation is needed.

2.5 Retarder

The retarder is an auxiliary brake, which is attached to the propeller shaft. A picture of a retarder can be seen in figure 2.4.

Figure 2.4: Retarder

The retarder uses viscous damping to produce a braking torque. It is, for example, used when driving downhill to keep the velocity of the truck constant, because if the wheel brakes were used, they might become overheated. In order to avoid glazing, the retarder is also used when small retardation is requested (see section 2.4).
The retarder consists of two turbine shovels, called the rotor and the stator, and around these there is a circular cavity called the torus. The rotor is connected to the propeller shaft via a gear that multiplies the propeller shaft speed by two, and the stator is connected to the chassis of the truck. To produce braking torque, the torus is filled with oil by an oil pump, which is also connected to the propeller shaft.

Each time the retarder has been inactive for a while, a rather long delay occurs before the torque reaches the specified value, especially if the propeller shaft speed is low. The reason is the saturation in oil flow from the pump. In order to decrease the delay, an oil accumulator is filled when the maximum capacity of the pump is not used. When the torus is to be filled, oil is taken from both the oil accumulator and the pump.

To control the braking torque from the retarder, the oil pressure in the torus has to be controlled. The oil pressure increases if the oil flow to the torus is increased. For that reason, a piston controls how much of the oil flow goes back to the oil tank and how much goes to the torus. The oil flow produces a force at the top of the piston that tends to close the channel to the torus. To open the channel, the control system applies an air pressure at the bottom of the piston. In this way, the piston reaches equilibrium and the retarder gives a constant braking torque.

When the retarder is used, it produces a lot of heat, so the oil has to be cooled. This is done by a heat exchanger, which is connected to the ordinary engine cooling system of the truck. Because the retarder uses the ordinary cooling system, the braking torque has to be limited when the cooling system is close to overheating. When the cooling water in the radiator reaches 95°C the braking torque is limited and at 110°C the retarder is shut down.
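The temperature-dependent limitation can be illustrated with a small sketch. The text only states the two thresholds (95°C and 110°C). The linear reduction between them, the function name and the parameter names are assumptions made purely for illustration, not the behaviour of the actual retarder control unit.

```python
def retarder_torque_limit(coolant_temp_c, max_torque_nm,
                          limit_start_c=95.0, shutdown_c=110.0):
    """Return the highest braking torque allowed at a given coolant temperature.

    Assumption: the limit falls linearly from max_torque_nm at limit_start_c
    to zero at shutdown_c. The source only gives the two thresholds.
    """
    if coolant_temp_c <= limit_start_c:
        return max_torque_nm            # no limitation below the first threshold
    if coolant_temp_c >= shutdown_c:
        return 0.0                      # retarder shut down
    fraction = (shutdown_c - coolant_temp_c) / (shutdown_c - limit_start_c)
    return max_torque_nm * fraction     # assumed linear derating
```

A controller that requests retarder torque could clip its request against this limit before sending it, so that the switch to another brake actuator can be planned when the limit approaches zero.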
3 Control Theory

3.1 Hybrid Systems

Hybrid systems are systems that can have both continuous and discrete states. The discrete states are sometimes also referred to as modes. A general description of a hybrid system is

$$
\begin{aligned}
\dot{x}(t) &= f(x(t), m(t), u(t)) \\
m^{+}(t) &= \phi(x(t), m(t), u(t), \sigma(t)) \\
y(t) &= g(x(t), m(t), u(t)) \\
o(t) &= \varphi(x(t), m(t), u(t), \sigma(t))
\end{aligned}
\tag{3.1}
$$

where
x(t) – continuous state variables
m(t) – discrete state variables
u(t) – continuous inputs
σ(t) – discrete inputs
y(t) – continuous outputs
o(t) – discrete outputs

Hybrid systems can be divided into two different sorts, true and false. True hybrid systems are systems where φ cannot be written as a function of only x and u. This means that m must be kept as a memory variable to achieve a complete model description. Typical true hybrid systems are systems that contain hysteresis. False hybrid systems are systems where the discrete variables are not needed as memory variables. Examples of false hybrid systems are systems that change dynamics for different values of the continuous state variables, as can be seen below

$$
\begin{aligned}
\dot{x} &= A_1 x + B_1 u, \quad \text{if } x < c \\
\dot{x} &= A_2 x + B_2 u, \quad \text{if } x \ge c
\end{aligned}
\tag{3.2}
$$

False hybrid systems can be used in a way that the continuous states correspond to different continuous controllers and the discrete states correspond to a logical part. The logical part is often referred to as a state machine. It chooses which one of the time continuous controllers is to be used, as can be seen in figure 3.1.

[Block diagram: a logic block supplies parameters to the time continuous controller, which acts on the difference between the reference signal, r, and the output signal, y, and sends the control signal, u, to the system.]

Figure 3.1: A hybrid controller (illustrated by a dashed box) where logic chooses the parameters to be used in a time continuous controller.

A problem that can arise in hybrid systems is chattering. It means that a fast oscillation between the different modes, for example x < c and x ≥ c, arises.
This phenomenon may be avoided by introducing hysteresis. Stability for some hybrid systems can be shown by using Lyapunov theory. More about stability and hybrid systems in general can be found in [7].

3.2 Differential PI-controller

A major problem when dealing with integrating controllers is their tendency to wind up their I-part. This typically occurs when the reference signal is so high that the output signal saturates and the integrator continues to integrate [16]. When the reference signal later becomes lower, the proportional part is too small compared to the integrator part, which means that the output is still saturated. If the integrator has a high value, it will take a very long time for the controller to produce a non-saturated output again. During this time, the controller sometimes produces a large overshoot.

The solution to the windup problem is to limit the size of the integrator part if the controller saturates. This can be done in several ways, as described in [16]. In this thesis, the choice was to implement the PI-controller on so-called differential form [16]. An ordinary time discrete PI-controller is easily rewritten on differential form by taking the difference $\Delta v_n$ between two desired output signals that are adjacent in time. In formulas this can be written as [16]:

$$
\begin{aligned}
\Delta v_n = v_n - v_{n-1} &= K(e_n - e_{n-1}) + I_n - I_{n-1} + K\frac{T_d}{T_s}(e_n - 2e_{n-1} + e_{n-2}) \\
&= K\left(e_n - e_{n-1} + \frac{T_s}{T_i}e_n + \frac{T_d}{T_s}(e_n - 2e_{n-1} + e_{n-2})\right)
\end{aligned}
\tag{3.3}
$$

This difference is then added to the previous, possibly saturated, output signal. If the new output signal hits the saturation limit, the saturated output value is returned and stored as the last output value.
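The incremental update and the saturation handling described above can be sketched as follows. This is a minimal illustration and not the implementation used in the truck; the class name, the attribute names and the initial values are assumptions. Setting Td to zero gives the PI-controller.

```python
class IncrementalPID:
    """Discrete controller on differential (incremental) form, cf. (3.3)."""

    def __init__(self, K, Ti, Td, Ts, u_min, u_max, u_init=0.0):
        self.K, self.Ti, self.Td, self.Ts = K, Ti, Td, Ts
        self.u_min, self.u_max = u_min, u_max
        self.u_prev = u_init    # last output, stored after saturation
        self.e_prev = 0.0       # control error e_{n-1}
        self.e_prev2 = 0.0      # control error e_{n-2}

    def update(self, e):
        # Increment of the desired output between two adjacent samples
        dv = self.K * ((e - self.e_prev)
                       + self.Ts / self.Ti * e
                       + self.Td / self.Ts * (e - 2.0 * self.e_prev + self.e_prev2))
        # Add the increment to the previous output and saturate the result
        u = min(self.u_max, max(self.u_min, self.u_prev + dv))
        self.e_prev2, self.e_prev = self.e_prev, e
        self.u_prev = u         # the saturated value is what is remembered
        return u
```

Because only the saturated output is stored, there is no separate integrator state that can grow while the output is limited. In this sketch, a change of sign in the control error therefore moves the output away from the limit in the very next sample.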
$$
\Delta v_n = K\left(e_n - e_{n-1} + \frac{T_s}{T_i}e_n + \frac{T_d}{T_s}(e_n - 2e_{n-1} + e_{n-2})\right)
$$

$$
u_n =
\begin{cases}
u_{max}, & \text{if } u_{n-1} + \Delta v_n > u_{max} \\
u_{n-1} + \Delta v_n, & \text{if } u_{min} \le u_{n-1} + \Delta v_n \le u_{max} \\
u_{min}, & \text{if } u_{n-1} + \Delta v_n < u_{min}
\end{cases}
\tag{3.4}
$$

where
$e_n$ – control error in time step n
$u_n$ – control signal in time step n
$u_{max}$ – upper output saturation limit
$u_{min}$ – lower output saturation limit
K – controller constant
$T_s$ – sample time
$T_i$ – integrator time
$T_d$ – derivative time

Another useful property of the differential implementation is that bumpless transfers between different controllers are rather easy. If the new controller uses the same control signal as the old one, the new controller inherits the old controller’s $u_{n-1}$ and continues the incrementation from that point.

3.3 Linearisation of Model and State Estimation

3.3.1 Linearisation

Most physical systems contain some sort of non-linearity. How this is handled depends on how much it affects the dynamics of the system. If the influence is small, it is often possible to neglect the non-linearity and assume that the system is linear. Then, ordinary linear methods to design observers and controllers can be used. These methods often have rather well defined approaches, which implies that they in many cases can be carried out, more or less, automatically by some computer program.

If the influence is large, methods that handle non-linearities are necessary. One way is to use methods that handle nonlinear systems without any approximations, for example feedback linearisation. Drawbacks with these methods are that not all nonlinear systems are possible to handle and that the methods often imply the use of nonlinear observers and nonlinear state feedbacks. These observers and state feedbacks are in many cases more complicated to derive than the linear ones. Another way is to locally approximate the nonlinear system with a linear system around a linearisation point. In that way, the ordinary linear methods are possible to use.
The approximated system is equal to the first-order Taylor expansion of the nonlinear system around some linearisation point, and the derivation can be seen below. Assume that the original nonlinear system is

$$
\begin{aligned}
\dot{x} &= f(x, u, d) \\
z &= g(x, u) \\
y &= h(x, u)
\end{aligned}
\tag{3.5}
$$

where x is the system state, u is the control input, d is a non-measurable disturbance, z is the control objective and y is the measured signals. Because the system is nonlinear, at least one of f, h or g is a nonlinear function. Using the first-order Taylor expansion, (3.5) can be approximated in the point $x = x_0$, $u = u_0$ by the following expression

$$
\begin{aligned}
\Delta\dot{x} &= A\Delta x + B\Delta u + N\Delta d + f_0 \\
\Delta z &= M\Delta x + D_z\Delta u \\
\Delta y &= C\Delta x + D_y\Delta u
\end{aligned}
\tag{3.6}
$$

where

$$
\Delta x = x - x_0, \quad \Delta u = u - u_0, \quad \Delta z = z - z_0, \quad \Delta y = y - y_0, \quad \Delta d = d - d_0,
$$

$$
\begin{aligned}
A &= \left.\frac{\partial f(x,u,d)}{\partial x}\right|_{x=x_0,\,u=u_0,\,d=d_0}, &
B &= \left.\frac{\partial f(x,u,d)}{\partial u}\right|_{x=x_0,\,u=u_0,\,d=d_0}, &
N &= \left.\frac{\partial f(x,u,d)}{\partial d}\right|_{x=x_0,\,u=u_0,\,d=d_0}, \\
M &= \left.\frac{\partial g(x,u)}{\partial x}\right|_{x=x_0,\,u=u_0}, &
D_z &= \left.\frac{\partial g(x,u)}{\partial u}\right|_{x=x_0,\,u=u_0}, & & \\
C &= \left.\frac{\partial h(x,u)}{\partial x}\right|_{x=x_0,\,u=u_0}, &
D_y &= \left.\frac{\partial h(x,u)}{\partial u}\right|_{x=x_0,\,u=u_0}, &
f_0 &= f(x_0, u_0, d_0)
\end{aligned}
$$

The constant term $f_0$ corresponds to the derivative in the linearisation point, i.e. $(\dot{x})_0$. In the literature, the linearisation point $(x_0, u_0, d_0)$ is often assumed to be a stationary point, which means that $f_0$ equals zero.

As was mentioned earlier, the linearisation is a local approximation of the nonlinear system around the linearisation point. If some of the states, the control signals, the control objective, the outputs or the disturbances take values in a large interval, the approximation may not be accurate enough. Then, it might be necessary to approximate the nonlinear system around a new point. This is called relinearising. How often the relinearisation has to be performed depends both on how nonlinear the system is and on how large the changes in the different signals are.
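When analytic derivatives are inconvenient, the partial derivatives in (3.6) can be approximated numerically. The sketch below does this with central differences for a system with one state, one input and one disturbance. The vehicle model and its constants are hypothetical and are not the truck model of this thesis; they only serve to show that $f_0$ is non-zero when the linearisation point is not stationary.

```python
def linearise(f, x0, u0, d0, eps=1e-6):
    """First-order Taylor terms of dx/dt = f(x, u, d) around (x0, u0, d0).

    Scalar case only. Returns (A, B, N, f0) as in (3.6), where the
    derivatives are approximated by central differences.
    """
    A = (f(x0 + eps, u0, d0) - f(x0 - eps, u0, d0)) / (2 * eps)
    B = (f(x0, u0 + eps, d0) - f(x0, u0 - eps, d0)) / (2 * eps)
    N = (f(x0, u0, d0 + eps) - f(x0, u0, d0 - eps)) / (2 * eps)
    return A, B, N, f(x0, u0, d0)


# Hypothetical longitudinal model: speed v [m/s], driving force u [N],
# road slope d [rad]. All constants are illustrative assumptions.
MASS, DRAG, G = 40000.0, 3.0, 9.81


def vehicle(v, u, d):
    return (u - DRAG * v * v) / MASS - G * d


# Linearise at 20 m/s with no driving force: the truck is decelerating,
# so this is not a stationary point and f0 is non-zero.
A, B, N, f0 = linearise(vehicle, x0=20.0, u0=0.0, d0=0.0)
```

Calling `linearise` again with the current operating point in each sample corresponds to the relinearisation discussed above.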
If the system is fairly linear and the changes in the signals are quite small, it can be enough to linearise around one point and therefore no relinearisation is needed. On the other hand, if the system is rather nonlinear, relinearisation must be performed. One method is to relinearise the model in each sample time. Another method, which is a special case of the first, is to linearise the system in some points before it is used (off-line linearisation). The models corresponding to the different linearisation points are then switched between as the signals in the model change. This means that if the signals are rather constant, no switching will occur. This method is therefore less computationally demanding.

In cases when the linearisation point is changed often, e.g. when a new linearisation is done in each sample time, it may be necessary to take this into consideration when designing the observer (see section 3.3.3).

3.3.2 Conversion of Time Continuous Models to Time Discrete Models

In section 3.4.1, it will be seen that MPC uses a time discrete model of the controlled system. If the original system is time continuous, it must be converted to discrete time. Assume that the time continuous system is given as

$$
\begin{aligned}
\dot{x}(t) &= Ax(t) + Bu(t) + N_d d(t) + f_0 \\
z(t) &= Mx(t) \\
y(t) &= Cx(t)
\end{aligned}
\tag{3.7}
$$

Then, it can be transformed into a time discrete system by solving the system of differential equations between two sample times, i.e.

$$
kT_s \le t < (k+1)T_s \tag{3.8}
$$

where
k – sample number
$T_s$ – sample time

With some obvious extensions of the method presented in [16] the solution is

$$
\begin{aligned}
x((k+1)T_s) &= e^{AT_s}x(kT_s) + \int_0^{T_s} e^{A(T_s - t)}\left(Bu(t + kT_s) + N_d d(t + kT_s) + f_0\right)dt \\
z(kT_s) &= Mx(kT_s) \\
y(kT_s) &= Cx(kT_s)
\end{aligned}
\tag{3.9}
$$

Under the assumption that the control signals and the disturbances are piecewise constant between the samples, i.e.
looks like the one in figure 3.2, (3.9) can be rewritten as

$$
\begin{aligned}
x((k+1)T_s) &= A_d x(kT_s) + B_d u(kT_s) + N_{dd} d(kT_s) + f_{0d} \\
z(kT_s) &= M_d x(kT_s) \\
y(kT_s) &= C_d x(kT_s)
\end{aligned}
\tag{3.10}
$$

where

$$
\begin{aligned}
A_d &= e^{AT_s}, & B_d &= \int_0^{T_s} e^{A(T_s - t)}B\,dt, & N_{dd} &= \int_0^{T_s} e^{A(T_s - t)}N_d\,dt, \\
M_d &= M, & C_d &= C, & f_{0d} &= \int_0^{T_s} e^{A(T_s - t)}f_0\,dt
\end{aligned}
$$

The states in the time discrete model will have exactly the same values as the time continuous model in the sample moments, if the assumption above holds. The reason can be found in [2].

Figure 3.2: Example of a piecewise constant signal. (Time axis marked at $T_s$, $2T_s$, $3T_s$, $4T_s$.)

3.3.3 Extended Kalman Estimator

Controllers that are based on state feedback, such as LQ-controllers or MPC-controllers, require that all states are measurable. If not, the states have to be estimated from a model of the system. This is done by an observer. Assume that the system to be controlled can be described by the following discrete time model

$$
\begin{aligned}
x(k+1) &= A_d x(k) + B_d u(k) + N_d d(k) + f_{0d} + N_{v_1,d} v_1(k) \\
y(k) &= C_d x(k) + v_2(k)
\end{aligned}
\tag{3.11}
$$

where $v_1$ and $v_2$ are white noises. A typical discrete time observer can then, according to [2], be expressed as

$$
\begin{aligned}
\hat{x}(k+1) &= A_d\hat{x}(k) + B_d u(k) + N_d d(k) + f_{0d} + K(y(k) - \hat{y}(k)) \\
\hat{y}(k) &= C_d\hat{x}(k)
\end{aligned}
\tag{3.12}
$$

where K is the observer gain and $\hat{x}$ are the estimated states. The observer gain decides how much the observer should rely on the measured values and on the model respectively. A small K gives little feedback from the measurements and hence $\hat{x}$ will mostly rely on the model, and vice versa if K is large.

A common and good method is to choose the observer gain equal to the Kalman gain. This choice gives a good balance between trusting the model and the measurements, depending on how much process noise, $v_1$, and measurement noise, $v_2$, is present. Large $v_1$ relative to $v_2$ gives a large K and vice versa.
The ordinary method to calculate K is to solve the stationary Riccati equation, which can be found in [2]. It is given by

$$
\begin{aligned}
P_{cov} &= A_d P_{cov} A_d^T + N_{v_1,d} R_1 N_{v_1,d}^T \\
&\quad - \left(A_d P_{cov} C_d^T + N_{v_1,d} R_{12}\right)\left(C_d P_{cov} C_d^T + R_2\right)^{-1}\left(A_d P_{cov} C_d^T + N_{v_1,d} R_{12}\right)^T \\
K &= \left(A_d P_{cov} C_d^T + N_{v_1,d} R_{12}\right)\left(C_d P_{cov} C_d^T + R_2\right)^{-1}
\end{aligned}
\tag{3.13}
$$

where
$P_{cov}$ – covariance matrix of the prediction error, $x(k) - \hat{x}(k)$
$R_1$ – intensity of noise $v_1$
$R_2$ – intensity of noise $v_2$
$R_{12}$ – cross-spectral density between $v_1$ and $v_2$

This can be done by using the command dlqe in Matlab.

A limitation of this method is that it assumes that stationarity for $P_{cov}$ is achieved. This happens, at least approximately, if the model is linearised just once or quite seldom. If the model is relinearised often, another method, which does not assume that stationarity is reached, must be used. It is called Extended Kalman and is built upon the Kalman gain, $K(k)$, and the covariance of the prediction error, $P_{cov}(k)$, being calculated recursively. Again, assume that the system is described as in (3.11), but with all system matrices time varying, i.e.

$$
\begin{aligned}
x(k+1) &= A_d(k)x(k) + B_d(k)u(k) + N_d(k)d(k) + f_{0,d}(k) + N_{v_1,d}(k)v_1(k) \\
y(k) &= C_d(k)x(k) + v_2(k)
\end{aligned}
\tag{3.14}
$$

and that $\hat{x}(0) = 0$. If the system can be described by (3.14), the Kalman gain can be calculated by the expressions found in [19]. These expressions are

$$
\begin{aligned}
R_e(k) &= R_2(k) + C(k)P_{cov}(k)C(k)^T \\
K(k) &= \left(A(k)P_{cov}(k)C(k)^T + N_{v_1,d}(k)R_{12}(k)\right) R_e(k)^{-1} \\
P_{cov}(k+1) &= A(k)P_{cov}(k)A(k)^T + N_{v_1,d}(k)R_1(k)N_{v_1,d}(k)^T - K(k)R_e(k)K(k)^T
\end{aligned}
\tag{3.15}
$$

where
$P_{cov}(0)$ – covariance matrix of $x_0$

The covariance matrix, $P_{cov}(0)$, should be chosen large if the initial state is unknown. The optimal gain is now achieved in each time instant. A possible problem is that $P_{cov}$, which should be symmetric, becomes non-symmetric because of numerical noise.
This can be solved by letting

$$
P_{cov} = \frac{P_{cov} + P_{cov}^T}{2} \tag{3.16}
$$

More about stationary Kalman filters can be found in [2] and more about Kalman filters in general, i.e. both stationary and extended, can be found in [19].

3.4 Model Predictive Control – MPC

3.4.1 MPC Basics

Model Predictive Control (MPC) is an advanced modern control strategy, which has had an enormous breakthrough in industry during the past 10-20 years. The main reason why MPC has gained such an interest is that it can explicitly handle constraints on control signals and on linear combinations of states. This property makes it easy to handle, for example, control signal limitations and safety limits on signals in the controlled plant. It is also said to be an intuitive method, which is easy to understand with only limited knowledge of control theory.

A common approach to a multivariable control problem is the LQ-controller. In this theory, a linear quadratic criterion like (3.17) is to be minimised.

$$
\min_u \sum_{k=0}^{\infty} \|x(k)\|_{Q_1}^2 + \|u(k)\|_{Q_2}^2 \tag{3.17}
$$

The algebraic solution to this minimisation problem is a linear state feedback $u(k) = -Lx(k)$. The minimisation is performed off-line by solving a time discrete Riccati equation. By treating the weight matrices $Q_1$ and $Q_2$ in the criterion as design variables, an appropriate behaviour of the system can be achieved by repeated simulations and modifications of these parameters.

As mentioned above, the unconstrained problem can be solved exactly and the solution is the linear state feedback control law. When constraints on control signals and on linear combinations of states are introduced, the minimisation problem is no longer an unconstrained optimisation problem, and the global analytic solution from above can no longer be used. The problem of finding an algebraic solution u(x) (state feedback) to this problem is more or less unsolved [16].
Another approach is to treat the problem as an optimisation problem in the variables $u(k)$. A problem here is that the time horizon is infinite, which in turn means that it contains an infinite number of optimisation variables $u(k)$. This problem is in MPC solved by truncating the infinite sum to a finite one. The number of discrete time instants used in the sum is called the prediction horizon and is here labelled $H_p$. Normally this constant is chosen so that an ordinary settling is covered. This can intuitively be motivated by thinking that the MPC-controller has to “see” the whole settling to be able to choose the optimal control signal. Now the problem in time step k has the structure shown in (3.18).

$$
\min_{u \le u_{limit}} \sum_{j=0}^{H_p - 1} \|x(k+j+1)\|_{Q_1}^2 + \|u(k+j)\|_{Q_2}^2 \tag{3.18}
$$

In each sample time, the optimisation problem has to be solved. It now contains a finite number of optimisation variables, but is still dependent on the initial condition $x(k)$ of the system. When the optimisation problem is solved, only the first control signal $u(k)$ is used and the rest of the control signals returned by the optimisation are discarded. A simple algorithm for MPC is as follows [16]:

1. Measure or estimate the current state of the system $x(k)$.
2. Calculate the control signal sequence by solving (3.18).
3. Apply the first element $u(k)$ of the control signal sequence to the system.
4. Time update: $k := k + 1$.
5. Go to step 1 and repeat.

Because of the strategy to solve the problem for a finite time horizon, which is moving in time as new sampling events occur, these problems are usually called receding horizon problems.

The truncated time horizon can also bring new problems. Some of them can be solved by assuming that no constraints are active after the end of the time horizon and then the usual LQ-solution can be used to find a final weight to compensate for the time horizon truncation. This is further described in section 3.4.9.
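The five steps of the algorithm can be sketched for a system with one state and one control signal. Everything in the example is an assumption made for illustration: the plant is a pure integrator, the weights and limits are arbitrary, and the optimisation in step 2 is carried out with a simple projected gradient method instead of the QP-solver used in this thesis.

```python
def mpc_step(x, a, b, q1, q2, u_limit, Hp, iterations=400):
    """Solve (3.18) for the scalar model x(k+1) = a*x(k) + b*u(k), |u| <= u_limit.

    Projected gradient descent is used here only because it is short.
    Returns the optimised sequence u(k), ..., u(k+Hp-1).
    """
    # Prediction: x(k+i+1) = H[i]*x(k) + sum_j S[i][j]*u(k+j)
    H = [a ** (i + 1) for i in range(Hp)]
    S = [[a ** (i - j) * b if j <= i else 0.0 for j in range(Hp)]
         for i in range(Hp)]
    # Step size 1/L, where L bounds the largest eigenvalue of the Hessian
    L = 2.0 * (q1 * sum(s * s for row in S for s in row) + q2)
    U = [0.0] * Hp
    for _ in range(iterations):
        X = [H[i] * x + sum(S[i][j] * U[j] for j in range(Hp))
             for i in range(Hp)]
        grad = [2.0 * q1 * sum(S[i][j] * X[i] for i in range(Hp))
                + 2.0 * q2 * U[j] for j in range(Hp)]
        # Gradient step followed by projection onto the control limits
        U = [min(u_limit, max(-u_limit, U[j] - grad[j] / L))
             for j in range(Hp)]
    return U


# Receding horizon loop
a, b = 1.0, 1.0                 # hypothetical plant: a pure integrator
x, applied = 5.0, []
for k in range(15):             # step 4 and 5: time update and repeat
    U = mpc_step(x, a, b, q1=1.0, q2=0.1, u_limit=1.0, Hp=5)   # step 1 and 2
    applied.append(U[0])        # step 3: only the first element is applied
    x = a * x + b * U[0]        # the model doubles as the plant here
```

In this sketch the controller uses the full allowed control signal while the state is far from the origin and reduces it when the state approaches zero, without any explicit switching logic.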
Today there is research on so-called explicit MPC, which means that the MPC-optimisation is performed off-line [10], [11]. The solution found from these calculations is stored in a table and the on-line processing is reduced to finding the control signal in the table that corresponds to the current state of the system. See also section 10.1.

All facts in this section, unless otherwise stated, originate from [16].

Prediction of Future States

The name model predictive control originates from the fact that the model of the system is used in a very explicit manner in an MPC-controller. It is used in the optimisation to predict the future states of the system. As mentioned above, the optimisation variables are the control signals in the sequence $u(k)$ to $u(k + H_p - 1)$. First, the model needs to be written on the time discrete state space form

$$
x(k+1) = Ax(k) + Bu(k) \tag{3.19}
$$

If this model is used recursively, it can be seen that a prediction of the state at time $k + k_{desired}$ is only dependent on the initial state of the system and on the control signals applied to it from time $k$ to $k + k_{desired} - 1$. The two-step prediction is then

$$
x(k+2) = Ax(k+1) + Bu(k+1) = A^2x(k) + ABu(k) + Bu(k+1) \tag{3.20}
$$

When this is generalised to the $H_p$-step prediction, it is convenient to write the problem on vector form. This notation also structures the problem and prepares it to be rewritten on a standard form for optimisation. If the control signals are stacked in the vector U and the state predictions in the vector X, these vectors can be defined as

$$
U = \begin{bmatrix} u(k) \\ u(k+1) \\ \vdots \\ u(k+H_p-1) \end{bmatrix}, \quad
X = \begin{bmatrix} x(k+1) \\ x(k+2) \\ \vdots \\ x(k+H_p) \end{bmatrix}
\tag{3.21}
$$

Note that every u in U may be a vector containing several control signals (for each time instant). To finalise the matrix notation of the predictions, the $A$-, $A^2$-, $AB$-matrices and so on also have to be stacked.
These matrices are then called the H- and S-matrices and are defined as

$$
H = \begin{bmatrix} A \\ A^2 \\ \vdots \\ A^{H_p} \end{bmatrix}, \quad
S = \begin{bmatrix}
B & 0 & \cdots & 0 \\
AB & B & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
A^{H_p-1}B & A^{H_p-2}B & \cdots & B
\end{bmatrix}
\tag{3.22}
$$

With the use of these notations the prediction of the future states can be written as

$$
X = Hx(k) + SU \tag{3.23}
$$

Expression (3.23) is in this text sometimes referred to as the MPC-simulation. This is because the vector X is the simulation of the system that MPC uses in the optimisation.

The next step is to rewrite (3.18) in the variables X and U. This can be done by recalling that a weighted sum of squares can be written as a quadratic form. In formulas this can be written as

$$
\begin{aligned}
\sum_{j=0}^{H_p-1} \|x(k+j+1)\|_{Q_1}^2 + \|u(k+j)\|_{Q_2}^2
&= X^T\tilde{Q}_1X + U^T\tilde{Q}_2U \\
&= (Hx(k) + SU)^T\tilde{Q}_1(Hx(k) + SU) + U^T\tilde{Q}_2U
\end{aligned}
\tag{3.24}
$$

where $\tilde{Q}_1$ and $\tilde{Q}_2$ are defined as the following block diagonal matrices

$$
\tilde{Q}_1 = \begin{bmatrix} Q_1 & & & \\ & Q_1 & & \\ & & Q_1 & \\ & & & \ddots \end{bmatrix}, \quad
\tilde{Q}_2 = \begin{bmatrix} Q_2 & & & \\ & Q_2 & & \\ & & Q_2 & \\ & & & \ddots \end{bmatrix}
\tag{3.25}
$$

In (3.25) the $Q_1$- and $Q_2$-matrices are each repeated $H_p$ times along the diagonal.

The notation can easily be extended to handle the case when a reference signal is present. The cost function is then given by

$$
\sum_{j=0}^{H_p-1} \|r(k+j+1) - z(k+j+1)\|_{Q_1}^2 + \|u(k+j)\|_{Q_2}^2 \tag{3.26}
$$

where z denotes the control objective. To enable the use of the practical matrix notation, some new vectors have to be added to handle the reference signal and the control objective vector z

$$
R = \begin{bmatrix} r(k+1) \\ r(k+2) \\ \vdots \\ r(k+H_p) \end{bmatrix}, \quad
Z = \begin{bmatrix} Mx(k+1) \\ Mx(k+2) \\ \vdots \\ Mx(k+H_p) \end{bmatrix}
= \begin{bmatrix} M & & & \\ & M & & \\ & & \ddots & \\ & & & M \end{bmatrix} X = \tilde{M}X
\tag{3.27}
$$

With these definitions in mind the cost function can be written as

$$
\left(\tilde{M}(Hx(k) + SU) - R\right)^T\tilde{Q}_1\left(\tilde{M}(Hx(k) + SU) - R\right) + U^T\tilde{Q}_2U \tag{3.28}
$$

According to the definition of the reference vector R, future reference signals are used in the optimisation.
Either this feature of the MPC-framework can be ignored, and a reference signal $r(k+1) = \dots = r(k+H_p) \equiv r_{constant}(k+1)$ that is constant through the optimisation can be used, or it can be used to inform the controller which reference signals are expected in the future. In many cases, the future reference signals are not known in advance. This is for example the case when the reference signal comes from an operator. If the future reference signals are unknown, an alternative can be to try to make “an intelligent guess” about what will happen in the future. It is finally worth mentioning that if the system is unconstrained, has a finite horizon, and the ability to preview future reference signals is desired, there exists an analytic solution to the problem.

The basic information in this section is taken from [16].

Solving the Optimisation Problem

If the cost function (3.28) is expanded it can be seen that the problem can be written as a so-called QP-problem, which is an optimisation problem with a quadratic cost function and linear constraints [16].
$$
\begin{aligned}
&\sum_{j=0}^{H_p-1} \|z(k+j+1) - r(k+j+1)\|_{Q_1}^2 + \|u(k+j)\|_{Q_2}^2 \\
&\quad = \left(\tilde{M}(Hx(k) + SU) - R\right)^T\tilde{Q}_1\left(\tilde{M}(Hx(k) + SU) - R\right) + U^T\tilde{Q}_2U = \dots \\
&\quad = U^T\left(S^T\tilde{M}^T\tilde{Q}_1\tilde{M}S + \tilde{Q}_2\right)U + 2\left(S^T\tilde{M}^T\tilde{Q}_1^T\tilde{M}Hx(k) - S^T\tilde{M}^T\tilde{Q}_1^TR\right)^TU + \text{Constant}
\end{aligned}
$$

Ignoring the constant and dividing by two gives

$$
\frac{1}{2}U^T\left(S^T\tilde{M}^T\tilde{Q}_1\tilde{M}S + \tilde{Q}_2\right)U + \left(S^T\tilde{M}^T\tilde{Q}_1^T\left(\tilde{M}Hx(k) - R\right)\right)^TU \tag{3.29}
$$

The standard form of a QP-problem is

$$
\min_x \frac{1}{2}x^THx + f^Tx \quad \text{subject to } Ax \le b \tag{3.30}
$$

With the manipulations made above the MPC-problem can be written as a standard QP-problem as

$$
\min_U \frac{1}{2}U^T\left(S^T\tilde{M}^T\tilde{Q}_1\tilde{M}S + \tilde{Q}_2\right)U + \left(S^T\tilde{M}^T\tilde{Q}_1^T\left(\tilde{M}Hx(k) - R\right)\right)^TU \quad \text{subject to } U \le U_{limit} \tag{3.31}
$$

Control Signal Horizon

In the previous sections, the number of control signals determined by the QP-solver has been $n_u \cdot H_p$, where $n_u$ is the number of control signals to the system. This means that the problem has the dimension $n_u \cdot H_p$. By reducing the dimension of the problem, the problem becomes easier to solve and the optimisation time is thereby reduced. This can be done in several ways, which are presented in section 3.4.5. The control signal horizon is from now on in this thesis called $H_u$, where $H_u \le H_p$.

Integral Action

Integration can be introduced in an MPC-controller by penalising changes in the control signal [16]. Intuitively it can be justified by thinking that holding a constant control signal should not cost anything, if it is necessary to fulfil the demands on reference tracking. However, it is still necessary to have some penalty on the control signal; otherwise it could be very aggressive, because it would not cost anything to use full power for a short period. If changes in the control signals are penalised, large rapid changes in the control signals become very expensive and are therefore, as long as possible, avoided.
At the same time, holding a constant, but possibly large, control signal does not cost anything. If changes in the control signals are penalised the cost function can be written as

$$
\sum_{j=0}^{H_p-1} \|r(k+j+1) - y(k+j+1)\|_{Q_1}^2 + \|u(k+j) - u(k+j-1)\|_{Q_3}^2 \tag{3.32}
$$

where $Q_3$ is the variable for adjusting the cost of changing the control signal.

To get the differences in the control signal a new matrix, here called Ω, is used. The first difference involves the last control signal to the system, which has to be saved from the last sample. This signal is placed in the first position of a vector here called δ. If these two new variables are used, the differences in the control signal sequence can be written as [16]

$$
\begin{bmatrix}
u(k) - u(k-1) \\
u(k+1) - u(k) \\
\vdots \\
u(k+H_u-1) - u(k+H_u-2)
\end{bmatrix}
=
\begin{bmatrix}
I & & & \\
-I & I & & \\
& \ddots & \ddots & \\
& & -I & I
\end{bmatrix} U
-
\begin{bmatrix} u(k-1) \\ 0 \\ \vdots \\ 0 \end{bmatrix}
= \Omega U - \delta
\tag{3.33}
$$

The cost function can now be rewritten on the standard form in terms of matrices as

$$
\frac{1}{2}U^T\left(S^T\tilde{M}^T\tilde{Q}_1\tilde{M}S + \Omega^T\tilde{Q}_3\Omega\right)U + \left(S^T\tilde{M}^T\tilde{Q}_1^T\left(\tilde{M}Hx(k) + \tilde{M}PD - R\right) - \Omega^T\tilde{Q}_3\delta\right)^TU \tag{3.34}
$$

where $\tilde{Q}_3$ is defined in accordance with (3.25).

3.4.2 Extending the Framework to Handle Measurable Disturbances

If measurable disturbances occur in the model, as described in section 3.3.1, the MPC-framework also needs to be extended to handle these. The disturbances are introduced in the MPC-framework by creating a matrix similar to the S-matrix. This matrix tells how the disturbances propagate through the system in time. The one-step and two-step predictions can then be expressed as

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k) + Nd(k) \\
x(k+2) &= Ax(k+1) + Bu(k+1) + Nd(k+1) \\
&= A^2x(k) + ABu(k) + ANd(k) + Bu(k+1) + Nd(k+1)
\end{aligned}
\tag{3.35}
$$

The disturbances at different times are bundled together in the vector D as

$$
D = \begin{bmatrix} d(k) \\ d(k+1) \\ \vdots \\ d(k+H_p-1) \end{bmatrix}
\tag{3.36}
$$

Note that it is necessary to supply the MPC-controller with future disturbances. This can be handled as previously described for the reference signal.
Similar to the matrix S, the matrix P is created as

$$
P = \begin{bmatrix}
N & 0 & \cdots & 0 \\
AN & N & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
A^{H_p-1}N & A^{H_p-2}N & \cdots & N
\end{bmatrix}
\tag{3.37}
$$

It is now possible to write the state predictions with the matrix notation as

$$
X = Hx(k) + SU + PD \tag{3.38}
$$

This in turn gives a cost function, on standard form, of the form

$$
\frac{1}{2}U^T\left(S^T\tilde{M}^T\tilde{Q}_1\tilde{M}S + \tilde{Q}_2\right)U + \left(S^T\tilde{M}^T\tilde{Q}_1^T\left(\tilde{M}Hx(k) + \tilde{M}PD - R\right)\right)^TU \tag{3.39}
$$

3.4.3 More about Constraints

The constraints used in the previous section were not as general as possible. The standard form of the QP-problem allows constraints on any linear combination of the components of the solution, i.e. in this case on any linear combination of the control signals in U [16]. The new formulation of the problem is then

$$
\min_U \frac{1}{2}U^T\left(S^T\tilde{M}^T\tilde{Q}_1\tilde{M}S + \tilde{Q}_2\right)U + \left(S^T\tilde{M}^T\tilde{Q}_1^T\left(\tilde{M}Hx(k) + \tilde{M}PD - R\right)\right)^TU \quad \text{subject to } A_uU \le b_u \tag{3.40}
$$

With this general formulation, upper and lower limits on the control signals are implemented by using the $A_u$- and $b_u$-matrices as

$$
\begin{bmatrix}
I & & \\
& \ddots & \\
& & I \\
-I & & \\
& \ddots & \\
& & -I
\end{bmatrix} U \le
\begin{bmatrix}
u_{1,max} \\ \vdots \\ u_{H_p,max} \\ -u_{1,min} \\ \vdots \\ -u_{H_p,min}
\end{bmatrix}
\Leftrightarrow A_uU \le b_u
\tag{3.41}
$$

A more general form of constraints is constraints on linear combinations of states. Two special cases are constraints on certain states directly or on output signals y. An arbitrary linear combination of states $Z_{constraints}$ can be written as

$$
Z_{constraints} = \tilde{M}_{constraints}\left(Hx(k) + SU + PD\right) \tag{3.42}
$$

where $\tilde{M}_{constraints}$ is defined in a similar way as $\tilde{M}$. This gives a constraint, for $n_{constraints}$ signals, of the form

$$
\begin{aligned}
&\tilde{M}_{constraints}\left(Hx(k) + SU + PD\right) \le
\begin{bmatrix} y_{1,limit} \\ \vdots \\ y_{n_{constraints},limit} \end{bmatrix}
\Leftrightarrow \\
&\tilde{M}_{constraints}SU \le
\begin{bmatrix} y_{1,limit} \\ \vdots \\ y_{n_{constraints},limit} \end{bmatrix}
- \tilde{M}_{constraints}Hx(k) - \tilde{M}_{constraints}PD
\end{aligned}
\tag{3.43}
$$

which is a linear constraint in U.
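The constraint matrices of this section can be assembled as in the sketch below. It is restricted to one state and one control signal and takes $\tilde{M}_{constraints}$ as the identity; these simplifications, the function names and the numerical values are assumptions made to keep the example short.

```python
def box_constraints(u_min, u_max, Hp):
    """Stack u_min <= u(k+j) <= u_max, j = 0..Hp-1, as A_u U <= b_u, cf. (3.41)."""
    identity = [[1.0 if i == j else 0.0 for j in range(Hp)] for i in range(Hp)]
    A_u = identity + [[-v for v in row] for row in identity]
    b_u = [u_max] * Hp + [-u_min] * Hp      # the lower bound enters with a minus sign
    return A_u, b_u


def state_constraints(a, b, n, x, D, y_limit, Hp):
    """Upper limit on every predicted state as A_y U <= b_y, cf. (3.43).

    Scalar model x(k+1) = a*x(k) + b*u(k) + n*d(k), with the constraint
    selection matrix assumed to be the identity.
    """
    S = [[a ** (i - j) * b if j <= i else 0.0 for j in range(Hp)] for i in range(Hp)]
    P = [[a ** (i - j) * n if j <= i else 0.0 for j in range(Hp)] for i in range(Hp)]
    b_y = [y_limit - a ** (i + 1) * x - sum(P[i][j] * D[j] for j in range(Hp))
           for i in range(Hp)]
    return S, b_y


def satisfied(A, b, U, tol=1e-9):
    """Check the inequality A U <= b row by row."""
    return all(sum(A_ij * u for A_ij, u in zip(row, U)) <= b_i + tol
               for row, b_i in zip(A, b))


Hp = 3
A_u, b_u = box_constraints(u_min=-1.0, u_max=1.0, Hp=Hp)
A_y, b_y = state_constraints(a=1.0, b=1.0, n=1.0, x=0.0, D=[0.0] * Hp,
                             y_limit=1.5, Hp=Hp)
A, b_vec = A_u + A_y, b_u + b_y     # both constraint sets stacked for the QP-solver
```

A candidate sequence can then be tested with `satisfied(A, b_vec, U)`. For the integrator in the example, U = [1, 0, 0] keeps every predicted state at 1 and is feasible, whereas U = [1, 1, 0] drives the second predicted state to 2 and violates the state limit although each control signal is within its own bounds.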
To be able to write the state constraints in conjunction with the above-described control signal constraints the following definitions are necessary

$$
A_y = \tilde{M}_{constraints}S, \quad
b_y = \begin{bmatrix} y_{1,limit} \\ \vdots \\ y_{n_{constraints},limit} \end{bmatrix}
- \tilde{M}_{constraints}Hx(k) - \tilde{M}_{constraints}PD
\tag{3.44}
$$

It is now possible to write these constraints on the standard form

$$
A_yU \le b_y \tag{3.45}
$$

The control signal constraints and the state constraints are concatenated to a single matrix and a single vector to make the notation compatible with the standard QP-problem notation

$$
\begin{bmatrix} A_u \\ A_y \end{bmatrix} U \le \begin{bmatrix} b_u \\ b_y \end{bmatrix}
\tag{3.46}
$$

These thoughts can be summarised by saying that all constraints that are linear in the control signal sequence can be handled. Note that this property opens the possibility to handle constraints both on control signal derivatives and on state derivatives [16].

3.4.4 Using a Model Linearised around a Non-Equilibrium Point

A linearisation of a system can be written on state space form as

$$
\Delta\dot{x} = f(x_0, u_0) + \left.\frac{\partial f}{\partial x}\right|_{(x_0,u_0)} \cdot \Delta x + \left.\frac{\partial f}{\partial u}\right|_{(x_0,u_0)} \cdot \Delta u \tag{3.47}
$$

If the system is linearised around an equilibrium point, the term $f(x_0, u_0) = 0$. In this case, the framework presented above can be used directly. If the point $(x_0, u_0)$ is not an equilibrium point, the term $f(x_0, u_0) \ne 0$. More about linearisations of systems can be read in section 3.3.1.

When a linearisation is made around a non-equilibrium point, the term $f(x_0, u_0)$ has to be included in the MPC-simulation. This means that an extra term is needed in the expression for the MPC-simulation to include how the $f(x_0, u_0)$-term propagates through the system. If this term were omitted, the derivative at the linearisation point would be ignored.

Before the term $f(x_0, u_0)$ can be used in the standard framework, it has to be converted into the discrete equivalent called $f_{0d}$. This is described in section 3.3.2.
The one step and two step predictions of the deviations from the linearisation point can then be written as

$$\begin{aligned} \Delta x(k+1) &= A \Delta x(k) + B \Delta u(k) + N \Delta d(k) + I \Delta f_{0d}(k) \\ \Delta x(k+2) &= A \Delta x(k+1) + B \Delta u(k+1) + N \Delta d(k+1) + I \Delta f_{0d}(k+1) \\ &= A^2 \Delta x(k) + AB \Delta u(k) + AN \Delta d(k) + AI \Delta f_{0d}(k) \\ &\quad + B \Delta u(k+1) + N \Delta d(k+1) + I \Delta f_{0d}(k+1) \end{aligned} \tag{3.48}$$

where I is a unit matrix of suitable dimension.

To be able to write the vector ∆X, which includes all future state predictions, using the previously used matrix notation, the new matrix T has to be introduced as

$$T = \begin{bmatrix} I & 0 & \cdots & 0 \\ A & I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A^{N-1} & A^{N-2} & \cdots & I \end{bmatrix} \tag{3.49}$$

Furthermore, the term $f(x_0, u_0)$ has to be expanded to a vector

$$F_{0d} = \begin{bmatrix} f_{0d}(k+1) \\ f_{0d}(k+2) \\ \vdots \\ f_{0d}(k+H_p) \end{bmatrix} \tag{3.50}$$

Because the same point of linearisation is used at all time instants in the MPC-simulation, the following relation holds

$$f_{0d}(k+1) = f_{0d}(k+2) = \ldots = f_{0d}(k+H_p) \tag{3.51}$$

The predictions of the future state changes may now be written with matrices as

$$\Delta X = H \Delta x(k) + S \Delta U + P \Delta D + T F_{0d} \tag{3.52}$$

The new cost function is similar to those already presented, except that delta variables are used instead of absolute variables

$$\frac{1}{2} \Delta U^T \left( S^T \tilde{M}^T \tilde{Q}_1 \tilde{M} S + \Omega^T \tilde{Q}_3 \Omega \right) \Delta U + \left( S^T \tilde{M}^T \tilde{Q}_1^T \left( \tilde{M} H \Delta x(k) + \tilde{M} P \Delta D + \tilde{M} T F_{0d} - \Delta R \right) - \Omega^T \tilde{Q}_3 \delta \right)^T \Delta U \tag{3.53}$$

where $\Delta R = R - R_0$ and $R_0 = Y_0$.

The use of a linearisation of the system also affects the constraint equations. If these are not modified, the constraints will be applied to the delta variables instead of the absolute variables. After the modification, the constraint equations become

$$Z_0 + \Delta Z \le b \;\Leftrightarrow\; \Delta Z \le b - Z_0 \;\Leftrightarrow\; \tilde{M}_{constraints} S \Delta U \le b - Z_0 - \tilde{M}_{constraints} \left( H \Delta x(k) + P \Delta D + T F_{0d} \right) \tag{3.54}$$

where $\Delta Z = Z - Z_0$.

3.4.5 Choosing the Type of Control Signal in the Prediction

The control signal sequence actually used when the MPC-controller simulates the system at each time step can be chosen in several different ways.
The solution from the optimisation, $U_M$, may be seen as parameters in the control signal, U. The control signal can then be chosen as any function that is linear in these parameters. This can formally be written as

$$U = \Omega_M U_M \tag{3.55}$$

This equation can be used directly in all the formulas presented in the previous MPC-sections simply by replacing U with $\Omega_M U_M$ [16]. A common use of this matrix is, for example, to hold the final control signal value for some extra samples [16]. The number of free control signals is then denoted $n_u \cdot H_u$ and the total number of control signals, including the locked ones at the end, is denoted $n_u \cdot H_p$. Another use is to let the optimisation choose only, for example, every fifth sample. Between these samples, the control signal is held constant. In this thesis, the possibility of using expression (3.55) to form the actual control sequence has also been used to perform linear interpolation between some "original" samples. This is described further later in this section.

When the number of optimisation variables is reduced, the dimension of the optimisation problem is reduced. The ability to choose the type of control signal is primarily used for two reasons. First, by choosing a smooth type of control signal the optimisation "knows" that rapid changes in the control signal are not allowed. This can also make the actual control sequence smoother. Second, if the system is slow and stable there might be no need for a rapid control signal. Instead, it might be more important to reduce the dimension of the optimisation problem. The sampling rate is however kept high, to enable the controller to react quickly to new disturbances.

It is worth clarifying that U is the control signal sequence used during the MPC-controller's simulation of the system at each sampling time. It is not the control signal sequence that is actually applied to the system.
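The two uses of (3.55) mentioned above, holding the final value and letting the optimisation choose only every fifth sample, can be sketched in Python for a single control signal. This is a minimal sketch under stated assumptions: the function names, horizon lengths and values are hypothetical and not part of the thesis.

```python
# Minimal sketch of Omega_M in U = Omega_M * U_M for one control signal.
# All names and numbers are illustrative assumptions.

def blocking_matrix(block_lengths):
    """Build Omega_M where free value number i is held for block_lengths[i] samples."""
    n_free = len(block_lengths)
    rows = []
    for col, length in enumerate(block_lengths):
        for _ in range(length):
            rows.append([1.0 if j == col else 0.0 for j in range(n_free)])
    return rows

def expand(omega, u_free):
    """Compute the full control sequence U = Omega_M * U_M."""
    return [sum(w * u for w, u in zip(row, u_free)) for row in omega]

# H_u = 3 free values, the last one held until H_p = 6
U_hold = expand(blocking_matrix([1, 1, 4]), [0.2, 0.5, 0.9])

# The optimisation chooses only every fifth sample over H_p = 10
U_every_fifth = expand(blocking_matrix([5, 5]), [1.0, 2.0])
```

In both cases the optimisation sees only the entries of U_M, so the problem dimension drops from H_p to the number of blocks.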
At each sampling time, the optimisation is re-evaluated and only the first element in the optimal control signal sequence is actually used. All later elements are discarded.

Using a Piecewise Constant Control Signal

One type of control signal that can be used is a piecewise constant one. The length of the constant sections is a design variable. Longer constant sections give a smoother control signal and a lower dimension of the optimisation problem. Further, the length of the constant sections can be varied over the prediction horizon. It is common that the constant sections are short at the beginning of the control sequence and longer towards the end. With this configuration, the controller is allowed to make some rapid control signal changes at the beginning to quickly correct any reference tracking errors. After some steps in the prediction, the system should be close to the reference and no rapid control signal changes should be necessary. The control signal may then be chosen constant over longer periods. An example of a control signal of this type is given in figure 3.3. Each circle indicates the beginning of a new constant control signal segment. This control signal value is then kept for a specified number of samples.

Figure 3.3: This figure shows how the control signal is held constant for a certain number of samples. The beginning of each new constant section is marked with a circle.

Linear Interpolation of the Control Sequence

If the optimisation is allowed to choose the control signal only at some of the time instants, and the actual control signal sequence is a linear interpolation between these, the number of optimisation variables is reduced. This is an alternative to holding the control signal constant in some time intervals. The interpolation kernel used is [17]

$$h(x) = \begin{cases} 1 + x, & -1 \le x < 0 \\ 1 - x, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Figure 3.4: Linear convolution kernel.
To perform linear interpolation, the kernel is convolved with the original control signal sequence as

$$u(k) = (h * u_M)(k) \tag{3.56}$$

A convolution is a linear operation, which means that each interpolated sample point is a linear combination of the original ones. This makes it possible to write the interpolation as the product between $\Omega_M$ and $U_M$, which means that the usual framework can be used. The reason is that when the desired number of new samples is known, the values of the kernel in the needed positions are also known. The coefficients may therefore be calculated off-line and placed in the right positions in the matrix $\Omega_M$. The interpolation is then performed online by the simple calculation (3.55).

Another way of thinking is that the kernels placed at every original sample form a basis for the control signal sequence. This way of thinking is taken from Predictive Functional Control [9]. If the linear interpolation kernel from above is used, the basis functions are the same as those sometimes used in the Finite Element Method.

When linear interpolation is used, the control signal sequence is piecewise linear between the control signals contained in $U_M$. Because the optimisation knows the value of the control signal between these original samples, the linear interpolation can be seen as a parameterisation of the control signal with $H_u$ degrees of freedom.

When linear interpolation between the components in $U_M$ is performed, the total control signal sequence gets a shape like the one in figure 3.5. The original samples are marked with circles and the interpolated ones are marked with stars.

Figure 3.5: This figure shows the resulting function when linear interpolation is performed between the original samples marked with circles. The new interpolated samples are marked with stars.

In addition to linear interpolation, cubic spline interpolation has been tested.
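For the linear interpolation case, the off-line construction of $\Omega_M$ from the triangular kernel can be sketched in Python as follows. This is only an illustrative sketch: the function names, the spacing between original samples and the sample values are assumptions and are not taken from the thesis.

```python
# Minimal sketch: Omega_M for linear interpolation between original samples.
# Names, spacing and values are illustrative assumptions.

def hat(x):
    """Triangular interpolation kernel h(x)."""
    if -1.0 <= x < 0.0:
        return 1.0 + x
    if 0.0 <= x <= 1.0:
        return 1.0 - x
    return 0.0

def interpolation_matrix(n_free, spacing):
    """Omega_M mapping n_free original samples, `spacing` steps apart, to all samples."""
    n_total = (n_free - 1) * spacing + 1
    # Row k holds the kernel values seen from sample k towards each original sample m.
    return [[hat(k / spacing - m) for m in range(n_free)] for k in range(n_total)]

def expand(omega, u_free):
    """Compute U = Omega_M * U_M."""
    return [sum(w * u for w, u in zip(row, u_free)) for row in omega]

omega = interpolation_matrix(n_free=3, spacing=2)
U = expand(omega, [0.0, 1.0, 0.0])
```

Each row of the matrix sums to one, so a constant $U_M$ gives a constant U, and only the three original samples are optimisation variables even though five samples are simulated.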
The spline function is then supposed to approximate the sinc-function, which is the ideal reconstruction function [17]. It can be thought of as if the MPC-controller internally works with an upsampled version of the actual control signal. The non-zero second order derivative between the samples, however, also caused some problems. When control signal constraints are used, bends in the control signals tend to cross the limits when the control signals are close to them. This sometimes forces the optimisation to produce "strange" control signals near the constraints. Because of this behaviour the idea was abandoned, but it might work well if the constraints are not active as often as in this application.

3.4.6 Soft Constraints

A problem that can arise is that no feasible solution to the optimisation problem (3.30) exists. The reason is that there are no control signals that can keep the system within the output constraints described by

$$Z \le b_z \tag{3.57}$$

This might happen if, for example, large disturbances occur, the initial state is not allowed or the model used to predict future output signals is poor. Constraints on the control signals, i.e.

$$U \le b_u \tag{3.58}$$

will however never introduce infeasibility problems, since the optimisation solver itself calculates the control signals.

One way to handle the infeasibility is to 'soften' the output constraints. This means that instead of seeing the constraints as hard boundaries, which the solution is never allowed to break, temporary constraint violations are allowed, but only if really necessary. The soft constraints are implemented by introducing non-negative slack variables, ε, which are added to the right-hand side of the 'hard' output constraints according to

$$Z \le b_z + \varepsilon, \qquad \varepsilon \ge 0 \tag{3.59}$$

The slack variables are then included as optimisation variables that are penalised hard in the cost function to keep them close to zero.
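The effect of softening can be illustrated on a deliberately tiny problem. The Python sketch below is an assumption-laden toy example, not the thesis implementation: a scalar control signal u has the hard bound u ≤ 1, while the output constraint requires u ≥ 2, so the hard problem is infeasible. The output constraint is softened with one slack variable and a quadratic penalty with weight rho, and the problem is solved by brute-force grid search instead of a QP solver.

```python
# Toy sketch of a softened output constraint, solved by grid search.
# Problem (all numbers are assumptions):
#   min 0.5*u^2 + rho*eps^2
#   s.t. u <= 1            (hard control signal bound)
#        -u <= -2 + eps    (softened output constraint, i.e. u >= 2 - eps)
#        eps >= 0

def solve_soft(rho, step=0.05):
    """Return (u, eps) minimising the softened toy problem on a grid."""
    best = None
    n = int(round(3.0 / step))
    for i in range(n + 1):
        eps = i * step
        for j in range(-n, n + 1):
            u = j * step
            feasible = u <= 1.0 and -u <= -2.0 + eps + 1e-12
            if feasible:
                cost = 0.5 * u * u + rho * eps * eps
                if best is None or cost < best[0]:
                    best = (cost, u, eps)
    return best[1], best[2]

u_heavy, eps_heavy = solve_soft(rho=10.0)   # heavy penalty on the slack
u_light, eps_light = solve_soft(rho=0.1)    # light penalty on the slack
```

With the heavy penalty the slack takes only the value needed to restore feasibility, whereas with the light penalty the optimisation uses more slack than necessary because that is cheaper than using the control signal.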
There are a number of different ways in which ε can be introduced, both in the constraints and in the cost function. In the constraints, a first way might be to add just one slack variable to all of the constraints according to

$$Y \le b_y + \varepsilon \mathbf{1}, \qquad \varepsilon \ge 0 \tag{3.60}$$

where ε is a scalar and $\mathbf{1}$ is a vector of ones. If this solution is used, the increase in size of the optimisation problem, i.e. the number of optimisation variables, will be small. This gives a fast algorithm, but the drawback is that it limits the freedom: if ε has to be non-zero in one sample, it will affect all the other samples as well.

Another way to include the slack variables, which gives more freedom but introduces many more slack variables, is to add a different ε to each constraint in each sample time, i.e.

$$Z \le b_z + \varepsilon, \qquad \varepsilon \ge 0 \tag{3.61}$$

where ε is a vector with the same dimension as Y. The drawback with this method is that it gives a much slower algorithm, due to the size of the optimisation problem.

A variant of the last approach is to use one slack variable for all constraints in one sample. Under the assumption that two different output constraints are used, this can be illustrated by

$$\begin{aligned} z_1(n+1) &\le b_{z,1} + \varepsilon_1 \\ z_2(n+1) &\le b_{z,2} + \varepsilon_1 \\ z_1(n+2) &\le b_{z,1} + \varepsilon_2 \\ &\;\;\vdots \end{aligned} \tag{3.62}$$

This solution gives fairly large freedom, but still does not introduce a large number of slack variables.

The slack variables can be penalised either linearly or quadratically. If a quadratic penalty is used, the cost function for the minimisation becomes

$$\min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \left\| \varepsilon \right\|_2^2 \;\Leftrightarrow\; \min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \sum \varepsilon^2 \tag{3.63}$$

The drawback with this method is that ε might become non-zero to some extent, even if it is not necessary to avoid infeasibility. Another way of penalising ε that prevents this unnecessary usage is to penalise ε linearly instead of quadratically.
The cost function will then become

$$\min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \left\| \varepsilon \right\|_1 \;\overset{\varepsilon \ge 0}{\Leftrightarrow}\; \min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \sum \varepsilon \tag{3.64}$$

For small ε, this method will push ε harder towards zero. One further method is to penalise only the largest of the slack variables, which gives the minimisation

$$\min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \left\| \varepsilon \right\|_\infty \;\Leftrightarrow\; \min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \max \varepsilon \tag{3.65}$$

This can be included in the MPC-framework by rewriting the max-function as constraints instead, according to

$$\min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \gamma \quad \text{subject to} \quad Y \le b_y + \varepsilon, \quad \varepsilon \le \gamma, \quad \varepsilon \ge 0 \tag{3.66}$$

However, this is not necessary, since an equivalent description is to introduce only one slack variable, which is added to all of the constraints

$$\min_{\Delta U, \varepsilon} \; \frac{1}{2} \Delta U^T H \Delta U + f^T \Delta U + \rho \varepsilon \quad \text{subject to} \quad Y \le b_y + \varepsilon \cdot \mathbf{1}, \quad \varepsilon \ge 0 \tag{3.67}$$

The explanation is that if only one ε is used, it will take the largest value needed to obtain a feasible solution. If this method is used, it can be seen that it is equivalent to a linear penalty on the slack variable as defined by (3.64).

The slack variable weight, ρ, should be chosen rather large. Otherwise, there is a risk that it becomes advantageous to use the slack variables even if a feasible solution exists without them. If the penalty is applied through either the 1-norm or the ∞-norm, it can be shown that the 'softened' problem will give the same solution as the 'hard'-constrained problem if a feasible solution exists and ρ is chosen large enough [9].

3.4.7 Handling of Time Delays

A fundamental idea in MPC is that most properties of the system that are included in the model are taken care of. This means that when MPC is used, no external compensations, such as an Otto-Smith loop, are needed in order to handle time delays. Instead, this is done simply by including the time delay, τ, in the model.
Assume that the time delayed system has n states and is described on state space form as

$$\begin{aligned} \dot{x}(t) &= Ax(t) + Bu(t - \tau) \\ y(t) &= Cx(t) \end{aligned} \tag{3.68}$$

This can be transformed into a time discrete model, and when this is done two different cases arise. In the first case, the time delay can be written as a multiple, N, of the sample time, $T_s$, i.e.

$$\tau = N T_s \tag{3.69}$$

In this case, the control signals just have to be stored for N samples. This can be done by introducing extra states, and the time discrete system then becomes

$$\begin{aligned} x(k+1) &= A_d x(k) + B_d x_{n+1}(k) \\ x_{n+1}(k+1) &= x_{n+2}(k) \\ x_{n+2}(k+1) &= x_{n+3}(k) \\ &\;\;\vdots \\ x_{n+N}(k+1) &= u(k) \\ y(k) &= \begin{bmatrix} C_d & 0 \end{bmatrix} x(k) \end{aligned} \tag{3.70}$$

where $A_d$, $B_d$ and $C_d$ are defined according to (3.10).

In the second case, the time delay cannot be written as in (3.69), but has to be expressed as

$$\tau = N T_s + \tau_1 \tag{3.71}$$

where N and $T_s$ are defined as above and

$$0 < \tau_1 < T_s \tag{3.72}$$

This problem can be divided into two parts. The $N T_s$-part is handled as before, i.e. by introducing extra states. Hence, in the rest of the solution it can be assumed that the total time delay is equal to $\tau_1$. In this case, (3.9) can be written as

$$x((k+1)T_s) = e^{A T_s} x(kT_s) + \int_0^{T_s} e^{A(T_s - t)} B u(t + kT_s - \tau_1) \, dt \tag{3.73}$$

To be able to move u(t) out of the integral, it must be constant (or at least known) in the interval of integration. A usual assumption, which was mentioned in section 3.3.2, is that the control signals are piecewise constant and look like the one in figure 3.2. The time delay will displace the control signals by $\tau_1$. The resulting control signals can be seen in figure 3.6.

Figure 3.6: Example of a time delayed, piecewise constant signal.

As a consequence, the integral has to be divided into two parts. The first part depends on an old sample of the control signals, i.e. $u((k-1)T_s)$, and the second part depends on the present sample, $u(kT_s)$.
Equation (3.73) can then be rewritten as

$$x((k+1)T_s) = e^{A T_s} x(kT_s) + \int_0^{\tau_1} e^{A(T_s - t)} B \, dt \; u((k-1)T_s) + \int_{\tau_1}^{T_s} e^{A(T_s - t)} B \, dt \; u(kT_s) \tag{3.74}$$

When the N time delays have been put back, the total system becomes

$$\begin{aligned} x(k+1) &= A_d x(k) + B_{d,1} x_{n+1}(k) + B_{d,2} x_{n+2}(k) \\ x_{n+1}(k+1) &= x_{n+2}(k) \\ x_{n+2}(k+1) &= x_{n+3}(k) \\ &\;\;\vdots \\ x_{n+N+1}(k+1) &= u(k) \\ y(k) &= \begin{bmatrix} C_d & 0 \end{bmatrix} x(k) \end{aligned} \tag{3.75}$$

where

$$A_d = e^{A T_s}, \qquad B_{d,1} = \int_0^{\tau_1} e^{A(T_s - t)} B \, dt, \qquad B_{d,2} = \int_{\tau_1}^{T_s} e^{A(T_s - t)} B \, dt \tag{3.76}$$

Note that the index of the last state in (3.75) is n + N + 1 instead of n + N as in (3.70), because $u((k-1)T_s)$ has to be stored for one extra sample [16].

3.4.8 Dynamic State Weights

In the introduction to MPC (see section 3.4.1), the weight matrices $Q_1$, $Q_2$ and $Q_3$ were constant along the horizon. Sometimes it may be desirable to change the cost of a deviation from the reference or of the usage of an actuator. If the penalty is chosen low at the beginning of the horizon and is then continuously increased along the horizon, a smoother settling may be achieved. This idea has been tested, but was later replaced by the use of a terminal state weight, which can be said to be a special case of dynamic state weights. See section 3.4.9.

3.4.9 Terminal State Weight

The basic desire is to calculate the optimal control signal by minimising a design criterion similar to the one given by (3.17). This modified formula can be expressed as

$$\sum_{j=0}^{\infty} \left[ e(k+j+1)^T Q_1 e(k+j+1) + u(k+j)^T Q_2 u(k+j) \right] \tag{3.77}$$

where e is defined as

$$e(k) = r(k) - y(k)$$

In that way, the optimal trajectory for the states would be achieved. This is however only possible if no constraints are used, and the solution is then obtained as an ordinary LQ-controller. Otherwise, i.e. if constraints are included, the solution to the optimisation problem must be calculated numerically, and that is only possible if a finite horizon is used (see section 3.4.1).
The finite horizon means that the optimal trajectory is not obtained. The reason for this is illustrated in figure 3.7 below.

Figure 3.7: The left figure shows the predicted trajectory when an infinite horizon is used and the right figure shows the predicted trajectory when a finite horizon is used instead.

In figure 3.7 the predicted trajectories, calculated at t = 0 and t = $T_s$, have been plotted for a system that is perfectly known and with no disturbances present. In the left part of the figure, an infinite horizon has been used and, as can be seen, the trajectories follow each other perfectly. In the right part, a finite horizon with length $H_p$ has been used. The trajectories are then somewhat different. The reason is that new information, i.e. the point that lies beyond the prediction horizon, is added in each sample time. A way to solve this would be to include the tail of (3.77)

$$\sum_{j=H_p}^{\infty} \left[ e(k+j+1)^T Q_1 e(k+j+1) + u(k+j)^T Q_2 u(k+j) \right] \tag{3.78}$$

in the optimisation problem, despite the fact that the optimisation occurs over a finite horizon [8], [9]. This can be done by introducing a so-called terminal state weight, $Q_{1,terminal}$. It means that an extra state, namely the state after the end of the prediction horizon, is added to the cost function and is penalised by $Q_{1,terminal}$ according to the equation below.

$$\min_u \sum_{j=0}^{H_p - 1} \left[ e(k+j+1)^T Q_1 e(k+j+1) + u(k+j)^T Q_2 u(k+j) \right] + e(k+H_p+1)^T Q_{1,terminal} \, e(k+H_p+1) \tag{3.79}$$

To add this to the regular MPC-framework, it is only necessary to include the terminal state weight in $\tilde{Q}_1$, which will then be

$$\tilde{Q}_{1,new} = \begin{bmatrix} \tilde{Q}_1 & 0 \\ 0 & Q_{1,terminal} \end{bmatrix} \tag{3.80}$$

and to add an extra prediction of y, i.e.

$$Y_{new} = \begin{bmatrix} Y \\ y(k+H_p+1) \end{bmatrix} \tag{3.81}$$

The choice of $Q_{1,terminal}$ can be made in several ways. One is simply to choose an arbitrary, fairly large value.
Two other, more sophisticated, methods can be derived if certain assumptions are made about the control signals and the system after the end of the prediction horizon.

In the first method, the system is assumed stable and the control signal u is assumed equal to zero in the tail. Then $Q_{1,terminal}$ is obtained by solving the Lyapunov equation [9]

$$A^T Q_{1,terminal} A + M^T Q_1 M = Q_{1,terminal} \tag{3.82}$$

In Matlab this can be done by using the command dlyap.

In the second method, it is assumed that neither the states nor the control signals hit any constraints. The optimal state feedback, $u(k) = -Lx(k)$, is then obtained as an LQ-controller, where L is given by the time-discrete Riccati equation

$$\begin{aligned} S &= A^T S A + M^T Q_1 M - A^T S B \left( B^T S B + Q_2 \right)^{-1} B^T S A \\ L &= \left( B^T S B + Q_2 \right)^{-1} B^T S A \end{aligned} \tag{3.83}$$

In this case $Q_{1,terminal}$ is obtained as S in the equation above [8]. To solve the Riccati equation in Matlab the command dlqr can be used.

More information about the first method can be found in [9] and about the second method in [8].

3.4.10 Using MPC through an Already Existing Controller

In this thesis, tests have been performed using MPC through an already existing controller. If this solution is to be used, the system known to the MPC-controller has to include the controller. This means that the included controller has to be linear, except for possible internal constraints, to be used in the MPC-framework. Using an already existing controller might in reality be hard if it contains logic or non-linearities.

If a second controller is used between the MPC-controller and the actuator, the MPC-controller does not necessarily need to handle all disturbances. This means that the performance demands on the MPC-controller in the outer loop decrease. For example, if it controls the speed of the vehicle through a cruise controller, it does not have to compensate for road slope, wind drag and so on as much as if it controlled the engine directly.
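Returning to the first method for choosing the terminal state weight in section 3.4.9, the Lyapunov equation (3.82) can also be solved without Matlab for small systems. The Python sketch below uses a plain fixed-point iteration, which is a substitute technique chosen here for illustration and not what dlyap does internally. The system matrices are made-up assumptions, and the iteration only converges when A is stable.

```python
# Minimal sketch: terminal weight from A'QA + M'Q1M = Q by fixed-point iteration.
# The matrices below are illustrative assumptions, not thesis data.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def terminal_weight(A, M, Q1, iterations=500):
    """Iterate Q <- A'QA + M'Q1M, which converges when A is stable."""
    W = matmul(matmul(transpose(M), Q1), M)
    Q = [row[:] for row in W]
    At = transpose(A)
    for _ in range(iterations):
        AQA = matmul(matmul(At, Q), A)
        Q = [[AQA[i][j] + W[i][j] for j in range(len(W))] for i in range(len(W))]
    return Q

A = [[0.5, 0.1], [0.0, 0.8]]   # stable: eigenvalues 0.5 and 0.8
M = [[1.0, 0.0]]               # output is the first state
Q1 = [[1.0]]
Q_terminal = terminal_weight(A, M, Q1)
```

For a scalar system with A = a, M = 1 and weight q, the same iteration reproduces the closed-form value q / (1 - a^2), which gives a simple sanity check.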
3.4.11 Avoiding Simultaneous Use of Control Signals

One way to introduce integral action in an MPC-controller is to penalise $\Delta u(k) = u(k) - u(k-1)$ (see section 3.4.1). In that way, u can have some constant value to obtain $y(k) = r(k)$ and will still not be penalised, since the difference will equal zero. If several control signals are used, as in this thesis, it can sometimes be desirable to avoid simultaneous use of them. Unfortunately, problems occur when combining the ∆u-penalty with the desire to avoid simultaneous use of the control signals. The reason is that, as mentioned above, it does not cost anything in the optimisation to keep the signals non-zero simultaneously as long as they are constant.

The avoidance of simultaneous control signal use is equivalent to a condition of the type "if something is active, something else must be inactive". Conditions like these are in general hard to fit into the standard MPC-framework, since they often result in non-convex optimisation problems.

One solution, which at least decreases the simultaneous use of several control signals, is to penalise the cross-terms of the control signals, i.e. $u_i(k) \cdot u_j(k)$. The optimisation algorithm will then find it favourable to keep one control signal equal to zero. If the design criterion is

$$\sum_{j=0}^{H_p - 1} \left\| r(k+j+1) - y(k+j+1) \right\|_{Q_1}^2 + \left\| u(k+j) \right\|_{Q_2}^2 + \left\| u(k+j) - u(k+j-1) \right\|_{Q_3}^2 \tag{3.84}$$

and for simplicity only two control signals are used, the penalisation of $u_1(k) \cdot u_2(k)$ can easily be done by using the weight matrices

$$Q_2 = \begin{bmatrix} 0 & Q_{2,12} \\ Q_{2,21} & 0 \end{bmatrix}, \qquad Q_3 = \begin{bmatrix} Q_{3,11} & 0 \\ 0 & Q_{3,22} \end{bmatrix} \tag{3.85}$$

where $Q_{2,12}$, $Q_{2,21}$, $Q_{3,11}$ and $Q_{3,22}$ are constants. This method can of course also be generalised to handle more control signals. Unfortunately, this method will not solve the problem completely, because if $Q_{2,12}$ or $Q_{2,21}$ is chosen too large the optimisation problem (3.30) might become non-convex.
A test that therefore has to be done is to check whether all eigenvalues of H (defined in (3.30)) are non-negative, because then the problem is still convex.

Another method, which can guarantee that no more than one control signal is used at a time, can be derived if binary variables are introduced. This type of variable is also referred to as a {0,1}-variable. The conditions on the control signals can then be formulated as constraints instead. In the case when only two control signals are used, the constraints can be written as

$$\begin{aligned} u_1(k) &\le c_{1,1} \cdot t_1 \\ u_2(k) &\le c_{2,1} \cdot (1 - t_1) \\ u_1(k+1) &\le c_{1,2} \cdot t_2 \\ u_2(k+1) &\le c_{2,2} \cdot (1 - t_2) \\ &\;\;\vdots \\ u_1(k+H_u-1) &\le c_{1,H_u-1} \cdot t_{H_u-1} \\ u_2(k+H_u-1) &\le c_{2,H_u-1} \cdot (1 - t_{H_u-1}) \\ u_1(k+i) &\ge 0, \quad i = 0, \ldots, H_u - 1 \\ u_2(k+i) &\ge 0, \quad i = 0, \ldots, H_u - 1 \end{aligned} \tag{3.86}$$

where $c_{1,i}$ and $c_{2,i}$ are constants and $t_i \in \{0, 1\}$.

In (3.86) it can be seen that, for example, $t_1 = 1$ implies that only $u_1$ is allowed to be non-zero

$$\begin{aligned} &0 \le u_1(k) \le c_{1,1} \\ &0 \le u_2(k) \le 0 \;\Rightarrow\; u_2(k) = 0 \end{aligned} \tag{3.87}$$

The problem is therefore solved. The drawback is that, since the optimisation problem contains binary variables, it is no longer convex. These problems are called Mixed Integer Quadratic Programming problems, abbreviated MIQP-problems. They are solved with an algorithm based on tree-searching, and a global optimum can be obtained reliably [9]. The computational complexity is however much larger, because a standard QP-problem has to be solved for each searched branch.

4 System Modelling

4.1 Truck

4.1.1 Vehicle

The vehicle model has been implemented in Matlab's Simulink environment. It contains some of the basic characteristics of the truck. Only longitudinal dynamics have been taken into account and all effects from side forces have been neglected. The external longitudinal forces that have been accounted for are tyre friction, wind drag and road slope. To model the longitudinal force between the tyre and the road, the slip has been calculated.
All wheel axles and axles in the driveline have been modelled as completely stiff but with some inertia. Because of the assumption that all the axles are stiff, driveline oscillations will not occur. The main facts used in the models in this section have been taken from previous thesis works [3] and [4].

Air Resistance

The air resistance has been modelled as proposed in [4]. The formula is then given by

$$F_{ar} = \frac{1}{2} c \rho A v^2 \tag{4.1}$$

where
$F_{ar}$ – air resistance
c – an aerodynamic constant
ρ – air density
A – frontal area of the truck
v – truck velocity

The Contact between Tyre and Road

The tyre normally does not have a peripheral speed exactly corresponding to the velocity of the vehicle. This is called slip. The force transmitted in the contact area between the tyre and the road is proportional to the slip. This also implies that the only time when the wheel does not slip at all is when no force is transmitted through the tyre-road contact area. The slip is defined as

$$\sigma = \frac{\omega_w r - v}{v} \tag{4.2}$$

where
$\omega_w$ – angular speed of wheel
v – vehicle speed
r – wheel radius

For small slip values the tyre force is almost linear in the slip. The longitudinal tyre force $F_x$ may then be calculated according to

$$F_x = a_1 \sigma \tag{4.3}$$

where $a_1$ is the proportionality constant from slip to tyre force.

Rolling Resistance

The expression for the rolling resistance is taken directly from [4] and is given by

$$\begin{aligned} C_{rr} &= C_{rr,iso} + C_a \left( (3.6v)^2 - 80^2 \right) + C_b \cdot (3.6v - 80) \\ F_{rr} &= C_{rr} \cdot \frac{F_z}{1000} \, \mathrm{sign}(\omega_w) \end{aligned} \tag{4.4}$$

where

$$F_z = mg \cos(\alpha) \tag{4.5}$$

and
$C_{rr}$, $C_{rr,iso}$, $C_a$ and $C_b$ – rolling resistance coefficients
$F_{rr}$ – rolling resistance force
α – road slope
v – truck speed

Gravitation Force

The road slope is one of the most important external environmental properties that affect the truck. When the slope is positive, a force down the hill appears, which brakes the truck. In the same manner, when the slope is negative a force accelerates the truck down the hill.
This is easily expressed as follows:

$$F_{slope} = -mg \sin(\alpha) \tag{4.6}$$

where the sign is chosen in such a way that an uphill road generates a negative force.

Front and Rear Axle Torque Balance Equations

The rear axle torque balance equation is a first order differential equation that describes how the engine torque, retarder torque and load forces from the tyres affect the angular speed of the axle. The engine torque is first transmitted through the gearbox, where it is multiplied by the gear ratio. After the gearbox, the retarder is connected to the propeller shaft. Finally, the sum of the engine torque and the retarder torque is transmitted through the final gear before reaching the rear axle. From the other side, longitudinal tyre forces are transmitted to the tyre during the slip process. As described above, the force transmitted in the tyre-road contact area is proportional to the difference between the peripheral velocity of the tyre and the velocity of the road. In the wheel, the force is transformed into torque. This gives the following balance equation:

$$\dot{\omega}_{w,rear} = \frac{1}{J_{w,rear} + n_f^2 n_g^2 (J_e + J_g)} \cdot \left( n_g n_f (T_{ec} - T_{ef}) - n_f T_{ret} - 2 \cdot T_{wb} - 2 a_1 r \left( \frac{\omega_{w,rear} \, r}{v} - 1 \right) \right) \tag{4.7}$$

where
$T_{ec}$ – engine torque
$T_{ef}$ – engine friction
$T_{ret}$ – retarder torque
$T_{wb}$ – wheel brake torque on each wheel (it is here assumed that each tyre brakes with exactly the same torque)
$n_g$ – gear ratio
$n_f$ – final gear ratio
r – wheel radius
$J_{w,rear}$ – moment of inertia of the rear axle including two wheels

In the same manner, the corresponding balance equation for the front axle is given by:

$$\dot{\omega}_{w,front} = \frac{2}{J_{w,front}} \cdot \left( a_1 r \cdot \left( 1 - \frac{\omega_{w,front} \, r}{v} \right) - T_{wb} \right) \tag{4.8}$$

The variables are named in accordance with the list above.

Truck Dynamics

The equations described above are used in the calculation of the longitudinal forces from the tyres.
As mentioned earlier, these forces are proportional to the slip, which is calculated from the difference between the wheel peripheral velocity and the actual longitudinal velocity. When the forces from the air resistance, the rolling resistance, the road slope and the axles are summed together, the truck acceleration can be calculated according to Newton's second law. The mathematical formulation can then be written as:

$$\dot{v} = \frac{2 a_1}{m} \cdot \left( \frac{\omega_{w,rear} + \omega_{w,front}}{v} \cdot r - 2 \right) - \frac{g \cos(\alpha)}{1000} \left( C_{rr,iso} + C_a \left( (3.6v)^2 - 80^2 \right) + C_b (3.6v - 80) \right) - \frac{1}{2} \cdot \frac{c \rho A}{m} \cdot v^2 - g \sin(\alpha) \tag{4.9}$$

4.1.2 Engine

The engine is modelled by using a map. A map is a table of values that can be addressed by some variables or indexes. Very commonly, the table originates from an experiment where the input variables have been varied over a certain range of values while the output variables of interest have been measured and stored in the corresponding table positions. If the map is supposed to work as an ordinary function, it is expected to return values not only at the original discrete measured points, but also at every point between these. This can be accomplished by interpolation between the original measured points. Usually linear interpolation is used, which means that a value between two measured values is assumed to lie on a straight line between them.

In this thesis, an engine map from the variables engine speed and injected fuel amount per stroke to the output engine torque has been used. Because the engine map originates from an experimental measurement, the engine friction is already included in the output torque. This means that the output torque is negative when the injected fuel amount is zero or close to zero. The engine map is almost linear in the variable injected fuel amount. At normal and low injected amounts of fuel, the map can also be considered to be linear in the engine speed.
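A table look-up with linear interpolation in two input variables, as described above for the engine map, can be sketched in Python as follows. The grid points and torque values are invented for illustration and are not measured engine data, and the function name is hypothetical. Outside the grid the sketch extrapolates linearly from the nearest cell.

```python
# Minimal sketch: bilinear interpolation in a two-dimensional map.
# Grid and table values are made-up assumptions, not measured engine data.
from bisect import bisect_right

def interp_map(x_grid, y_grid, table, x, y):
    """Interpolate table[i][j], given at (x_grid[i], y_grid[j]), at the point (x, y)."""
    # Index of the grid cell containing the point, clamped to the table range.
    i = min(max(bisect_right(x_grid, x) - 1, 0), len(x_grid) - 2)
    j = min(max(bisect_right(y_grid, y) - 1, 0), len(y_grid) - 2)
    tx = (x - x_grid[i]) / (x_grid[i + 1] - x_grid[i])
    ty = (y - y_grid[j]) / (y_grid[j + 1] - y_grid[j])
    low = table[i][j] * (1 - ty) + table[i][j + 1] * ty
    high = table[i + 1][j] * (1 - ty) + table[i + 1][j + 1] * ty
    return low * (1 - tx) + high * tx

speed_grid = [1000.0, 1500.0, 2000.0]          # engine speed, assumed unit rpm
fuel_grid = [0.0, 100.0, 200.0]                # injected fuel per stroke, assumed unit mg
torque_table = [[-100.0, 900.0, 1900.0],       # torque is negative at zero fuel,
                [-120.0, 880.0, 1880.0],       # since engine friction is included
                [-150.0, 850.0, 1850.0]]

torque = interp_map(speed_grid, fuel_grid, torque_table, 1250.0, 50.0)
```

At a measured grid point the function returns the stored table value, and between grid points it returns a weighted mean of the four surrounding values.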
It is therefore a reasonable simplification in the controller design to use a linear approximation of this map. The linearisation of the engine map may then be written as the linear function

$$T_{engine,effective}(t) = c_{fm,1} \, m_{fi,desired}(t) + c_{fm,2} \, N_{engine}(t) + c_{fm,3} \tag{4.10}$$

where
$c_{fm,1}$, $c_{fm,2}$ and $c_{fm,3}$ – constants in the linear approximation, chosen by the least squares method from measured engine data
$T_{engine,effective}(t)$ – effective engine torque, i.e. engine friction is included
$m_{fi,desired}(t)$ – desired amount of fuel injected per stroke
$N_{engine}(t)$ – engine speed

This model has been used in the observer in the MPC-part of this thesis.

After the engine, the torque is transferred through the clutch. In our model, it is not necessary to model the clutch because the ACC is not active during a gear change. When the clutch is engaged it can be treated as an ordinary axle, with no losses.

After the clutch, the rotational energy is transferred through the gearbox. Ideally, no energy is lost in the gearbox. The purpose of the component is to convert the "level" of the energy. The same power can be transferred at different torques and angular velocities as long as the product between them is constant. This means that an increase in torque has to be compensated by a decrease in speed if the power is to be constant. The need for this conversion arises mainly because a combustion engine only has a short operating interval where the maximum torque and maximum efficiency are reached. Of course, the conversion is also done for sound level and durability reasons.
With the power conservation in mind, it is natural to write down the formulas for the gearbox as follows:

$$T_{out} = \frac{T_{in}}{n_g}, \qquad \omega_{out} = n_g\,\omega_{in}$$

$$P_{out} = \omega_{out} T_{out} = n_g\,\omega_{in}\cdot\frac{T_{in}}{n_g} = \omega_{in} T_{in} = P_{in} \quad \text{(energy is conserved)} \tag{4.11}$$

where
$T_{in}$, $T_{out}$ – input and output torque
$n_g$ – gear ratio
$\omega_{in}$, $\omega_{out}$ – input and output axle speed

After the gearbox the power is transmitted through the propeller shaft to the final gear, which is the last link to the rear axle. The final gear is modelled in accordance with the gearbox model as:

$$T_{out} = \frac{T_{in}}{n_f}, \qquad \omega_{out} = n_f\,\omega_{in} \tag{4.12}$$

where $n_f$ is the fixed final gear ratio.

4.1.3 Wheel Brakes

The model that has been used is exactly the one used for control in [5]. It is a so-called three-parameter model, i.e. a first-order model with a delay. In [5], the parameters were given with uncertainty intervals. In this thesis the parameters have been set to the values in the middle of the intervals. The model then looks like

$$T_{EBS} = \frac{1}{1 + 0.06s}\, e^{-0.05s}\, T_{ebs,desired} \tag{4.13}$$

where
$T_{EBS}$ – braking torque
$T_{ebs,desired}$ – desired braking torque

However, as was said in section 2.4, the control signal is a requested retardation. The special controller used in the EBS system to effectuate this retardation is quite hard to implement in Simulink. The reason is that it has several special properties, such as the load dependence, the correction of the κ-value and the ABS-function, that are hard to model. Therefore a controller where these properties have been excluded was designed. It is an ordinary PI-controller, with the possibility to reset the output and the integrator part when the controller is inactive.
The control law is described by the expressions below

$$T_{ebs,desired}(t) = \left(K_p\left(r_{desired}(t) - r_{measured}(t)\right) + K_i I(t)\right)\cdot enable \tag{4.14}$$

where

$$I(t) = enable\cdot\int_{t_{enable}}^{t}\left(r_{desired}(\tau) - r_{measured}(\tau)\right)d\tau$$

and
enable – a signal that is set equal to one to enable the controller
$t_{enable}$ – the time when the controller was last enabled

4.1.4 Retarder

The retarder dynamics is fairly slow, due to limitations in the oil flow. Because of this, a static model is not good enough and a dynamic model is therefore needed. Measured data from a test of a retarder show that a number of overshoots appear when a step is applied to the input. This indicates that the transfer function for the retarder is of higher order than one. The model that has been used in this thesis is a modified version of the model used in [5], which can be seen in (4.15). The control signal sent via the CAN bus is the requested torque, $T_{ret,desired}$, and the output signal is the delivered torque, $T_{ret}$.

$$\dot{Q}_{ret,unsat}(t) = c_1\left(T_{ret,desired}(t - \tau_d) - Q_{ret,unsat}(t) - T_{ret}(t)\right)$$

$$\dot{T}_{ret}(t) = c_2\cdot\begin{cases} Q_{ret,max} & \text{if } Q_{ret,unsat}(t) > Q_{ret,max} \\ Q_{ret,min} & \text{if } Q_{ret,unsat}(t) < Q_{ret,min} \\ Q_{ret,unsat}(t) & \text{otherwise} \end{cases} \tag{4.15}$$

where
$Q_{ret,unsat}$ – the oil flow to the torus, before it is saturated
$\tau_d$ – retarder time delay

The constants $c_1$ and $c_2$, which in [5] were given as intervals, have in this thesis been set to the values in the middle of these intervals. An important remark is that the delivered torque is not controlled in closed loop, so if there are any disturbances present there is no guarantee that the delivered torque is equal to the requested torque. The model is a second-order model, with a saturation of the oil flow. The problem with this model was that the saturation limits and the time delay were modelled as constants. When looking at measured data, it can be seen that these depend on changes in the amplitude of the input, $T_{ret,desired}$.
A larger change gives a larger time delay and a steeper change of the retarder torque. The time delay, $\tau_d$, is therefore modelled by a static function according to

$$\tau_d = f_1\left(\Delta T_{ret,desired}\right) \tag{4.16}$$

where

$$\Delta T_{ret,desired} = T_{ret,desired}(n\cdot T_s) - T_{ret,desired}((n - 1)\cdot T_s)$$

The function $f_1$, which can be seen in figure 4.1, was derived through testing and comparing to the real output.

Figure 4.1: Time delay as a function of the step size at the input. (Axes: $\Delta T_{ret,desired}$ [Nm], time delay [s].)

Because of the steeper changes in retarder torque when $\Delta T_{ret,desired}$ is large, the oil flow saturation has to increase. The oil flow saturation limits were derived in the same way as the time delay, i.e. by testing and comparing. The upper limit, $Q_{ret,max}$, and the lower limit, $Q_{ret,min}$, are described by the functions $g_1$ and $g_2$ respectively according to

$$Q_{ret,max} = g_1\left(\Delta T_{ret,desired}\right), \qquad Q_{ret,min} = g_2\left(\Delta T_{ret,desired}\right) \tag{4.17}$$

where $g_1$ can be found in figure 4.2 and $g_2$ in figure 4.3.

Figure 4.2: Upper oil flow limit as a function of the step size at the input. (Axes: $\Delta T_{ret,desired}$ [Nm], $Q_{ret,max}$.)

Figure 4.3: Lower oil flow limit as a function of the step size at the input. (Axes: $\Delta T_{ret,desired}$ [Nm], $Q_{ret,min}$.)

Another property, which has not been modelled in [5], is that the delivered torque is limited both by physical limitations in the retarder and by the cooling system of the truck. The maximum delivered torque from the retarder is 3000 Nm. This will be the limiting factor for low propeller shaft speeds. For high propeller shaft speeds, the generated power will instead limit the delivered torque.
The power can be calculated by the equation below

$$P = T_{ret}\cdot\omega = T_{ret}\cdot\frac{2\pi N}{60} \tag{4.18}$$

where
$N$ – propeller shaft speed
$T_{ret}$ – braking torque that the retarder produces

From the equation above, the maximal torque can be calculated as

$$T_{ret,max} = \frac{60\cdot P_{max}}{2\pi N} \tag{4.19}$$

where
$P_{max}$ – maximal power that can be dissipated by the cooling system

If the model described in [5] is expressed in state-space form, together with the additions that have been made in this thesis, the following equations are obtained

$$\dot{Q}_{ret,unsat}(t) = c_1\left(T_{ret,desired}(t - \tau) - Q_{ret,unsat}(t) - T_{ret,unsat}(t)\right)$$

$$\dot{T}_{ret,unsat}(t) = c_2\cdot\begin{cases} Q_{ret,max} & \text{if } Q_{ret,unsat}(t) > Q_{ret,max} \\ Q_{ret,min} & \text{if } Q_{ret,unsat}(t) < Q_{ret,min} \\ Q_{ret,unsat}(t) & \text{otherwise} \end{cases}$$

$$T_{ret}(t) = \begin{cases} 3000 & \text{if } T_{ret,unsat}(t) > 3000 \\ 0 & \text{if } T_{ret,unsat}(t) < 0 \\ T_{ret,max} & \text{if } T_{ret,max} < T_{ret,unsat}(t) < 3000 \\ T_{ret,unsat}(t) & \text{otherwise} \end{cases} \tag{4.20}$$

where $T_{ret,unsat}$ is the unsaturated retarder torque.

In this thesis the maximum torque that the retarder can produce has been modelled in accordance with the power limitation mentioned earlier. This limitation, $P_{max}$, was set constant. An improvement could be to let this maximum limit be temperature dependent, because if the temperature becomes too high the maximum torque is reduced (see section 2.5).

The variable time delay, τ, caused some problems when τ was varied. This can be illustrated by the following example. Assume that a ramp with a slope equal to one is sent through a variable time delay. If τ is not changed, the output will be identical to the input, except for the time delay. This case can be seen in figure 4.4 for $t \in [0, 5]$. However, if the length of the time delay is changed, the slope will change. For a unit ramp the output is $y(t) = t - \tau(t)$, so its slope is $1 - \dot{\tau}(t)$. A break point is when τ is increased by one second each second. Then the time delay increases at the same rate as the time. The output will then be constant, since no new values are output.
If τ is increased by more than one second per second, values that have already been output must be withdrawn again. This can be seen in figure 4.4 for $t \in [5, 7]$. How negative the slope will be depends on how fast τ is changed. If τ is increased by less than one second per second, the slope of the output will be less than that of the original signal, but not zero.

Figure 4.4: Output from a variable time delay when the length of the delay is changed, if the input is a ramp with slope one. (Axes: Time [s], signal value.)

The conclusions are summarised in table 4.1.

Table 4.1: Slope of the output, compared to the original signal, depending on how fast the time delay is changed.

Change rate [s/s]    Slope, compared to the original signal
<0                   Larger
0                    Equal
0-1                  Between equal and 0
1                    0
>1                   Smaller

4.2 Driving Environment

4.2.1 Road Description

To be able to simulate how the road affects the truck, a road model was implemented. The road parameters most important to the behaviour of the ACC-system are the road slope and the road curvature. The road slope affects the actuator choice and acts as external low-frequency noise. The road curvature has no effect, in our longitudinal model, on the forces on the truck. However, high road curvature can make the radar lose its target. To simulate this event, a lower limit on the road radius has been set. Below this limit, the radar model will simulate a loss of its target and report that the distance to the target equals zero. The model has been implemented in such a way that when the road conditions are to be changed, a new road slope and a new road curvature are specified together with the time when the change shall appear. When the road model was to be implemented, it was necessary to choose whether the road conditions should appear at different times or at different road positions. The latter alternative, i.e.
to switch at different road positions, is more natural because it is how it works in reality. In that case, the distance between two adjacent road events is independent of the truck speed between these. If the road conditions instead are switched at specified time events, the distance between two adjacent road events changes with speed. Despite all this, the choice was made to make the road conditions change at different times instead of different positions. This is because it is easier to compare different controllers if the conditions change at exactly the same time. The time it takes for the controller to settle the truck at the new conditions can more easily be compared in this way. Another factor, which makes it preferable to switch road conditions at a specific time, is that it is easier to implement switches at specific times than at specific distances in Matlab.

4.2.2 Traffic Situation

The traffic situation is modelled in the same way as the road condition. The traffic situation is the behaviour of the vehicle ahead of the truck. As with road conditions, different events in the traffic situation occur at different times instead of different positions. At the desired time instants, the initial distance to the vehicle ahead and its velocity are given to the model. A new vehicle is introduced by changing the initial distance. The radar unit then starts to integrate the distance from the calculated relative velocity between the truck and the vehicle ahead. The speed of the vehicle ahead can be changed by sending new traffic events with new speed requests, without changing the initial distance. By changing the initial distance, the position integrator is reset and a new target appears at the new initial distance. If there is no target ahead, the distance zero is sent to the ACC logic.
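The event-driven traffic model described in section 4.2.2 can be illustrated with a minimal sketch. It is not the Simulink implementation used in the thesis: the function name, the layout of the event tuples, the constant truck speed and the forward Euler integration are all assumptions made for illustration only.

```python
def simulate_lead_distance(events, truck_speed, ts, t_end):
    """Return the distance reported by a simplified radar model, one value per sample.

    events      -- list of (time [s], initial_distance [m] or None, lead_speed [m/s]),
                   sorted by time (hypothetical layout). A distance of None keeps the
                   current target and only changes its speed.
    truck_speed -- truck speed [m/s], held constant here for simplicity (assumption)
    ts          -- sampling time [s]
    t_end       -- simulation length [s]
    """
    distance = 0.0
    lead_speed = 0.0
    has_target = False
    pending = list(events)
    reported = []
    for k in range(int(round(t_end / ts))):
        t = k * ts
        # Apply every traffic event whose time has been reached.
        while pending and pending[0][0] <= t + 1e-9:
            _, initial_distance, speed = pending.pop(0)
            lead_speed = speed
            if initial_distance is not None:
                # A new initial distance resets the position integrator.
                distance = initial_distance
                has_target = True
        # Distance zero is reported while there is no target.
        reported.append(distance if has_target else 0.0)
        if has_target:
            # Integrate the relative velocity (forward Euler, an assumption).
            distance += (lead_speed - truck_speed) * ts
    return reported


# Usage: a vehicle appears 50 m ahead at t = 1 s and speeds up at t = 3 s.
distances = simulate_lead_distance(
    [(1.0, 50.0, 20.0), (3.0, None, 25.0)], truck_speed=22.0, ts=0.5, t_end=5.0
)
```

Because the events are tied to time rather than position, two controllers simulated with the same event list see the changes at exactly the same instants, which is the property the text relies on when comparing controllers.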
5 State Machine Combined with Traditional Control

5.1 Introduction

Controlling the truck requires a controller with different discrete states, in which it has to act differently. The choice of states can be made in several different ways. One solution is to let the discrete states match the different actuators, i.e. one state corresponds to the engine and another to the retarder. In this thesis, the concept was extended with another, higher discrete state level. The discrete states contained in the high level correspond to different driving situations (see figure 5.1). All these states include lower level states corresponding to the different actuators.

Figure 5.1: Overview showing the highest logic level, i.e. the level where the different discrete states correspond to different driving situations. (States shown: No lead vehicle, Lead vehicle present, Normal operation, Following lead vehicle, Lost target, Urgent operation.)

The first step in the controller design was therefore to distinguish which driving situations cannot be handled by the same controller, for example a car ahead, no car ahead and curve. Due to the structure of the problem, a hybrid controller was chosen. A hybrid controller is, as written in section 3.1, a controller that has a logic part that selects which continuous controller is to be used, depending on which discrete and continuous state the controller was in before the switch. This structure gives high modularity, because if the controller should be able to act differently in some driving situation, it is enough to add another state and the associated switch conditions. The structure of the problem can be visualised as different levels, see figure 5.2. When the state is changed, bumps can easily occur if the integrator parts in the controllers switched between do not match. In the top level, where there are many different controllers, simple P-regulators are used.
In the second level, which only contains three different controllers, one for the engine, one for the retarder and one for the wheel brakes (EBS), differential PI-controllers are used. The advantage of this structure is that where many switches are performed, i.e. in the highest level, no setting of the integrator part is needed, and where there are fewer switches, the integrator part is included to eliminate static errors.

Figure 5.2: System and controller structure overview. (Levels shown: Logic; P-controllers, Trajectory; PI-controllers; Actuators.)

One further idea is based on the desire to switch between the engine and the retarder when neither of them produces any torque. If a switch is carried out at that moment, no bump will occur in the control signals. Since estimates of these torques are available, this should be fairly simple. A small problem appears when hysteresis is to be introduced. To introduce hysteresis, the switch has to occur at a small negative value, but since the torques have a lower limit equal to zero, this is impossible. This can be solved by requiring that the torques are zero for a certain time before the switch is allowed to happen. This solution causes a new problem, namely to choose a proper time, because it will depend on the surroundings, for example whether the road is downhill or uphill. In this thesis, all these problems have been solved by introducing the acceleration from the environment, $a_{environment}$, as switch criterion. This acceleration is the sum of the environmental forces acting on the truck, normalised by the mass of the truck according to the equation below

$$a_{environment} = \frac{F_{environment}}{m} = \frac{F_{slope} - F_{ar} - F_{rr}}{m} \tag{5.1}$$

where $F_{rr}$, $F_{ar}$ and $F_{slope}$ are defined by (4.4), (4.1) and (4.6) respectively. How can this be used?
Newton’s second law yields that

$$m\cdot a_{vehicle} = F_{resultant} \;\Leftrightarrow\; a_{vehicle} = \frac{F_{resultant}}{m} = a_{eng+ret+EBS} + a_{environment} \tag{5.2}$$

where

$$a_{eng+ret+EBS} = \frac{F_{eng,ret,EBS}}{m} = \frac{F_{engine} - F_{retarder} - F_{EBS}}{m} \tag{5.3}$$

and
$F_{engine}$ – the force acting on the truck produced by the engine
$F_{retarder}$ – the force acting on the truck produced by the retarder
$F_{EBS}$ – the force acting on the truck produced by the EBS

Assume that the engine and retarder do not work at the same time (simultaneous use must be prevented to minimise the fuel consumption) and that the EBS is not activated. Then it can be seen that

$$a_{vehicle} = a_{environment} \tag{5.4}$$

only when both $F_{engine}$ and $F_{retarder}$ are equal to zero. Since the original idea was to switch exactly when this happens, (5.4) is an equivalent switch criterion. There are two main advantages with using the switch criterion (5.4). First, since the control signal is chosen to be the desired acceleration of the truck, $a_{desired}$, it is easy to change the switch condition from a condition on the real acceleration to a condition on the control signal according to

$$a_{desired} = a_{environment} \tag{5.5}$$

Second, it is very easy to introduce hysteresis. For this purpose, a small negative value was added to $a_{environment}$ when switching from engine to retarder and, in the same manner, a small positive value when switching from retarder to engine. The main problem with this choice of switch criterion is that $a_{environment}$ is not a measured signal. Hence, it has to be estimated. The estimation and the related problems are discussed in section 5.2.2.

5.2 Estimations and Calculations

5.2.1 The Desired Distance to the Lead Vehicle

The calculation of the desired distance is based on the velocity of the vehicle ahead of the truck. The driver specifies a desired time gap to the lead vehicle.
This gives the distance as

$$dist_{ref} = v_{lead}\cdot t_{hw} \tag{5.6}$$

where
$dist_{ref}$ – the desired distance
$v_{lead}$ – the velocity of the lead vehicle
$t_{hw}$ – the desired time gap to the lead vehicle

Figure 5.3: The down ramping of the desired distance. (Axes: Time [s], Distance [m]; curves: old desired distance, new desired distance.)

As can be seen in (5.6), the desired distance is calculated based on the lead vehicle velocity. A drawback with this approach is that the desired distance can be set low due to a low lead vehicle velocity, although the speed of the truck is high. If this happens, the truck may brake too late. This problem is handled by an asymmetric rate limit on the desired distance. This means that for an increase in the desired distance the rate is not limited, but for a decrease it is. An example of the latter case is shown in figure 5.3. Here it can be seen that if the rate limit is hit, the shape of the curve becomes a ramp. Hence, the rate limit will temporarily extend the desired distance, which will cause the truck to brake earlier than it otherwise would have done. One further positive effect of the earlier braking is that the retardation can be smaller, which results in higher comfort.

5.2.2 The Acceleration From the Surroundings, $a_{environment}$

The acceleration that affects the truck from the surroundings (5.1) is used as switch condition in many places in the controller. The reason is that when the truck has the same acceleration as the environment exerts on it, it is time to change from engine to retarder or vice versa. Unfortunately, $a_{environment}$ is not measurable, but has to be estimated. The definition of $a_{environment}$ is

$$a_{environment} = \frac{F_{slope} - F_{ar} - F_{rr}}{m} \tag{5.7}$$

The forces that occur in (5.7) are not measured, so it is not possible to use (5.7) directly. Another way to calculate $a_{environment}$ is to use the model for the vehicle and for the wheels.
Using section 4.1.1, the following balance equation for the rear wheels can be derived

$$\dot{\omega}_{w,rear} = \frac{1}{J_{w,rear} + n_f^2 n_g^2 (J_e + J_g)}\cdot\left(n_g n_f (T_{ec} - T_{ef}) - n_f T_{ret} - 2\cdot T_{wb} - 2a_1 r\left(\frac{r\,\omega_{w,rear}}{v} - 1\right)\right) \tag{5.8}$$

This can be rewritten as

$$F_{x,rear} = 2a_1\left(\frac{r\,\omega_{w,rear}}{v} - 1\right) = \frac{n_g n_f (T_{ec} - T_{ef}) - n_f T_{ret} - 2\cdot T_{wb} - \left(J_{w,rear} + n_f^2 n_g^2 (J_e + J_g)\right)\dot{\omega}_{w,rear}}{r} \tag{5.9}$$

where
$F_{x,rear}$ – longitudinal tyre force from both of the rear wheels

The equations above only cover the rear wheels. The front wheels are not connected to the propeller shaft, which implies that the only torque acting on them is the one generated by the tyre force, under the assumption that the wheel brakes are not applied. This force will always work in the opposite direction to the acceleration of the truck, so if the truck has a positive acceleration this force will brake the truck. The equation for the front wheels can be seen in (5.10).

$$J_{w,front}\cdot\dot{\omega}_{w,front} = -F_{x,front}\cdot r \;\Leftrightarrow\; F_{x,front} = -\frac{J_{w,front}\cdot\dot{\omega}_{w,front}}{r} \tag{5.10}$$

where
$F_{x,front}$ – longitudinal tyre force from both of the front wheels

To simplify somewhat, it will be assumed that the front wheels are rotating with the same speed as the rear wheels, thus neglecting the wheel slip. The expression for the total force acting on the truck from both the rear and the front wheels may then be written as

$$F_{x,tot} = F_{x,front} + F_{x,rear} = \frac{n_g n_f (T_{ec} - T_{ef}) - n_f T_{ret} - 2\cdot T_{wb} - \left(J_{w,front} + J_{w,rear} + n_f^2 n_g^2 (J_e + J_g)\right)\dot{\omega}_{w,rear}}{r} \tag{5.11}$$

If Newton’s second law is applied to the truck, the resulting equation becomes

$$a_{vehicle} = \frac{F_{resultant}}{m} = \frac{F_{x,tot}}{m} + a_{environment} \tag{5.12}$$

If (5.12) is combined with (5.11), it holds that

$$a_{environment} = a_{vehicle} - \frac{n_g n_f (T_{ec} - T_{ef}) - n_f T_{ret} - 2\cdot T_{wb} - \left(J_{w,front} + J_{w,rear} + n_f^2 n_g^2 (J_e + J_g)\right)\dot{\omega}_{w,rear}}{m r} \tag{5.13}$$

Yet some problems remain. One of these is that the mass of the truck, m, is unknown.
The mass must therefore be estimated in some way. In this thesis it has been assumed that this is done somewhere else, for instance in the gearbox control unit, and hence m can be seen as a known variable for the adaptive cruise controller. Another problem is that the angular acceleration of the rear wheels, $\dot{\omega}_{w,rear}$, is not measured. Instead the angular speed, $\omega_{w,rear}$, is measured by the EBS-system. Thus, $\dot{\omega}_{w,rear}$ has to be calculated through numerical differentiation. If the numerical differentiation is performed using the Euler-backward method, the result is

$$\dot{\omega}_{w,rear}(t) = \frac{\omega_{w,rear}(t) - \omega_{w,rear}(t - T_s)}{T_s} \tag{5.14}$$

where
$T_s$ – sampling time

Using this method directly, without any correction, is not useful. The reason is that the differentiation (5.14) has very high gain for high frequencies. As a result, the derivative of the high-frequency noise may drown the real signal. Further low-pass filtering of the measured, and already anti-alias filtered, signal is therefore needed. If a first-order low-pass filter is used, the expression for the filtered version of $\omega_{w,rear}$ becomes

$$\omega_{w,rear,filtered}(t) = \frac{\tau\cdot\omega_{w,rear,filtered}(t - T_s) + T_s\cdot\omega_{w,rear}(t)}{T_s + \tau} \tag{5.15}$$

where
τ – time constant of the filter

By combining (5.14) and (5.15), the expression for $\dot{\omega}_{w,rear}$ finally becomes

$$\dot{\omega}_{w,rear}(t) = \frac{\tau\cdot\dot{\omega}_{w,rear}(t - T_s) + \omega_{w,rear}(t) - \omega_{w,rear}(t - T_s)}{T_s + \tau} \tag{5.16}$$

The time constant, τ, has to be set to an appropriate value, so that the filter suppresses as much of the noise as possible without affecting the actual signal in the frequencies of interest. In this thesis τ was set to 0.1 s.

As can be seen in (5.13), the acceleration of the truck, $a_{vehicle}$, is also needed to be able to estimate $a_{environment}$. As for $\dot{\omega}_{w,rear}$, the acceleration of the truck is not measured, but the velocity is.
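As an illustration, the recursion (5.16) can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used in the truck: the class name and the sampling time in the usage example ($T_s$ = 0.02 s) are assumptions, while τ = 0.1 s is the value stated above.

```python
class FilteredDerivative:
    """Backward-difference derivative combined with a first-order low-pass
    filter, following the recursion in (5.16)."""

    def __init__(self, ts, tau, x0=0.0):
        self.ts = ts          # sampling time Ts [s]
        self.tau = tau        # filter time constant tau [s]
        self.prev_x = x0      # previous raw sample, x(t - Ts)
        self.prev_dx = 0.0    # previous filtered derivative

    def update(self, x):
        """Feed one new sample and return the filtered derivative estimate."""
        dx = (self.tau * self.prev_dx + x - self.prev_x) / (self.ts + self.tau)
        self.prev_x = x
        self.prev_dx = dx
        return dx


# Usage: differentiate a ramp with slope 2 (Ts = 0.02 s is a hypothetical value).
ts = 0.02
estimator = FilteredDerivative(ts=ts, tau=0.1)
estimate = 0.0
for k in range(1, 301):
    estimate = estimator.update(2.0 * k * ts)
```

For a ramp input the estimate converges to the true slope, with the previous error shrinking by the factor τ/(T_s + τ) each sample, which is the delay referred to in the discussion of peaks and dips in the estimate of $a_{environment}$.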
Using differentiation and filtering with the same τ as for $\dot{\omega}_{w,rear}$, $a_{vehicle}$ can be expressed as

$$a_{vehicle}(t) = \frac{\tau\cdot a_{vehicle}(t - T_s) + v_{vehicle}(t) - v_{vehicle}(t - T_s)}{T_s + \tau} \tag{5.17}$$

When $a_{environment}$ is estimated, peaks and dips might arise. This is because the estimate contains both filtered and unfiltered versions of signals that are physically connected to each other. For example, the filtered version of the truck acceleration is included in the estimate, together with the unfiltered versions of the torques. According to physics, the acceleration and the torques are directly connected. This means that if the torques are changed, the unfiltered version of the acceleration will also change at the same time. However, the filtered version of the acceleration will be somewhat delayed. This leads to an imbalance in (5.13) between the filtered signals and the unfiltered signals, and that is the reason for the peaks and the dips. It is now rather simple to explain whether a peak or a dip will occur when the torques are changed. Assume that expression (5.13) is divided into two parts

$$a_{environment} = a_{environment,filtered\ part} - a_{environment,unfiltered\ part} \tag{5.18}$$

where the first part includes the unfiltered signals

$$a_{environment,unfiltered\ part} = \frac{n_g n_f (T_{ec} - T_{ef}) - n_f T_{ret} - 2\cdot T_{wb}}{m r} \tag{5.19}$$

and the second part includes the filtered signals

$$a_{environment,filtered\ part} = a_{vehicle,filtered} + \frac{\left(J_{w,front} + J_{w,rear} + n_f^2 n_g^2 (J_e + J_g)\right)\dot{\omega}_{w,rear,filtered}}{m r} \tag{5.20}$$

If the first part is increased, which means for example that the engine torque is increased, it can be seen that $a_{environment}$ will be more negative before the second part closes the gap. Hence, the result will be a small dip in the estimate. If, on the other hand, the first part is decreased, a small peak will occur. To reduce the size of the peaks and dips, $a_{environment}$ can be low-pass filtered.
The final expression for $a_{environment}$ becomes

$$a_{environment,filtered}(t) = \frac{\tau\cdot a_{environment,filtered}(t - T_s) + T_s\cdot a_{environment,unfiltered}(t)}{T_s + \tau} \tag{5.21}$$

The choice of τ has to be tuned in a real case, but for simulations, τ = 1 s was chosen. An alternative would have been to use unfiltered signals and derivatives of the unfiltered data and filter only $a_{environment}$. This gives fewer degrees of freedom, which means that even if one signal has noise with some frequency and another signal has noise with another frequency, they must be filtered with the same filter. In the rest of this thesis $a_{environment,filtered}$ will be called just $a_{environment}$.

5.2.3 Curve Detection

The reason for detecting curves is that the radar sometimes loses the target. If this occurs, it is important to know the reason. One reason can be that the car ahead turned off at a road exit. In that case the car is no longer a problem and, unless there is another car ahead, the truck may resume the desired velocity. Another reason that can cause the radar to lose its target is that the car ahead enters a curve with a radius so small that it ends up outside the radar’s field of vision. Hence the car cannot be seen, but it will still be in front of the truck. In this case, it might not be appropriate to increase the velocity. To be able to separate these cases, curve detection is needed. It can be done in several ways. One is to measure the steering angle and use some sort of algorithm to estimate the road radius. A better way is to mount a yaw-rate sensor on the truck (see section 2.2.2). When the yaw-rate is known it is easy to detect curves. A first idea is to use the yaw-rate directly, which means that if the yaw-rate is larger than some value, a curve is detected.
The definition of radians gives the following expression

$$b = \alpha\cdot R \tag{5.22}$$

where
$b$ – arc length
$\alpha$ – angle
$R$ – radius of the circle

If the expression above is differentiated, it becomes

$$\dot{b} = \dot{\alpha}\cdot R \tag{5.23}$$

Here $\dot{b}$ can be recognised as the velocity, $v$, $\dot{\alpha}$ as the yaw-rate, $\Psi$, and $R$ as the road radius, which gives

$$v = \Psi\cdot R \tag{5.24}$$

A problem with using the yaw-rate directly can now be seen. The curve detection will then depend on the velocity, i.e. a large radius may give a high $\Psi$ if the velocity is large enough. A better way is to say that a curve is detected if the road radius is smaller than an appropriate limit, $R_{limit}$. An expression for $R$ is easily obtained from (5.24), but a problem may occur. The straightforward expression has a singularity when the road is straight, since the yaw-rate goes towards zero while the velocity does not. By calculating the inverse of the road radius instead, this problem can be eliminated. The final expression then becomes

$$R^{-1} = \frac{\Psi}{v} \tag{5.25}$$

A curve is now detected if

$$R^{-1} > (R_{limit})^{-1} \tag{5.26}$$

How $(R_{limit})^{-1}$ should be chosen depends on its purpose, but in this thesis it has been set to the value where the radar might lose the target ahead. Another choice could have been a smaller value, which would give a controller with a larger safety margin.

5.3 State Switch Strategies

Figure 5.4: Controller structure overview. (Blocks shown: Distance; P-Controllers (possibly a map); Trajectory; Distance controllers; No target controller; Desired acceleration; Reference Selection; Current mode; Acceleration Switch Logic; $v_{set,new} = v_{set,old} + T_s\cdot 3.6\cdot a_{desired}$; Retarder acceleration controller (PI-controller); Cruise Controller; Retarder; Engine.)

One of the main problems to be solved in this thesis was to decide when to switch between different controllers.
Switches between controllers are needed, for example, when another vehicle appears in front of the truck after a period with no other vehicles ahead. Another example is when different behaviours are desired for different relative velocities and distances to the vehicle ahead. This can often be implemented by so-called gain scheduling, where the coefficients in the controller can be changed depending on the state of the system. Finally, the most important example, which is the hardest one to solve, is the switching between engine support and brake support. There are several possible switch point candidates. The most obvious one is probably to switch when the measured speed differs from the desired speed by more than a certain limit. This strategy has both positive and negative properties. Among the benefits, the most important one is probably its simplicity. If the switch point is combined with a hysteresis, it is quite predictable and involves no uncertain estimates. The major drawback is that the switching is done when an error in speed is already present. If the hysteresis is chosen large, to reduce the possibility of excessive mode switching, the speed error accepted before a switch is performed can be several kilometres per hour. It is also possible that periodic switch patterns will occur on downhill roads. A variable that is strongly connected to the choice of actuator is the resistance from the environment. If a nonnegative acceleration is desired, and the resistance is high enough to cause the vehicle to slow down when no engine power is applied, the engine controller has to be used. If a zero or negative acceleration is desired, while the vehicle has positive acceleration if neither the brakes nor the engine is applied, the brakes have to be used. Notice that this strategy will not switch to the brake controller only because the desired acceleration
Notice that this strategy will not switch to the brake controller only because the desired acceleration 5.3 State Switch Strategies 53 85 85 80 80 75 75 v [km/h] v [km/h] is negative if this can be achieved by a decrease in the injected amount of fuel. This property is used during downhill driving when a positive acceleration may require a small brake support. If the switch point would have been a static limit like, for example, adesired = 0, the engine mode would have been entered for a while and then retarder mode again. This would probably have caused a repetitive switch pattern like the one in figure 5.5 b. In this example, the vehicle ahead accelerates with the constant acceleration 0.1 m/s2, while the slope is –4 %. The positive desired acceleration forces a mode switch in the case when a static switch position is used. In figure 5.5 a, the solution presented in this thesis is used. Due to the positive acceleration from the environment, and the low acceleration that is needed to follow the vehicle ahead, it predicts that the right choice is to choose the retarder and stay to it. As can be seen in figure 5.5 a, no unwanted mode switches occur. 70 65 70 65 60 60 55 55 50 0 20 a, 40 time [s] 60 80 50 0 20 40 time [s] 60 80 b, Figure 5.5: In figure a, the solution with the dynamic switch point is used and in b, the switch point adesired = 0 is used. With these thoughts in mind, combined with the earlier presented acceleration from the environment (see section 5.2.2), the choice was made to use a top-level controller that returns a desired acceleration. Due to this solution, it is easy to decide which actuator to be used in a particular case. The desired acceleration is compared with the current prediction of the acceleration from the environment, aenvironment (See figure 5.4). If the desired acceleration is higher than aenvironment, the engine has to be used to be able to effectuate the acceleration. 
In the same manner, the brakes have to be used if the desired acceleration is lower than $a_{environment}$. To avoid excessive switching when the desired acceleration is of the same size as $a_{environment}$, hysteresis was introduced in the switch condition. The switch condition from engine to brake then became $a_{desired} < a_{environment} - C_{hysteresis}$. In the same manner the switch condition from brake to engine became $a_{desired} > a_{environment} + C_{hysteresis}$. The constant $C_{hysteresis}$ is a design variable dependent on the noise level. To prevent single peaks in $a_{environment}$ and $a_{desired}$ from triggering a switch event, the conditions above have to be fulfilled for a certain number of samples before the switch actually occurs. The main benefit of a dynamic switch point is that the switches, ideally, occur at the right time when a change of actuator is needed, and there is no need for a velocity error to decide whether to switch or not. The major drawback with this method is that it includes a prediction of $a_{environment}$. In-truck measurements have shown that it can be very hard to estimate this variable correctly. The main problems are to get a correct estimate of the torque delivered by the actuators and to make a correct estimate of the acceleration of the truck. One important property of $a_{environment}$ when used as a switch point is that it is dynamic, i.e. the switch point between engine and brakes is adjusted according to the surrounding environment. This means for example that switching to brakes is performed at a lower retardation request on a downhill road than on a flat or uphill road. It also maximises the use of the other forces in the environment such as wind drag and rolling resistance. The goal of the switching strategy is to avoid usage of the brakes before it is really necessary and to avoid unnecessary switches. In our solution the environmental forces are automatically used first.
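The switch conditions above can be summarised in a short sketch. It is a hedged illustration and not the state machine implemented in the thesis: the class and constant names are hypothetical, and "a certain number of samples" is interpreted here as consecutive samples, which is an assumption.

```python
ENGINE, BRAKE = "engine", "brake"


class ActuatorSwitch:
    """Engine/brake mode selection with hysteresis around a_environment.

    c_hysteresis -- hysteresis constant C_hysteresis [m/s^2]
    n_samples    -- number of consecutive samples the condition must hold
                    (consecutiveness is an assumption of this sketch)
    """

    def __init__(self, c_hysteresis, n_samples, mode=ENGINE):
        self.c_hysteresis = c_hysteresis
        self.n_samples = n_samples
        self.mode = mode
        self.count = 0

    def update(self, a_desired, a_environment):
        """Process one sample and return the active mode."""
        if self.mode == ENGINE:
            # Engine -> brake: a_desired < a_environment - C_hysteresis
            wants_switch = a_desired < a_environment - self.c_hysteresis
        else:
            # Brake -> engine: a_desired > a_environment + C_hysteresis
            wants_switch = a_desired > a_environment + self.c_hysteresis
        # Single peaks reset the counter and therefore never trigger a switch.
        self.count = self.count + 1 if wants_switch else 0
        if self.count >= self.n_samples:
            self.mode = BRAKE if self.mode == ENGINE else ENGINE
            self.count = 0
        return self.mode


# Usage with hypothetical tuning values.
switch = ActuatorSwitch(c_hysteresis=0.05, n_samples=3)
mode = switch.update(a_desired=-0.3, a_environment=0.1)
```

Because the comparison is made against $a_{environment}$ rather than against zero, the same code switches to the brakes at a different requested acceleration on a downhill road than on a flat road, which is the dynamic behaviour of the switch point discussed above.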
The dynamic behaviour of the switch point is demonstrated in figure 5.6.

[Figure 5.6: three panels (downhill road, flat road, uphill road), each marking the switch point a_environment, the brake and engine regions, and a_vehicle = 0.]
Figure 5.6: This figure shows the dynamic behaviour of the switch point, a_environment.

This switch strategy is independent of the road slope, but the calculation of a_environment depends on the mass of the truck. Simulations have shown that the consequences of an incorrect estimate of the mass are not extreme (see section 7.2), but it does, of course, lower the performance of the controller.

It is worth clarifying that the primary goal of this switch strategy is not to uniquely identify uphill, downhill and flat parts of the road. In most cases the switch between engine support and brake support coincides with the switch between flat road and downhill road, but it does not necessarily have to. This means that if the method is used, the switch between flat road and uphill road does not have to be explicitly detected.

5.4 Main Controller Structure

The chosen controller structure is built upon one high-level controller, which controls the acceleration request to other low-level controllers. The controller structure is a so-called cascade structure (see figure 5.7). In this structure, a fast inner loop controls the plant and eliminates most of the influence from external disturbances, while the outer loop can concentrate on controlling the reference signal to this inner loop. The inner loop has been marked with a dashed box in figure 5.7.

[Figure 5.7: block diagram with the controllers F1 and F2, the plant G and two integrators 1/s; signals d_ref, v_lead, v_rel, d, a_desired, T_desired, T_environment, a_vehicle and v_vehicle.]
Figure 5.7: The used controller structure is a so-called cascade structure. The inner loop is marked with a dashed box.

In this case, the reference signal from the outer loop to the inner loop is an acceleration.
The choice of this quantity (instead of force or torque) for the variable passed between the outer and inner loop makes the outer loop almost insensitive to variations in the mass of the truck, if the inner loop is fast enough. The mass appears as a variable gain in G, because the acceleration of the truck due to an applied torque, T_desired, is inversely proportional to the mass according to

\[ a_{vehicle} = \frac{C \cdot T_{desired}}{m} \tag{5.27} \]

where C is a constant depending on the gear ratio and the wheel radius. The inner loop controller handles external disturbances such as road slope and wind, while the outer loop controls the distance to the vehicle ahead. The cascade structure almost eliminates the influence of the mass of the truck and of external disturbances on the outer loop. In reality, or even in simulations, it can be seen that this is not completely true. The main problem is that the actuators have limited signals, i.e. it is not possible to accelerate the truck using more torque than the engine can produce, or to brake the truck using more torque than the brakes can produce. As mentioned earlier, a_environment depends on the mass, which means that the switch points are not independent of the mass. But as long as no switching occurs, and the inner loop is acceptably fast, the inner loop compensates for different masses automatically.

In figure 5.7, F1 is a P-controller in the distance between the truck and the vehicle ahead and in the relative velocity. Since the relative velocity is the derivative of the distance, this is equivalent to saying that F1 is a PD-controller in the distance alone. It is not necessary to have any integrating part in this controller because the system itself contains integration. This property also ensures that no problems with integrator windup occur in the outer loop if the inner loop saturates. If this had occurred, the strategy presented in section 3.2 could have been used.
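The claim that the inner loop hides the mass from the outer loop can be illustrated with a small simulation. The sketch below assumes a static plant following (5.27) plus an additive environmental acceleration, a discrete PI-controller, and no actuator limits; the constant C, the gains, the masses and the disturbance value are hypothetical numbers chosen only to give a stable loop, not thesis parameters.

```python
def settle_inner_loop(mass, a_desired, a_environment,
                      c=2.0, kp=2000.0, ki_ts=2000.0, steps=300):
    """Simulate a PI acceleration loop around a = C*T/m + a_environment.

    All numeric defaults are illustrative assumptions.
    Returns the final acceleration and the final torque request.
    """
    integrator = 0.0
    torque = 0.0
    a_vehicle = a_environment
    for _ in range(steps):
        a_vehicle = c * torque / mass + a_environment   # plant, cf. (5.27)
        error = a_desired - a_vehicle
        integrator += ki_ts * error                     # I-part (Ki*Ts folded into one gain)
        torque = kp * error + integrator                # PI control signal
    return a_vehicle, torque


results = {}
for m in (20000.0, 40000.0):
    results[m] = settle_inner_loop(m, a_desired=0.3, a_environment=-0.1)
    a_final, t_final = results[m]
    print(f"m = {m:.0f} kg: a = {a_final:.4f} m/s^2, T = {t_final:.0f} Nm")
```

Both masses settle at the requested acceleration; only the torque request differs, in proportion to the mass. With actuator limits added, the heavier truck would saturate first, which is the limitation pointed out in the text.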
A problem that might occur when switching between different states in the controller is that bumps appear in the actuator control signals. The bumps arise because the integrator part of the controller changes value abruptly, and they can be eliminated, or at least reduced, in different ways. One way is to set the integrator part to a value according to the equation below

\[ u_{old} = u_{new} = P_{new} + I_{new} \;\Rightarrow\; I_{new} = u_{old} - P_{new} \tag{5.28} \]

where
u_old – old control signal
P_new – new proportional part
I_new – new integrator part

In that way, the new control signal is equal to the old one at the moment of the switch. Another similar way is to let the new integrator part be equal to the old one, as

\[ I_{new} = I_{old} \tag{5.29} \]

This will not give a switch as smooth as the former method, but it is often good enough. When the hybrid controller has many different states, these corrections may be fairly complicated. Another way to eliminate these bumps is to skip the integrator part, i.e. use a proportional controller. If the total system should be able to eliminate static errors despite this, either the system or another controller connected in series has to contain an integrator part. In this thesis the last two methods have been used.

The relative velocity has been modified in the sense that it has been limited to the truck’s ACC set speed. This is because it is never of interest to accelerate the truck to a higher velocity than this value. The controller F2 in figure 5.7, which ensures that the desired acceleration is reached, is a PI-controller in the acceleration. Integration is included here to make sure that the desired acceleration from the outer loop is really the one delivered by the actuator in steady state, despite the presence of disturbances. The switching between engine and brake is performed when a_desired passes a certain limit, as described in the previous section.
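The integrator initialisation in (5.28) amounts to one subtraction at the moment of the switch. The following lines are a minimal sketch; the function name and all numbers are hypothetical and only show that the control signal is continuous across the switch.

```python
def bumpless_integrator(u_old: float, p_new: float) -> float:
    """Return I_new such that P_new + I_new equals u_old, cf. (5.28)."""
    return u_old - p_new


u_old = 1200.0             # control signal just before the switch (hypothetical)
kp_new = 800.0             # proportional gain of the new controller (hypothetical)
error = 0.25               # control error at the switch (hypothetical)

p_new = kp_new * error
i_new = bumpless_integrator(u_old, p_new)
u_new = p_new + i_new      # first control signal of the new controller
print(u_old, u_new)
```

The alternative in (5.29) would instead copy the old integrator value, so u_new would differ from u_old by the difference between the new and the old proportional parts.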
When a switch is performed, this can be thought of as F2 and the dynamics G (see figure 5.7) being changed to those corresponding to the new actuator and its controller. When the retarder is active, F2 is a retardation controller, and when the engine is active, F2 is the cruise controller. The latter can be used as an acceleration controller by setting

\[ v_{CC,ref}(kT_s) = v_{CC,ref}((k-1)T_s) + T_s \cdot 3.6 \cdot a_{desired}((k-1)T_s) \tag{5.30} \]

It can be noticed that (5.30) is actually a discrete integration of a_desired, where the factor 3.6 converts m/s to km/h. In the expression, v_CC,ref is the cruise controller set speed, referred to as v_set below. To eliminate the risk of windup in the v_set variable if the actual speed does not follow v_set, some windup protection is needed. This can, for example, be implemented so that v_set is not allowed to increase if the difference between v_set and v_measured exceeds a certain limit. Our solution is to lock v_set if the maximum, or minimum, fuel amount is being injected.

One possible extension to our concept is to replace the controller F1 in figure 5.7 with a map that has several input variables to decide the appropriate acceleration. This can make the control more adaptive, which in turn may result in lower fuel consumption and more comfortable behaviour.

5.5 State machine modes

5.5.1 No Car Ahead

When there is no car ahead of the truck, the adaptive cruise control should behave almost as an ordinary cruise controller. The driver sets a reference velocity that the truck should hold. A difference is that an ordinary cruise controller cannot brake even if the velocity exceeds the desired one. The adaptive cruise controller in this thesis has this possibility. As for many other systems manoeuvred by a driver, it is mostly a question of how the driver wants the system to react. In this thesis it was assumed that the driver wants the truck to hold the same velocity all the time, whether the road is uphill, flat or downhill.
Therefore a controller called the enhanced cruise controller (ECC) was designed, which is simply a combination of the ordinary cruise controller (CC) and the downhill cruise controller (DC). To be able to use the right cruise controller in different surroundings, appropriate switch criteria had to be decided. A good switch criterion (see section 5.3) is to switch from engine support to retarder support and vice versa when

\[ a_{environment} = a_{desired} \tag{5.31} \]

where a_desired is calculated by a controller in a higher layer. By analogy, the switch point between the CC and the DC in this case is

\[ a_{environment} = 0 \tag{5.32} \]

since the truck is supposed to hold constant velocity, i.e. a_desired = 0. However, if the switches are performed exactly when a_environment is equal to zero, there is a risk that chattering occurs. According to section 3.1 this problem may be solved by introducing hysteresis. This means that the switch from engine to retarder, i.e. CC to DC, is performed when

\[ a_{environment} > a_{limit,high} \tag{5.33} \]

and the switch from DC to CC is instead performed when

\[ a_{environment} < a_{limit,low} \tag{5.34} \]

In this thesis a_limit,low and a_limit,high have been chosen as -0.05 and 0.05 respectively. In a real case, the hysteresis values have to be tuned more carefully.

One problem with switching only on a_environment is that the estimation of a_environment may fail, i.e. the value of a_environment can, for example, correspond to an uphill road when the road is actually downhill. In this case, the truck will not brake if the CC was active, since no switch to the DC is performed. This can cause the velocity to increase above the reference velocity. To eliminate this sort of problem, one further kind of switch criterion has been introduced. This is the difference between the actual velocity and the desired velocity, i.e.

\[ \Delta v = v - v_{desired} \tag{5.35} \]

If the difference becomes larger than a specific value a switch is performed, i.e.
if the truck goes faster than desired the DC is activated, and vice versa. The total switching scheme is shown in figure 5.8.

[Figure 5.8: state diagram with the two states CC and DC. Transition CC to DC: a_environment > a_limit,high or ∆v > ∆v_max. Transition DC to CC: a_environment < a_limit,low or ∆v < ∆v_min.]
Figure 5.8: The total switching scheme for the ECC (enhanced cruise controller).

There are, as was mentioned earlier, many different ideas about how the truck should behave when no car is ahead of it. For example, it can be assumed that the driver wants the truck to hold a specific velocity independently of the road slope. Another idea is to allow the truck to exceed the desired velocity to some extent on a downhill road. The main thought behind this is to gain some kinetic energy before the downhill ends. This energy can be used to decrease the fuel consumption. It is not difficult to change the ECC to work in this way. It can be done by setting a temporary reference speed equal to the real reference velocity plus some offset when the DC is active.

5.5.2 Follow Mode

Introduction

[Figure 5.9: the ACC vehicle and the lead vehicle with the distance d_ref marked; the modes “Long distance”, “Normal distance”, “Short distance”, “Air brake” and “Cut-in” arranged by the driving situation dependence and the distance dependence of the controller mode choice.]
Figure 5.9: This figure shows the principles of the choice of submodes in the "follow mode". Only one of the five possible modes can be activated at a time. The dashed arrow illustrates the scenario described below.

“Follow mode” is the controller mode normally used when the truck is following another vehicle. If the target is lost, or if an urgent situation occurs, the mode is left in favour of the “lost target mode” or the “urgent mode” respectively. Inside the “follow mode” several submodes exist. In normal driving situations, the “long distance mode”, the “normal distance mode” or the “short distance mode” is activated. The choice between these is based on the distance as shown in figure 5.9.
Under certain circumstances, the “air brake mode” is activated. Finally, if a cut-in situation is detected, the “cut-in mode” is activated. Only one mode can be active at a time. This is illustrated in figure 5.9. In the figure, the modes have been ordered in rows. Which row is used is decided by the current driving situation. Which mode is active in that particular row is then decided by the distance to the lead vehicle. For example, the “short distance mode” is only activated during normal following conditions and when the distance to the lead vehicle is short. If the distance falls below the shortest distance allowed in the “air brake mode”, the “short distance mode” takes over the control, because then neither the cut-in situation nor the air brake situation is the current driving situation. This is illustrated by the dashed line in figure 5.9.

Long Distance

This controller handles driving cases when the distance to the leading vehicle is longer than 1.2 times the desired distance. It is optimised to give smooth and comfortable stabilisation of the speed and distance. It has been divided into three parts. Two of them control the engine via the cruise controller and one controls the retarder via the retardation controller.

When the lead vehicle travels with a constant and lower speed than the truck, it is possible to calculate a trajectory that makes the truck end up in exactly the right position with exactly the right velocity. This trajectory is based on the assumption that the truck holds a constant retardation during the braking.
This retardation is given by

\[ a_{nominal}(t) = \frac{v_{rel,meas}^{2}(t)}{2\,\Delta dist_{meas}(t)} \tag{5.36} \]

where

\[ v_{rel,meas}(t) = v_{lead}(t) - v_{meas}(t), \qquad \Delta dist_{meas}(t) = dist_{desired}(t) - dist_{meas}(t) \tag{5.37} \]

In this section, the following nomenclature is used to separate between measured and actual values of a signal:
v_rel,actual – true relative velocity
v_rel,meas – measured value of v_rel
∆dist_actual – true deviation from the reference distance
∆dist_meas – calculation of ∆dist based on measured values of the distance
dist_meas – measured distance
v_meas – measured truck velocity

When entering the long distance state, a_nominal is compared with a_environment. This decides whether the CC or the retardation controller should effectuate a_nominal. If a_nominal exceeds a_environment, the truck does not accelerate enough by just using the forces from the environment and hence needs engine support. In the same manner, if a_nominal is below a_environment, retarder support is needed.

Effectuating the calculated a_nominal is very easy when the retardation controller is used. In this case, the reference retardation is simply set to -a_nominal. However, for the CC it is somewhat harder, since it controls the velocity. A reference velocity that gives the desired trajectory must therefore be calculated. This is performed as described in (5.30), with a_desired chosen as a_nominal.

If the CC and the retardation controller effectuated exactly the desired retardation, and if the lead vehicle held its velocity constant, the calculation of a_nominal could be done just once. In practice, none of these conditions is fulfilled and therefore a new reference retardation has to be calculated in each time step.

A problem is the singularity in (5.36) that occurs when the desired distance is reached.
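The constant-retardation trajectory in (5.36) and (5.37), and its conversion to a cruise controller reference via (5.30), can be checked with a short sketch. The function names, the sample time and the numerical scenario are illustrative assumptions and not values from the thesis.

```python
def nominal_acceleration(v_lead, v_meas, dist_desired, dist_meas):
    """a_nominal from (5.36)-(5.37); velocities in m/s, distances in m."""
    v_rel = v_lead - v_meas
    delta_dist = dist_desired - dist_meas
    if delta_dist == 0.0:
        raise ZeroDivisionError("(5.36) is singular at the desired distance")
    return v_rel ** 2 / (2.0 * delta_dist)


def cc_reference_step(v_cc_ref_kmh, a_desired, ts):
    """One step of (5.30); the factor 3.6 converts m/s to km/h."""
    return v_cc_ref_kmh + ts * 3.6 * a_desired


# Hypothetical scenario: truck at 80 km/h, lead vehicle at a constant 60 km/h,
# measured gap 100 m, desired gap 50 m.
v_lead = 60.0 / 3.6
v_meas = 80.0 / 3.6
dist_desired, dist_meas = 50.0, 100.0

a_nom = nominal_acceleration(v_lead, v_meas, dist_desired, dist_meas)

# Closed-form check: hold a_nom until the relative velocity has reached zero.
v_rel = v_lead - v_meas
t_end = v_rel / a_nom
dist_end = dist_meas + v_rel * t_end - 0.5 * a_nom * t_end ** 2

v_ref_next = cc_reference_step(80.0, a_nom, ts=0.1)
print(a_nom, t_end, dist_end, v_ref_next)
```

In this idealised case the gap reaches the desired distance at the same instant as the relative velocity reaches zero, which is the property the trajectory is built on.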
In figure 5.11, a_nominal is shown when ∆dist_meas tends towards zero as shown in figure 5.10. When ∆dist becomes smaller than about 1 m, a_nominal begins to decrease very fast.

[Figure 5.10: ∆dist_meas [m] versus time [s].]
Figure 5.10: The measured ∆dist.

[Figure 5.11: a_nominal [m/s²] versus time [s].]
Figure 5.11: The calculated a_nominal.

Actually, the singularity should not be a problem since the limit of the quotient in (5.36) is a_nominal, but the sensitivity to noise becomes large when ∆dist becomes small. This problem can be solved by letting the normal distance controller take over shortly before the right distance is reached. Exactly at which distance the controller switch should occur is a design variable. However, it is good to use the prediction as long as the controllers, i.e. the CC and the retardation controller, manage to effectuate the desired a_nominal. The settling will then be as smooth as possible.

To be able to choose the design variable correctly, a simple model of the noise can be appropriate. Two sorts of noise have been included. First, a noise corresponding to the CC and the retardation controller not effectuating the calculated a_nominal exactly. Second, a noise originating from the quantisation of the measurement signals.

The most interesting case to study is the worst case. Since the retardation controller is better than the CC at effectuating a_nominal, the CC is used in the calculations below. To follow the trajectory perfectly, the velocity of the truck has to equal v_CC,ref. This is not always fulfilled, especially when a_nominal is small, and therefore the real velocity, v_actual, has been modelled according to

\[ v_{actual}(t) = v_{CC,ref}(t) - n_{v,system}(t) \tag{5.38} \]

Since the velocity is sometimes incorrect, this results in an error in ∆dist. This has been modelled as an additive noise source, n_dist,system, as shown in (5.39).
In the derivation, it has been assumed that v_lead is constant between two adjacent sample times and that v_actual is given by expression (5.38).

\[ \begin{aligned} \Delta dist_{actual}(kT_s) &= \Delta dist_{actual}((k-1)T_s) + \int_0^{T_s} \big( v_{lead}((k-1)T_s) - v_{actual}((k-1)T_s + t) \big)\,dt \\ &= \Delta dist_{actual}((k-1)T_s) + \big( v_{lead}((k-1)T_s) - v_{CC,ref}((k-1)T_s) \big) T_s + \int_0^{T_s} n_{v,system}((k-1)T_s + t)\,dt \\ &= \Delta dist_{ref}(kT_s) + n_{dist,system}(kT_s) \end{aligned} \tag{5.39} \]

The trajectory, a_nominal, was calculated from values of v_rel,meas and dist_meas. In this thesis the quantisation level is chosen as 0.1 m. The quantisation noise is for simplicity modelled as

\[ \begin{aligned} v_{rel,meas}(t) &= v_{lead}(t) - v_{actual}(t) + n_{v,meas}(t) \\ \Delta dist_{meas}(t) &= \Delta dist_{actual}(t) + n_{dist,meas}(t) \end{aligned} \tag{5.40} \]

The different noise sources affect a_nominal according to (5.41)

\[ a_{nominal,actual}(kT_s) = \frac{\big( v_{rel,meas}(kT_s) \big)^{2}}{2\,\Delta dist_{meas}(kT_s)} = \frac{\big( v_{rel,ref}(kT_s) + n_{v,total}(kT_s) \big)^{2}}{2\,\Delta dist_{ref}(kT_s) + n_{dist,total}(kT_s)} \tag{5.41} \]

where

\[ \begin{aligned} v_{rel,ref}(kT_s) &= v_{lead}(kT_s) - v_{CC,ref}(kT_s) \\ \Delta dist_{ref}(kT_s) &= \Delta dist_{actual}((k-1)T_s) + \big( v_{lead}((k-1)T_s) - v_{CC,ref}((k-1)T_s) \big) T_s \end{aligned} \tag{5.42} \]

and the noises n_v,total and n_dist,total are defined as

\[ n_{v,total}(kT_s) = n_{v,system}(kT_s) + n_{v,meas}(kT_s) \tag{5.43} \]

\[ n_{dist,total}(kT_s) = 2\big( n_{dist,system}(kT_s) + n_{dist,meas}(kT_s) \big) = 2 \int_0^{T_s} n_{v,system}((k-1)T_s + t)\,dt + 2\,n_{dist,meas}(kT_s) \tag{5.44} \]

The factor 2 in (5.44) follows from the denominator 2∆dist_meas in (5.41) together with (5.39) and (5.40). If ∆dist_ref is small, it can be seen that n_v,total and n_dist,total will affect the calculation of a_nominal considerably. Thus, the problem is to choose a switch point where n_v,total is small, i.e. the controller still manages to follow the trajectory, and n_dist,total is significantly smaller than ∆dist. Since the quantisation level is 0.1 meter, n_dist,meas will be small enough if the minimal ∆dist is chosen as, for example, 1.5 meter.
The switch should then occur when the distance is 1.03 times the desired one, under the assumptions that the velocity of the lead vehicle is 60 km/h and the desired time gap is 3 seconds. (The desired distance is then 50 m, and 51.5 m / 50 m = 1.03.)

The quantisation of the relative velocity does not influence a_nominal so much, since it appears in the numerator. Therefore (5.43) can be approximated as

\[ n_{v,total} \approx n_{v,system} \tag{5.45} \]

Then n_v,total can be calculated using (5.38)

\[ n_{v,total}(t) \approx v_{CC,ref}(t) - v_{actual}(t) \tag{5.46} \]

Expression (5.46) has been simulated and the result is plotted in figure 5.12. As can be seen in figure 5.10, the time when ∆dist equals 1.5 meter is 14.8 seconds. According to figure 5.12, this is before the velocity error grows out of control.

[Figure 5.12: velocity error [m/s] versus time [s].]
Figure 5.12: The noise caused by the CC not following the reference velocity.

Since the trajectory-following controller is based on the assumption that the lead vehicle holds a constant velocity, it is necessary to investigate what happens if it does not. In the test, the lead vehicle varies its speed sinusoidally according to

\[ v_{lead}(t) = 5 \sin\!\left( \frac{2\pi}{30}\, t \right) + 60 \tag{5.47} \]

i.e. a period of 30 s, an amplitude of 5 km/h and a mean value of 60 km/h. After 18.8 s, the distance equals 1.03 times the desired distance, i.e. 51.5 m, as can be seen in figure 5.13. A switch from the long distance controller to the normal distance controller is then performed. In figure 5.14 the speed of the lead vehicle, represented by the dashed line, and the speed of the truck, represented by the solid line, have been plotted. At the time when the switch occurs, the difference between these two equals 4.5 km/h. A deviation of this size is acceptable and shows that the control strategy seems to work. Furthermore, it can be interesting to compare the truck velocity with the mean value of the lead vehicle velocity (5.47).
The mean value of this function is equal to 60 km/h and the truck velocity at the moment of the switch equals 59 km/h. This yields a difference of only 1 km/h, which is small. The conclusion is that even a rather large variation in the speed of the car ahead is handled by the trajectory-based controller without any problems.

[Figure 5.13: dist_meas [m] versus time [s].]
Figure 5.13: The distance settling.

[Figure 5.14: v_meas and v_lead [km/h] versus time [s].]
Figure 5.14: The velocity settling for the truck (solid line) and the velocity for the lead vehicle (dashed), in the case when its velocity varies.

During retardation, switches may occur between the engine and the retarder and vice versa. Besides the usual comparison between a_desired and a_environment, one further condition has to be fulfilled before a switch from the engine to the retarder is performed: the relative speed has to be negative. The reason is that small peaks might appear in a_environment when the retarder is applied, which can cause the switch condition to be fulfilled for a short while even when it should not be (see section 5.2.2). A better solution is to switch only if the switch condition is fulfilled for a certain number of adjacent samples. This is described in section 5.5.2, under “Normal Distance”.

When no trajectory can be calculated, i.e. when the car ahead is moving away from the truck, the engine should always be used, since the truck should try to decrease the gap. In that case an ordinary P-controller is used. The gain should be rather small because, as was said earlier, the behaviour in the long distance state should be very smooth.

When controlling the engine via the acceleration controller (see (5.30)), it is important to protect it against windup. This has been done by checking the injected fuel amount per stroke, m_fi,desired.
When it has reached 227 g/stroke, which is the maximal value, v_CC,ref is only allowed to decrease. In the same manner, v_CC,ref is only allowed to increase when m_fi,desired reaches 0 g/stroke.

Normal Distance

The foundation of this mode is the acceleration controller described in section 5.4. The outer loop controller used is a PD-controller with respect to the distance to the vehicle ahead. This controller calculates a desired acceleration, which is used as reference in the next controller level. If the velocity of the vehicle ahead is greater than the ACC reference velocity, the proportional part of the PD-controller is disabled and the measured velocity of the vehicle ahead is modified to the reference speed.

When the system has reached stationarity, i.e. the vehicle ahead is driving with constant speed and the distance controller has reached steady state, the controller is shut down and the last measured truck velocity is held constant. This is done until the distance differs too much from the desired distance or the relative velocity has become too high. This feature decouples the ACC vehicle from small oscillations in the distance caused by small oscillations in the velocity of the vehicle ahead. Such an oscillation can, for example, be a small decrease in velocity originating from a small gust of wind hitting the vehicle or an increase in road slope.

Short Distance

Short distance is the mode used when the vehicle ahead is at a short distance, meaning that the distance is smaller than 0.8 times the desired distance. In this mode, safety has a higher priority than comfort, compared to normal and long distance. Even if the most important actuator in the short distance mode is the retarder, the engine also has its role to play. Especially when a braking manoeuvre has finished, and the vehicle ahead is receding, it is important to start the acceleration as early as possible to get a comfortable settling behaviour.
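The distance-based choice between the three ordinary submodes can be summarised in a few lines. The sketch below uses the limits 1.2 and 0.8 given in the text and assumes that the normal distance mode covers the band in between; it leaves out the air brake and cut-in modes and any hysteresis around the limits. The function name is hypothetical.

```python
def distance_submode(dist: float, dist_desired: float) -> str:
    """Pick the ordinary follow submode from the distance ratio.

    Limits 1.2 and 0.8 are from the text; treating everything in between
    as "normal distance" is an assumption of this sketch.
    """
    ratio = dist / dist_desired
    if ratio > 1.2:
        return "long distance"
    if ratio < 0.8:
        return "short distance"
    return "normal distance"


for gap in (65.0, 50.0, 35.0):
    print(gap, distance_submode(gap, dist_desired=50.0))
```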
The short mode can basically be entered in two ways. The most common one is that the normal mode could not handle the situation when the vehicle ahead was braking hard. In this case the short mode takes over the control of the retarder from the normal mode and the more aggressive parameterisation in the short distance controller is used. If necessary, the controller can order the retarder to brake with up to 3000 Nm. If this is not sufficient, the urgent mode (described in section 5.5.3) takes over the control and brakes the truck with the aid of the EBS-system. The torque produced by the retarder is in certain situations enough to reach the maximum retardation limit of 3 m/s². This is a standard retardation limit for ACC-systems [20] and the truck is not allowed, in any case, to brake harder than this limit. This makes the EBS support redundant in normal situations, but it may be needed if the truck is heavily loaded or on a downhill road.

In the same way as was described for the normal mode, the retarder is controlled by retardation in the short mode as well. The difference lies in the parameters of the outer-loop controller. The controller parameters are scheduled with the relative velocity as the scheduling variable. When the relative velocity is positive, the vehicle ahead is receding, and to get a less aggressive control it is desirable to decrease the dependence on the relative velocity. This can of course be done permanently without gain scheduling, but if that solution is chosen, the derivative effect is also lost when the truck is approaching another vehicle. In this case, the relative speed is the most important control variable. Of course, it is also of the greatest importance to keep a proper distance to the vehicle ahead, otherwise a collision might occur.

In this mode, when the vehicle ahead is receding, the engine is used. This gives the ability to use a trajectory calculation similar to the one used in the long distance mode (see section 5.5.2).
When the vehicle ahead is receding, a nominal acceleration is calculated which places the truck at the desired distance and with the correct velocity at the same time. The cruise controller is used to effectuate the acceleration in the same way as described in expression (5.30). In the short mode, the switch between engine and brakes is triggered by several conditions concerning the desired retardation (in the retarder mode), the environmental acceleration and the relative velocity.

Cut-In

A situation that has to be handled as a special case is the so-called cut-in situation. This situation occurs when a new vehicle appears at short distance but with positive relative speed. If no extra care were taken, this situation would very likely result in heavy braking. This is because the distance part of the controller would produce a large signal due to the large deviation from the desired distance. To prevent this uncomfortable behaviour, the logic has to recognise the situation and try to handle it without braking. Under some restrictions, and sometimes also for a limited period of time, a shorter distance than the desired distance can be accepted.

The cut-in situation can be recognised if the state is changed to a state that handles a shorter distance (see figure 5.9) than the one before and the relative velocity is positive. With the structure of the logic used in this thesis, this means, for example, that on the branch from “normal distance” to “short distance”, a test is performed on whether the relative velocity is positive or negative. If the relative velocity is positive at the same time as the distance has decreased, a cut-in is detected. This is because in normal cases it is impossible to have a positive relative speed while the distance is decreasing.

The action taken after a cut-in is detected should be as similar as possible to what an ordinary driver would have done in the same situation.
Because of the limited amount of information that the radar gives, it is very difficult to make the controller act intelligently, from the driver’s perspective, in every situation. Instead the focus will be on finding a solution that is safe in all situations. A cut-in can be detected for several reasons. For example, it can be an overtaking or it can be a car that enters the highway from a slip road. This shows that there are several types of cut-ins, and each of them should ideally be handled differently.

After a cut-in has been detected, the new vehicle’s relative velocity is compared to a “safe relative speed limit”. If the relative speed is higher than this limit, the truck’s set speed is held constant until the new vehicle disappears. In this case it is probably an overtaking in progress, and an ordinary driver would most likely have kept the speed. If the relative velocity is larger than zero, but smaller than this safe limit, a timer is started. When this timer is started, the vehicle ahead has a certain time to accelerate to a speed higher than the safe speed. If this is not the case, or if the relative speed becomes negative, the ordinary control strategy takes over the control. This usually means that the truck is braked until the distance equals the desired distance again. The complete logic for the cut-in situation is described in the flowchart in figure 5.15.

[Figure 5.15: flowchart with the nodes “Cut-in detected (implies that v_rel > 0)”, “Hold velocity”, “Start timer” and “Return to ordinary control”. Branch labels: v_rel ≥ v_rel,safe; 0 ≤ v_rel ≤ v_rel,safe; 0 < v_rel ≤ v_rel,safe; v_rel ≤ 0; “Vehicle disappears”; “Time is out OR v_rel ≤ 0 OR distance < 0.2*desired_distance”.]
Figure 5.15: Flowchart showing the cut-in logic.

Air Brake

The usual acceleration controller (described in section 5.4) automatically uses the forces from the environment if it is possible. Despite this, it may be desirable to extend the usage of the environmental forces, i.e.
to minimise the usage of the brakes. If the truck catches up with another, slower vehicle, and it at first seems necessary to use the brakes, an extra attempt to avoid using the brakes is made. The desired acceleration requested by the outer loop controller is based on the desire to place the truck at the correct distance with the correct velocity. The extra test checks whether the forces from the environment could slow the truck enough if the desired distance were decreased. This means that the length of the braking operation is increased, but the braking is performed with a lower retardation. The extended brake distance can only be entered instead of the “long distance mode”, but after it has been entered its working range overlaps those of the “normal distance mode” and the “short distance mode”.

To decide whether the extended brake distance will handle the situation or not, a variant of the previously mentioned calculation of a_nominal (see expression (5.36)) is used

\[ a_{nominal,extended} = \frac{v_{rel}^{2}}{dist_{ref,reduced} - dist} \tag{5.48} \]

The conditions that have to be fulfilled in order to enter the “air brake mode” are

a_environment > a_nominal and a_environment < a_nominal,extended and d ≤ 1.1 dist_ref

When the “air brake mode” has been entered, it will be left if any of these conditions occur
• dist ≤ 0.5 dist_ref
• a_lead < -0.2 and dist ≤ 0.8 dist_ref
• dist ≥ 0.85 dist_ref and dist < 0.8 dist_ref and v_rel ≥ -1
• dist ≥ 1.2 dist_ref and v_rel ≥ 0
• a_environment > a_nominal,extended + 0.02

5.5.3 Urgent Operation

The urgent operation mode is the only way to use the EBS when another vehicle is in range of the radar. The EBS is only to be used in urgent situations when the retarder is too slow or lacks the power to brake the truck safely. The urgent operation can be thought of as a kind of super mode, i.e. it has the ability to take over the control from any other submode in the follow mode.
There are two different switch conditions for entering this mode. It is enough that one of them is fulfilled to enter the urgent mode. The first condition is that the time to collision, $t_{tc}$, falls below a certain critical limit. This variable is defined as

\[ t_{tc} = \frac{dist}{v_{rel}} \tag{5.49} \]

This expression is not to be confused with the well-known three-second rule. In the latter, the velocity in the calculation is the absolute truck velocity instead of the relative velocity. The time to collision is the time until a collision occurs if the truck and the vehicle ahead maintain their current velocities until impact. Note that if the two vehicles have the same velocity, the relative velocity $v_{rel}$ is zero, which in turn implies that $t_{tc}$ is infinite.

The second switch condition is that the relative acceleration between the truck and the vehicle ahead is lower, i.e. more negative, than some appropriate critical limit. This limit can be set relatively low to ensure that urgent operation really is required to handle the situation. To ensure that this condition is not triggered by mistake, it is also required that the distance is shorter than half the desired distance.

Even if one of the conditions above becomes true and the urgent operation mode is entered by mistake, this does not automatically mean that a panic brake is performed. The only thing that happens is that another controller, which controls the EBS, takes over. When the urgent operation mode is entered, a PD-controller acting on the distance determines the acceleration requested from the EBS acceleration controller. The coefficients in the PD-controller are chosen so that the braking is quite hard. To ensure that the maximum retardation limit of 3 m/s² is never exceeded, the controller cannot request a higher retardation than this limit.
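The entry test and the limited PD law described above can be sketched as follows. This is only an illustration: the threshold values and gains are hypothetical, and the sketch assumes the sign convention $v_{rel} = v_{lead} - v$, so that the gap is closing when $v_{rel}$ is negative and the closing speed $-v_{rel}$ is used when evaluating the time to collision.

```python
from dataclasses import dataclass

MAX_RETARDATION = 3.0  # m/s^2, the limit stated in the text


@dataclass
class UrgentParams:
    # All values below are illustrative assumptions, not thesis values.
    ttc_limit: float = 4.0      # s, critical time to collision
    a_rel_limit: float = -2.0   # m/s^2, critical relative acceleration
    kp: float = 0.5             # 1/s^2, PD gain on the distance error
    kd: float = 1.5             # 1/s, PD gain on the relative velocity


def time_to_collision(dist, v_rel):
    """Time until impact at constant speeds; infinite if the gap is not closing."""
    closing_speed = -v_rel      # assumed convention: v_rel = v_lead - v_truck
    if closing_speed <= 0.0:
        return float("inf")
    return dist / closing_speed


def enter_urgent(dist, v_rel, a_rel, dist_ref, p):
    """Either condition alone is sufficient to enter the urgent mode."""
    cond_ttc = time_to_collision(dist, v_rel) < p.ttc_limit
    cond_acc = a_rel < p.a_rel_limit and dist < 0.5 * dist_ref
    return cond_ttc or cond_acc


def urgent_acceleration(dist, v_rel, dist_ref, p):
    """PD law on the distance error, limited to the maximum retardation."""
    a_req = p.kp * (dist - dist_ref) + p.kd * v_rel
    return max(a_req, -MAX_RETARDATION)
```

Because the law has no integrator, clamping the output does not accumulate any hidden state, which is the windup argument made in the text.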
Because the controller does not have an integrator part, no problems with integrator windup will occur, although the control signal is limited. The urgent operation mode is left if the relative velocity becomes equal to, or greater than, zero and the relative acceleration is greater than the earlier mentioned critical limit. This means that the urgent controller is enabled until the situation is normal again.

5.5.4 Lost Target

The lead vehicle can disappear for many reasons. They can be divided into two main cases:

• Case 1: The lead vehicle is still in the same road lane as the truck. This can happen when the lead vehicle goes into a curve with a radius so small that the radar loses contact with the target.
• Case 2: The lead vehicle has left the road lane. This happens, for example, when it either turns off at a road exit or changes lane.

These cases have to be handled in opposite ways, which can be illustrated by the following example. Assume that the truck lies behind a car with a certain velocity. This velocity is lower than the one specified by the driver, and the controller would therefore increase the speed if it were allowed to, i.e. if no car lay ahead. When the lead vehicle disappears because it has left the road lane, i.e. case two, the controller can resume the desired velocity. However, when the lead vehicle has not left the road lane, i.e. case one, an increase in speed can be directly dangerous.

A first thought, when designing the lost target controller, is to wait for a while to see if the lead vehicle appears again, because then it was probably a curve. This might however not be a good idea, since the driver can see which of the two cases has occurred. Consequently, the driver will experience the system as slow if the truck does not begin to accelerate when the lead vehicle leaves the road lane. Safety must of course be of highest priority.
Immediate acceleration is therefore not a possible solution with the sensors used today. Consequently, a compromise between a long wait and immediate acceleration must be made. If the car ahead disappears, its last speed, $v_{lead,last}$, is stored. At the same time a timer is started. If a curve is detected before the timer has reached a specific value, $T_{curve}$, it is assumed that the curve was the reason for the disappearance of the lead vehicle. In this case, the reference speed is set to $v_{lead,last}$. To hold this reference velocity the ECC, described in section 5.5.1, is used. Since it uses both the CC and the DC to hold the velocity, no specific consideration of whether the road is uphill or downhill is needed. There is no guarantee that this is the velocity that the car ahead holds after the disappearance, and the distance can therefore both increase and decrease, but $v_{lead,last}$ is still the best available guess.

The waiting time, $T_{curve}$, is a design parameter that has to be evaluated carefully because, as mentioned earlier, it is a balance between safety and the driver’s experience that the system reacts in a good way. In simulation, $T_{curve}$ has been chosen as 4 s.

There are more aspects of how the truck should react in a curve. Assume that the truck holds the correct distance to the vehicle in front. When a curve approaches, the lead vehicle might brake to a lower speed. At the same time, the truck begins to brake in order to keep the correct distance. The retardation of the truck will probably be smaller than that of the lead vehicle, if the lead vehicle has a lower mass than the truck. Hence, the distance will become too short. If the vehicle ahead then disappears, the truck will use the last measured lead vehicle velocity. However, if the vehicle ahead holds $v_{lead,last}$ throughout the whole curve, the distance will be too short all the time. This is not a desirable behaviour.
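The timer-based compromise can be sketched as a small piece of logic. This is a minimal illustration, not the thesis implementation: the class and its names are hypothetical, it assumes that $v_{lead,last}$ is also held while the timer is running, and it does not model the target reappearing.

```python
T_CURVE = 4.0  # s, waiting time used in the thesis simulations


class LostTargetLogic:
    """Reference-speed selection after the radar has lost the lead vehicle."""

    def __init__(self, v_lead_last, v_set, t_curve=T_CURVE):
        self.v_lead_last = v_lead_last   # last measured lead-vehicle speed
        self.v_set = v_set               # speed requested by the driver
        self.t_curve = t_curve
        self.elapsed = 0.0
        self.curve_seen = False

    def step(self, dt, curve_detected):
        """Advance the timer by dt seconds and return the reference speed."""
        if curve_detected and self.elapsed < self.t_curve:
            self.curve_seen = True       # the curve explains the disappearance
        self.elapsed += dt
        if self.curve_seen or self.elapsed < self.t_curve:
            return self.v_lead_last      # hold the stored speed (assumption while waiting)
        return self.v_set                # assume the lead vehicle left the lane
```

With a 1 s step and no curve, the reference stays at $v_{lead,last}$ for the first three calls and switches to the set speed once the timer reaches $T_{curve}$.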
Another way to handle curves is to simulate the distance to the car ahead. This can be done by the following expression

\[ dist(t) = dist_0 + \int_0^t \bigl(v_{lead,last} - v(\tau)\bigr)\,d\tau \tag{5.50} \]

where $dist_0$ is the distance to the lead vehicle when it disappeared and $v$ is the velocity of the truck. In this way, the controller can correct the distance so that the desired distance is achieved. A problem is that if the car ahead does not hold the assumed velocity, the behaviour of the truck may become illogical. An example is when the lead vehicle brakes to a low velocity, which is stored as $v_{lead,last}$, and then accelerates. The simulation will then tell the truck to brake to an even lower velocity than the lead vehicle did in order to reach the desired distance, although the real distance might be even longer than it should be.

Both methods have been used in this thesis. In the hybrid controller part the first method, i.e. constant velocity, was used, and in the MPC-controller part the second method was used.

6 State Machine Combined with MPC

6.1 Introduction

In the second part of this thesis, the ACC-problem is solved by using an MPC-controller. The first reason why an MPC-controller could be appropriate for this problem is that it handles constraints on control signals and states very easily. The first control signal is the amount of fuel injected into the engine. The second control signal is the desired retarder torque. Constraints are used to specify which intervals of these control signals are allowed. In the MPC-controller a linear model of the truck, in state space form, has to be used. The original model of the truck has some non-linearities, and it is therefore necessary to linearise it around a suitable operating point. In our case, this point is continuously chosen as the current estimated state of the system. This kind of state estimator is a so-called extended Kalman filter (see section 3.3.3).
When the model of the truck is combined with the constraints on the control signals, it is easy to see that if a negative torque, larger than the engine brake, is desired, the retarder has to be used. In the same way, if a positive torque is desired, the engine is the only actuator on the truck that can generate this torque. These thoughts lead to the second reason why an MPC-controller is interesting to study in this application: if the controller is aware of the control signal constraints, it should be able to make the actuator choice automatically. This means that no explicit switching regions or strategies need to be found.

A third reason why an MPC-controller is interesting to study is that it behaves much like an ordinary LQ-controller if no constraints are active. This behaviour is normally well-behaved and therefore desirable. The thought was also that the use of reference trajectories would be unnecessary and that, to some extent, fuel optimal trajectories would be produced.

6.2 Linearisation of the System Model

The original model of the system has six states: the distance to the lead vehicle, the velocity, the wheel speeds of the front and rear axles, and two states for the retarder (see section 4.1), i.e.

\[ x = (dist,\ v,\ \omega_f,\ \omega_b,\ Q_{ret,unsat},\ T_{ret,unsat})^T \tag{6.1} \]

In the MPC-controller a linear model is needed to predict future output signals, as can be seen in section 3.4.1. The distance between the truck and the lead vehicle is modelled according to (6.2), which is a linear expression.

\[ \frac{d}{dt}\,dist(t) = v_{lead}(t) - v(t) \tag{6.2} \]

The velocity and the wheel speeds of the front and rear axles are defined by (4.9), (4.8) and (4.7) respectively. To reduce the complexity of the calculations needed to predict future outputs, it is desirable to decrease the number of states. Assume that the front and rear axles rotate with the same speed, $\omega$, and that no slip occurs, i.e. that the following expression is fulfilled (see (2.1) with $\sigma = 0$)

\[ \omega = \frac{v}{r} \tag{6.3} \]

where $v$ is the velocity of the truck and $r$ is the wheel radius. The equations for the front axle speed (4.8) and the rear axle speed (4.7) can then be inserted into the equation for the velocity (4.9), which gives

\[ \dot v(t) = K\bigl(F_{engine+retarder}(t) - F_{rr}(t) - F_{ar}(t) - F_{slope}(t)\bigr) \tag{6.4} \]

where $F_{rr}$, $F_{ar}$ and $F_{slope}$ are defined by (4.4), (4.1) and (4.6) respectively, and $K$ and $F_{engine+retarder}$ are defined by the equations below

\[ K = \frac{1}{m + \dfrac{J_{w,tot} + n_f^2 n_g^2 (J_e + J_g)}{r^2}} \tag{6.5} \]

\[ F_{engine+retarder}(t) = \frac{n_g n_f \bigl(T_{ec}(t) - T_{ef}(t)\bigr) - n_f T_{ret}(t)}{r} \tag{6.6} \]

This reduces the original six states to four. The new state vector is

\[ x = (dist,\ v,\ Q_{ret,unsat},\ T_{ret,unsat})^T \tag{6.7} \]

The assumptions make it possible to reduce the number of states. However, none of them is exactly fulfilled in practice, so the simplification reduces the accuracy of the model. This loss of accuracy has been assumed negligible in this thesis.

The engine is controlled by specifying the amount of fuel injected, $m_{fi,desired}$. Therefore $T_{ec} - T_{ef}$ in (6.6) is replaced by the linear engine map given by (4.10). The engine map approximates how much torque the engine produces for a given $m_{fi,desired}$ and a certain engine speed. The result becomes

\[ F_{engine+retarder}(t) = \frac{n_g n_f}{r}\left(c_{fm,1}\, m_{fi,desired}(t) + c_{fm,2}\,\frac{30\, n_g n_f}{\pi r}\, v(t) + c_{fm,3}\right) - \frac{n_f T_{ret}(t)}{r} \tag{6.8} \]

The retarder model, defined by (4.20), has also been somewhat simplified. The power limitation and the oil flow limitation have been neglected. This does not decrease the number of states, but it reduces the number of non-linearities and the number of upper boundaries. Furthermore, the time delay has been omitted. This decreases the number of states, since delays are modelled by introducing extra states, as can be seen in section 3.4.7.
The simplified retarder model can be expressed as

\[
\begin{aligned}
\dot Q_{ret,unsat}(t) &= c_1\bigl(T_{ret,desired}(t) - Q_{ret,unsat}(t) - T_{ret,unsat}(t)\bigr) \\
\dot T_{ret,unsat}(t) &= c_2\, Q_{ret,unsat}(t) \\
T_{ret}(t) &= \begin{cases} 3000 & \text{if } T_{ret,unsat}(t) > 3000 \\ 0 & \text{if } T_{ret,unsat}(t) < 0 \\ T_{ret,unsat}(t) & \text{otherwise} \end{cases}
\end{aligned} \tag{6.9}
\]

In a first model, the slope of the road, $\alpha$, was assumed equal to zero all the time, but the prediction errors for the state vector (6.7), i.e. $\hat x - x$, then became very large. Therefore the last term in (6.4), i.e.

\[ a_{slope} = K F_{slope} = K m g \sin\alpha \tag{6.10} \]

was modelled as low frequency noise, since $\alpha$ changes slowly. According to [2] this can be written as

\[ \dot a_{slope} = 0 + v_1 \tag{6.11} \]

where $v_1$ is white noise. This worked well and small prediction errors were obtained. As a result of replacing (6.10) with (6.11), $a_{slope}$ becomes a state instead of a function of $\alpha$. This implies that $a_{slope}$ should be treated as a variable in the linearisation.

The dynamics of the truck, i.e. (6.4), and the retarder model, described by (6.9), are nonlinear and are thus not directly suitable for linear MPC. Some of the non-linearities arise because signals in the system, e.g. the injected fuel amount and the desired retarder torque, are bounded to certain intervals. However, constraints of this type only act as non-linearities if the signals try to cross the limits of the intervals. If these constraints are included in the MPC-controller, the signals are kept within their allowed regions by the optimisation. The boundaries may then be ignored when linearising.
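The role of the saturation in (6.9) can be illustrated with a short simulation. The sketch below integrates the two unsaturated retarder states with forward Euler and applies the limits only at the output. The values of the two retarder constants, the step size and the function names are hypothetical and chosen only to give a stable, runnable example.

```python
T_RET_MAX = 3000.0  # upper saturation limit of the retarder torque in (6.9)


def saturate(t_unsat):
    """Output non-linearity of (6.9): clamp the torque to [0, T_RET_MAX]."""
    return min(max(t_unsat, 0.0), T_RET_MAX)


def retarder_step(q, t_unsat, t_desired, c1, c2, dt):
    """One forward-Euler step of the two unsaturated (linear) retarder states."""
    q_next = q + dt * c1 * (t_desired - q - t_unsat)
    t_next = t_unsat + dt * c2 * q
    return q_next, t_next


def simulate(t_desired, c1=5.0, c2=5.0, dt=0.01, steps=2000):
    """Return the saturated torque after a constant request (illustrative constants)."""
    q, t_unsat = 0.0, 0.0
    for _ in range(steps):
        q, t_unsat = retarder_step(q, t_unsat, t_desired, c1, c2, dt)
    return saturate(t_unsat)
```

As long as the request stays inside the allowed interval, the output equals the linear state and the clamp is inactive, which is why the bounds can be left to the MPC constraints when linearising.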
Using the reduced model including $a_{slope}$, the state vector becomes

\[ x = (dist,\ v,\ Q_{ret,unsat},\ T_{ret,unsat},\ a_{slope})^T \tag{6.12} \]

Linearisation around the point

\[
\begin{aligned}
x_0 &= (dist_0,\ v_0,\ Q_{ret,unsat0},\ T_{ret0},\ a_{slope0}) \\
u_0 &= (m_{fi0},\ T_{ret,desired0}) \\
d_0 &= (dist_{desired0},\ v_{lead0}) \\
\alpha_0 &= 0
\end{aligned} \tag{6.13}
\]

gives the following expression for the truck model

\[
\begin{aligned}
\dot x(t) &= A x(t) + B u(t) + N_d d(t) + f_0 \\
z(t) &= M x(t) \\
y(t) &= C x(t)
\end{aligned} \tag{6.14}
\]

where

\[
A = \begin{pmatrix}
0 & -1 & 0 & 0 & 0 \\
0 & -C_1 v_0 + C_2 & 0 & C_3 & 1 \\
0 & 0 & -c_1 & -c_1 & 0 \\
0 & 0 & c_2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix},\quad
B = \begin{pmatrix} 0 & 0 \\ C_5 & 0 \\ 0 & c_1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix},\quad
N_d = \begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}
\]

\[
f_0 = \begin{pmatrix}
-v_0 + v_{lead0} \\
K\bigl(F_{engine+retarder0} - F_{rr0} - F_{ar0}\bigr) + a_{slope0} \\
-c_1\bigl(T_{ret0} + Q_{ret,unsat0}\bigr) + c_1 T_{ret,desired0} \\
c_2\, Q_{ret,unsat0} \\
0
\end{pmatrix}
\]

\[
M = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{pmatrix},\quad
C = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}
\]

\[
\begin{aligned}
F_{engine+retarder0} &= \frac{n_g n_f}{r}\left(c_{fm,1}\, m_{fi0} + c_{fm,2}\,\frac{30\, n_g n_f}{\pi r}\, v_0 + c_{fm,3}\right) - \frac{n_f T_{ret0}}{r} \\
F_{rr0} &= \frac{m g \cos(\alpha_0)}{1000}\Bigl(C_{rr,iso} + C_a\bigl((3.6 v_0)^2 - 80^2\bigr) + C_b\,(3.6 v_0 - 80)\Bigr) \\
F_{ar0} &= \frac{c \rho A\, v_0^2}{2}
\end{aligned}
\]

and

\[
\begin{aligned}
C_1 &= K\left(\frac{m g \cos(\alpha_0)}{1000}\, 2 \cdot 3.6^2\, C_a + c \rho A\right) \\
C_2 &= K\left(-\frac{m g \cos(\alpha_0)}{1000}\, 3.6\, C_b + c_{fm,2}\,\frac{30}{\pi}\left(\frac{n_g n_f}{r}\right)^2\right) \\
C_3 &= -\frac{K n_f}{r} \\
C_4 &= -\frac{K}{r} \\
C_5 &= \frac{K c_{fm,1}\, n_g n_f}{r}
\end{aligned}
\]

Note that there is a difference between $c_1$, $c_2$, which are constants defined in section 4.1.4, and $C_1$ and $C_2$. To reduce the calculation time of the linearisation, the constants $C_1, C_2, \ldots, C_5$ are calculated during the initialisation phase and then stored for future use.

As was mentioned in section 3.3.1, relinearisation should be done whenever the signals in the system change. How much the values are allowed to change before a new linearisation has to be performed depends on how much the non-linearities affect the dynamics of the system. The non-linearities in the truck model are not very significant, which implies that relinearisation could be done rather sparsely. Despite this, a new linearisation is performed at each sample instant.
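Because only the element $-C_1 v_0 + C_2$ of the $A$-matrix depends on the linearisation point, each relinearisation is cheap once the constants are stored. The sketch below checks this element against a numerical derivative of the velocity dynamics. It is a consistency check under stated assumptions: all parameter values are hypothetical, the force expressions are written as read from (6.4) to (6.8) and from the expressions for $F_{rr0}$ and $F_{ar0}$, and the additive slope state is left out because it does not depend on $v$.

```python
import math

# Hypothetical parameter values, chosen only to make the sketch runnable.
m, g, r = 40000.0, 9.81, 0.5             # mass [kg], gravity [m/s^2], wheel radius [m]
n_g, n_f = 1.0, 3.0                      # gear ratio and final-drive ratio
J_wtot, J_e, J_g = 100.0, 3.0, 0.5       # inertias [kg m^2]
c_fm1, c_fm2, c_fm3 = 10.0, -0.3, -50.0  # engine-map coefficients
C_rr_iso, C_a, C_b = 6.0, 1e-4, 0.02     # rolling-resistance coefficients
c_rho_A = 4.0                            # product c * rho * A of the drag term
alpha0 = 0.0                             # road slope at the linearisation point

K = 1.0 / (m + (J_wtot + n_f**2 * n_g**2 * (J_e + J_g)) / r**2)
W = m * g * math.cos(alpha0) / 1000.0    # common factor of the rolling resistance


def v_dot(v, m_fi, T_ret):
    """Nonlinear velocity dynamics without the additive slope state."""
    engine_speed = 30.0 * n_g * n_f * v / (math.pi * r)  # rad/s to rpm via 30/pi
    torque = c_fm1 * m_fi + c_fm2 * engine_speed + c_fm3
    F_drive = (n_g * n_f * torque - n_f * T_ret) / r
    F_rr = W * (C_rr_iso + C_a * ((3.6 * v)**2 - 80.0**2) + C_b * (3.6 * v - 80.0))
    F_ar = c_rho_A * v**2 / 2.0
    return K * (F_drive - F_rr - F_ar)


# Constants evaluated once, as in the initialisation phase.
C1 = K * (W * 2.0 * 3.6**2 * C_a + c_rho_A)
C2 = K * (-W * 3.6 * C_b + c_fm2 * 30.0 / math.pi * (n_g * n_f / r)**2)


def a22(v0):
    """Analytic partial derivative of v_dot with respect to v at v0."""
    return -C1 * v0 + C2


def a22_numeric(v0, m_fi0=50.0, T_ret0=0.0, h=1e-3):
    """Central-difference estimate of the same derivative."""
    return (v_dot(v0 + h, m_fi0, T_ret0) - v_dot(v0 - h, m_fi0, T_ret0)) / (2.0 * h)
```

Since the dynamics are quadratic in $v$, the central difference agrees with the analytic element up to rounding, so any mismatch would point to an error in the stored constants.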
The main reason is that the controller should be as general as possible. In that way, further extensions of the model are easy to carry through, even if the additions are nonlinear. Another reason is to see what problems could arise from frequent relinearisation. For example, the Kalman gain in the observer (see section 3.3.3) has to be calculated in another way.

It is also interesting to investigate how the total solution time of one sample in the MPC-controller changes if linearisation is performed compared to if it is not. The results can be seen in table 6.1. Note that the total solution time also includes the other necessary calculations, for example the optimisation and the estimation of the states. The relative change of the solution time has been calculated according to

\[ change = \frac{t_{calc,linearisation} - t_{calc,no\ linearisation}}{t_{calc,no\ linearisation}} \tag{6.15} \]

where $t_{calc,linearisation}$ is the total solution time for one sample when relinearisation is performed in each time step, and $t_{calc,no\ linearisation}$ is the total solution time for one sample when no relinearisation is performed.

Table 6.1: The difference in solution time between the cases when linearisation is performed and when it is not.

Hp    Hu    tcalc,linearisation [s]¹    tcalc,no linearisation [s]²    change [%]
3     3     0.146                       0.0976                         49.6
50    10    1.33                        1.30                           2.31
50    25    1.75                        1.72                           1.74

For small Hp and Hu, the linearisation results in a rather large increase of the solution time. However, for larger values of Hp and Hu, the increase is negligible. This is logical, since the other calculations take longer, whereas the model is the same regardless of Hp and Hu.

6.3 Conversion of the Time Continuous System Model to a Time Discrete Model

The linearised truck model, given by (6.14), is time continuous. Since the MPC-controller needs a time discrete model for predicting future outputs, (6.14) has to be time discretised.
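A discretisation of this kind can be sketched without any toolbox. The example below assumes zero-order hold on all inputs, which is the default method of Matlab's c2d, and treats the disturbance and the constant vector $f_0$ as extra inputs through the stacked input matrix $[B\ N_d\ I]$. It uses the standard identity that the exponential of the block matrix built from $A T_s$ and $B T_s$ contains both discrete matrices. The two-state system and all numbers are illustrative only, and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm_taylor(m, terms=30):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def discretise_zoh(a, b_aug, ts):
    """Return (Ad, Bd_aug) for x' = A x + B_aug w with w held constant over ts."""
    n, p = len(a), len(b_aug[0])
    big = [[0.0] * (n + p) for _ in range(n + p)]
    for i in range(n):
        for j in range(n):
            big[i][j] = a[i][j] * ts
        for j in range(p):
            big[i][n + j] = b_aug[i][j] * ts
    e = expm_taylor(big)
    return [row[:n] for row in e[:n]], [row[n:] for row in e[:n]]


# Toy two-state example (distance, velocity); all numbers are illustrative.
A = [[0.0, -1.0], [0.0, -0.1]]
B = [[0.0], [1.0]]
N_d = [[1.0], [0.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
B_aug = [B[i] + N_d[i] + I2[i] for i in range(2)]   # stacked inputs [B  N_d  I]

A_d, B_d_aug = discretise_zoh(A, B_aug, ts=0.1)
B_d = [row[0:1] for row in B_d_aug]                 # block acting on the control input
N_dd = [row[1:2] for row in B_d_aug]                # block acting on the disturbance
G_d = [row[2:4] for row in B_d_aug]                 # block acting on the constant f0
f0 = [0.2, -0.05]
f0_d = [sum(G_d[i][j] * f0[j] for j in range(2)) for i in range(2)]
```

Splitting the columns of the discretised input matrix afterwards recovers the separate discrete blocks, with the constant term obtained by multiplying the discretised identity block by $f_0$.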
The time discretisation is done in accordance with section 3.3.2 and is performed in Matlab by the command c2d. However, a small problem occurs: the linearised truck model is given in the form

\[
\begin{aligned}
\Delta\dot x &= A\,\Delta x + B\,\Delta u + N_d\,\Delta d + f_0 \\
\Delta z &= M\,\Delta x \\
\Delta y &= C\,\Delta x
\end{aligned} \tag{6.16}
\]

but the command c2d assumes that the system is in the form

\[
\begin{aligned}
\Delta\dot x &= A\,\Delta x + B\,\Delta u \\
\Delta y &= C\,\Delta x + D\,\Delta u
\end{aligned} \tag{6.17}
\]

Hence, $N_d\,\Delta d$, $f_0$ and $\Delta z = M\,\Delta x$ are missing. To transform the system described by (6.16) into a system described by (6.17), $\Delta d$ and $f_0$ were added to the input signal and $\Delta z$ was added to the output signal according to

\[
\begin{aligned}
\Delta\dot x &= A\,\Delta x + \begin{pmatrix} B & N_d & I \end{pmatrix} \begin{pmatrix} \Delta u \\ \Delta d \\ f_0 \end{pmatrix} \\
\begin{pmatrix} \Delta z \\ \Delta y \end{pmatrix} &= \begin{pmatrix} M \\ C \end{pmatrix} \Delta x
\end{aligned} \tag{6.18}
\]

where $I$ is an identity matrix of proper size. The system matrices used as parameters in c2d then become

\[
A_{c2d} = A,\quad
B_{c2d} = \begin{pmatrix} B & N_d & I \end{pmatrix},\quad
C_{c2d} = \begin{pmatrix} M \\ C \end{pmatrix},\quad
D_{c2d} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{6.19}
\]

where $A$, $B$, $N_d$, $M$ and $C$ are defined according to (6.14) and 0 is a zero matrix of proper size. The matrices returned by c2d are of the same sizes as the ones in (6.19). However, the expressions for the constraints and for the prediction of future outputs, which include $\Delta d$, $f_{0d}$ and $\Delta z$, require that the system is in the form (6.16). Therefore, the individual blocks have to be extracted from the matrices returned by c2d. The linearised and time discretised truck model will then be

\[
\begin{aligned}
\Delta x(k+1) &= A_d\,\Delta x(k) + B_d\,\Delta u(k) + N_{dd}\,\Delta d(k) + f_{0d} \\
\Delta z(k) &= M_d\,\Delta x(k) \\
\Delta y(k) &= C_d\,\Delta x(k)
\end{aligned} \tag{6.20}
\]

where the system matrices, i.e. $A_d$, $B_d$, $N_{dd}$, $f_{0d}$, $M_d$ and $C_d$, are the time discretised versions of the system matrices in (6.14).

Footnotes to table 6.1: ¹ See appendix B.1 for more information about the test procedure. ² See footnote 1.

6.4 Introducing Integral Action

Static errors occurred in the distance and the velocity. This was not desirable, and some sort of compensation was therefore needed. As was mentioned in section 3.4.1, this can be done by introducing integral action in the controller.
A problem that might occur, regardless of how the integral action is introduced, is that the measured value may be affected in some way before it is compared to the reference value. The integral action will then not work as expected.

[Figure 6.1: Closed-loop system including an observer. Blocks: controller F, system G and observer; signals r, e, u, y and y_observer.]

Real integral action means that $e$, defined from figure 6.1 as

\[ e = r - y \tag{6.21} \]

should go towards zero, even if for example constant disturbances are present. However, if integral action is introduced in the controller F in the figure above, it is instead $e = r - y_{observer}$ that goes towards zero. Thus, to get real integral action, $y_{observer} = y$ must hold. If all states are measured and little measurement noise is present, this is fulfilled, since the observer is then unnecessary and may be excluded. However, in this thesis an observer was used. There were two reasons for this: the measured states were affected by quantisation noise, the effect of which should be reduced, and one state was not measurable. To get as good integral action as possible, it was important to have small prediction errors, $\varepsilon$, defined as

\[ \varepsilon = y_{observer} - y \tag{6.22} \]

In the first attempt to model the truck (see section 6.2) the influence of the road slope was neglected. This resulted in large prediction errors. Since the prediction errors determined how accurate the integral action could be, the integral action was of no use in this case. However, when the main part of the road slope was modelled, by including $a_{slope}$, the prediction errors were reduced and real integral action was achieved.

Integral action can be introduced in several different ways. One way, which is described in section 3.4.1, is to penalise the difference of $u$, according to (6.23), instead of $u$ directly.

\[ \Delta u(k) = u(k+1) - u(k) \tag{6.23} \]

For several reasons, this is an attractive method of introducing integral action. One reason is that it is easy to include in the standard MPC-framework.
Another reason is that no problems with integrator windup will occur. However, when this method is used it is no longer possible to prevent simultaneous use of the control signals. Two different ways to avoid, or at least reduce, this problem are described in section 3.4.11. One is to penalise the cross terms, i.e. $u_i u_j$, and another is to use binary variables. The drawback with these methods is that the optimisation problem might become non-convex.

In a first attempt to introduce integral action in the controller, expression (6.23) was used in combination with a cross term penalty (see section 3.4.11). The weights on the cross terms (see (3.85)) were chosen quite large to avoid simultaneous use of the control signals. This worked well and no static errors were found. However, a test showed that the problem was non-convex. The weights were therefore decreased until they became small enough to eliminate the non-convexity. Unfortunately, simultaneous use of the control signals then arose.

A second attempt was to use expression (6.23) in combination with binary variables. This solved the non-convexity problem mentioned above but, as was written in section 3.4.11, the solution time increased considerably.

Instead of penalising $\Delta u$, integral action can be created by adding the integrator states

\[
\begin{aligned}
\dot x_{integrator,dist}(t) &= dist_{desired}(t) - dist(t) \\
\dot x_{integrator,v}(t) &= v_{desired}(t) - v(t)
\end{aligned} \tag{6.24}
\]

to the linear model given by (6.14). These states are penalised in the cost function by adding weights to the ordinary weight matrix, defined in section 3.4.1. After this modification the resulting weight matrix will be

\[
Q_1 = \begin{pmatrix}
Q_1 & 0 & 0 \\
0 & Q_{1,integrator,dist} & 0 \\
0 & 0 & Q_{1,integrator,vel}
\end{pmatrix} \tag{6.25}
\]

where $Q_1$ on the right-hand side is the original weight matrix. The controller will then try to drive the integrator states to zero. If $dist_{desired} - dist$ and $v_{desired} - v$ are non-zero, the integrator states will increase or decrease.
Finally, when the integrator states have increased or decreased enough, the controller will ‘understand’ that it has to prioritise these conditions over keeping the control signals small. Unfortunately, there are some drawbacks with this way of introducing the integral action. For example, a more complicated switch procedure is needed, the complexity of the optimisation problem increases because of the extra states, and integrator windup might occur. The advantage is that $u$ can be penalised directly. In that way, simultaneous use of the control signals can be prevented without the problem becoming non-convex.

How the integrator states should be weighted is a design choice. A large penalty has the same effect as a large $K_i$ in a PID-controller [1], i.e. a faster, but more oscillatory, settling towards the reference value. For a truck, it is often better to have a rather slow settling with small oscillations than a fast settling with large oscillations. The reason is that the driver is more sensitive to oscillations than to a small static error.

The two integrator states are never included at the same time. The reason is that in certain driving cases, i.e. when the truck is following another car, the distance is measured. Then the integrator state for the velocity is not needed, because the ordinary state for the distance, given by (6.2), does the same thing, i.e. goes towards infinity if a static error in the velocity is present. On the other hand, if the truck does not follow another car, the ordinary distance state is neglected. Therefore, the integrator state for the distance can be disabled. However, since the distance is not measured, the integrator state for the velocity has to be included.

One drawback with using these conventional integrator states, instead of penalising changes in the control signals, is that they are not protected against integrator windup.
This means that if a control signal is saturated, and the reference value of a state is not reached, the integrator of that state will wind up and become very large. This may in turn cause large overshoots, which in the cruise controller application can cause uncomfortable behaviour.

To handle the problem with integrator windup, the integrator states are prevented from growing when a control signal is saturated. Five important cases in which integrator windup might occur are when any of the following conditions is fulfilled

• (measured distance > reference distance) & (full throttle is being used)
• (measured distance < reference distance) & (full retarder is being used)
• (measured truck velocity > reference velocity) & (full retarder is being used)
• (measured truck velocity < reference velocity) & (full throttle is being used)
• (measured truck velocity ≥ ACC set speed) & (measured distance > reference distance)

If any of these five conditions is fulfilled, the integrator state is not updated. Note that the check of whether the integrator state is growing or not is included in the conditions above. This test makes it possible for the absolute values of the integrator states to decrease although an actuator is saturated.

6.5 State Estimation

The state estimator used in the controller is an extended Kalman filter. The theoretical background for this kind of filter can be found in section 3.3.3. The model used in the extended Kalman filter is the time discretised truck model (6.20), linearised around the last estimate of the current state. The theoretical background of linearisation of models can be found in section 3.3.1.
In each sample, the following operations are executed in order to estimate the current state of the system

\[
\begin{aligned}
\Delta\hat x(k) &= \hat x(k) - x_0(k) \\
\Delta u(k) &= u(k) - u_0(k) \\
\Delta\hat x(k+1) &= A(k)\,\Delta\hat x(k) + B(k)\,\Delta u(k) + f_{0d}(k) + K(k)\bigl(y(k) - C\hat x(k)\bigr) \\
\hat x(k+1) &= x_0(k) + \Delta\hat x(k+1)
\end{aligned} \tag{6.26}
\]

In each sample, the system is relinearised. This means that $A$, $B$ and $f_{0d}$ are time dependent and are calculated according to expression (6.26). The optimal Kalman gain, $K$, is here calculated from the time variable Riccati equation, as described in section 3.3.3.

The Kalman filter is only used to estimate the original system states, i.e. those presented in expression (6.12). This means that the K-matrix is all zero in the rows corresponding to the added integrator states introduced in (6.24). When a vehicle is present ahead of the truck, all original states need to be estimated. In this case, no modifications of the model or the K-matrix are needed. If the vehicle ahead is lost, a simulation of the distance to the target is started. This is done by setting the first row and column in the K-matrix to zero. By doing this, the first state becomes a pure simulation without any corrections from measurements. The simulation of the distance to the lead vehicle proceeds for a specified time. If this time elapses without any detection of a curve, the target is assumed lost and the “no target ahead mode” takes over.

In the “no target ahead mode”, the estimation of the first state, i.e. the distance to the vehicle ahead, is turned off. This is done by setting the first row, except the first element, in the A-matrix to zero. Further, the first rows in the B-matrix and the K-matrix are also set to zero. To prevent the measurement of the first state from affecting the other states, the entire first column in the K-matrix is also set to zero. Finally, the first row in the $f_{0d}$-vector is set to zero. The first element in the first row of the A-matrix is not set to zero.
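The matrix modifications for the two modes, together with one observer update of the form (6.26), can be sketched as follows. This is a minimal illustration with matrices stored as lists of rows. The helper names are hypothetical, and the Kalman gain is taken as given rather than computed from the Riccati equation.

```python
def zero_row(m, i):
    """Copy of m with row i set to zero."""
    return [[0.0] * len(row) if r == i else row[:] for r, row in enumerate(m)]


def zero_col(m, j):
    """Copy of m with column j set to zero."""
    return [[0.0 if c == j else x for c, x in enumerate(row)] for row in m]


def lost_target_gain(k):
    """Lost target: the distance estimate becomes a pure simulation."""
    return zero_col(zero_row(k, 0), 0)


def no_target_matrices(a, b, k, f0d):
    """No target ahead: nothing drives the distance state any more."""
    a2 = [row[:] for row in a]
    a2[0] = [a[0][0]] + [0.0] * (len(a[0]) - 1)   # keep only the integrator's 1
    return a2, zero_row(b, 0), zero_col(zero_row(k, 0), 0), [0.0] + list(f0d[1:])


def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def observer_step(a, b, k, c, f0d, x_hat, x0, u, u0, y):
    """One update of the form (6.26), returning the next state estimate."""
    dx = [p - q for p, q in zip(x_hat, x0)]
    du = [p - q for p, q in zip(u, u0)]
    innovation = [p - q for p, q in zip(y, mat_vec(c, x_hat))]
    terms = (mat_vec(a, dx), mat_vec(b, du), f0d, mat_vec(k, innovation))
    dx_next = [sum(t) for t in zip(*terms)]
    return [p + q for p, q in zip(x0, dx_next)]
```

With the modified matrices, a distance state that starts at zero stays at zero whatever the radar reports, while the remaining states are still corrected by their own measurements.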
This is because this element is used to hold the first state identically equal to zero. The element is a one, because the first state is a pure integrator. Another solution is to explicitly set the value of the state to zero in each sample. If this is done, both the entire first row in the A-matrix and the point of linearisation of the first state should be set to zero.

As mentioned in section 6.4, the two integrator states introduced in expression (6.24) are added to the ordinary states in (6.12). The first integrator state integrates the error in the distance to the vehicle ahead, and the second one integrates the error in the velocity. If the truck follows a vehicle ahead, the distance error integration is enabled and the velocity error integration is disabled. An integrator on the velocity error does not need to be present when the distance is used for feedback. In this case an integrator is, thanks to the physics, already present in the system.

When the cruise controller is working in the “no target ahead mode”, no distance to a vehicle ahead can be measured. This means that an integrator state for the velocity is needed. Thus, the integrator on the distance error is disabled and the velocity error is integrated instead. This gives zero static velocity error. The disabled integrator state is kept at zero in the same way as described above for the distance state.

6.6 State Machine

6.6.1 Introduction

The state machine used in combination with the MPC-controller is much reduced compared to the one used in the hybrid controller. One of the main tasks for the state machine in the hybrid controller was to choose between the use of engine and brakes. In the MPC-part, this is done directly by the controller itself. However, a state machine is still needed for other purposes. Its main task in this case is to change the weight matrices $Q_1$, $Q_2$ and $Q_3$. By changing the weight on a state, its reference tracking priority can be changed.
For example, if no target is present ahead of the truck, the weight on the distance state is set to zero. In this case, the distance is of no importance. For practical reasons, the first state is in this case also held at zero. The motivation is that, although there is no weight on a deviation from the reference distance, the distance state may become very large after a long period of running in the “no target mode”. In a truck implementation this could cause problems if the state finally overflows. It is also good for structural reasons to know that an unused state is set to zero.

Two flags are used to report whether the radar has contact with a target or not, or whether a target is being simulated. The first flag is called target_ahead_visible and tells whether the radar reports a target or not. The second flag is called target_ahead and tells whether there is a possible target ahead or not. This means that if a target is in sight of the radar, both flags have the value true. If the radar has lost contact with the target, but the target is being simulated as described in section 6.5, only the last flag has the value true.

6.6.2 States

The main discrete states in the hybrid solution have their counterparts in the MPC-solution. The discrete states used are

• No target ahead
• Follow mode
• Lost target
• Curve mode

The “no target ahead mode” is used when no other vehicle is reported by the radar. The entrance condition is that the flag target_ahead is false. In this mode the ACC system is supposed to act as an ordinary cruise controller, extended to use the retarder if necessary. In this case the velocity error integration is active and the distance error integration is, of course, inactive. The only state that is penalised is the velocity. The reference velocity is set to $v_{ref,ACC}$ and no constraints on the velocity are present.

“Follow mode” is the mode used if a vehicle is visible to the radar.
To enter this mode, both the flags target_ahead_visible and target_ahead should have the value true. In this mode, the distance error integration is enabled and the velocity error integration is disabled. A constraint on the velocity is set to vref,ACC. If the truck has a velocity higher than this limit when the constraint is introduced, the introduction is made in a smooth way. See more about smooth introduction of constraints in section 6.10. In the “follow mode” all states, except the velocity integrator state, are active and updated. The controlled variables are the distance to the vehicle ahead and the truck velocity. In this mode, the reported velocity of the vehicle ahead is stored. This information is used later if the target is lost and a simulation of the distance to the vehicle ahead is started in the “lost target mode”.

The settings used when the “lost target mode” is active are very similar to those used in “follow mode”. This mode is activated when the flag target_ahead is true and the flag target_ahead_visible is false. Zero reported distance to the lead vehicle means that no vehicle is in range in front of the truck. When the vehicle ahead disappears from the radar, it can be for several reasons. One reason is that the vehicle has turned off. Another is that the radar has lost contact due to a curve. It is very hard to immediately distinguish these two cases from each other. Therefore, the controller acts for a while as if the vehicle were still in sight. When this occurs, the previously stored information about the velocity of the vehicle ahead is used to simulate this vehicle for some specified time. The model of the vehicle ahead used in this simulation is that it has constant velocity. The distance between the truck and the simulated lead vehicle is simulated by disabling the feedback from measurements in the Kalman filter, as described in section 6.5.
The vehicle ahead is simulated until either a curve is detected or the chosen target simulation time has elapsed. If this time runs out without a curve being detected, it is interpreted as the vehicle ahead having turned off. This means that no vehicle is in front of the truck and the “no target ahead mode” is entered. If the yaw-rate sensor detects a curve before the specified time has elapsed, the assumption is made that the vehicle ahead has disappeared because of the curvature of the road. In this case the “curve mode” is entered.

In the “curve mode” the simulation of the vehicle ahead, which began in the “lost target mode”, is continued. This mode is a sub mode of “lost target” and it is entered if the reported yaw-rate exceeds the limit specified to detect a curve. In every sample in which the yaw-rate sensor reports that the truck is in a curve, the timer used in the “lost target mode” is reset. This means that as long as a curve is reported by the yaw-rate sensor, no timer is counting. When the curvature falls below the limit for being reported as a curve, the “lost target mode” is entered again and the timer starts counting. If the timer reaches the specified value, the “lost target mode” is left and the “no target ahead mode” is entered.

The use of integrator states is controlled by the logic. In the Matlab-code belonging to the different states, flags are set that tell code executed later on to modify the system matrices used in the MPC-controller and in the observer. The same approach is also used in the “no target ahead mode”, where the matrices are modified in order to hold the distance state at zero. How this works is further described in section 6.5.

6.6.3 Other Logic

There is also some other logic in the state machine. This logic is used in all modes and is placed outside the code for the modes presented in the previous section.
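The mode transitions of section 6.6.2 can be sketched as a small state machine. The sketch below is only an illustration under stated assumptions: the class name `ModeLogic`, the parameters `sim_time_limit` and `yaw_rate_limit`, and the choice to derive the target_ahead flag from the mode are all hypothetical, and the thesis implementation is written in Matlab rather than Python.

```python
from dataclasses import dataclass

NO_TARGET = "no target ahead"
FOLLOW = "follow"
LOST_TARGET = "lost target"
CURVE = "curve"


@dataclass
class ModeLogic:
    sim_time_limit: float   # hypothetical: how long a lost target is simulated [s]
    yaw_rate_limit: float   # hypothetical: yaw-rate threshold for a curve [rad/s]
    ts: float = 0.5         # sample time [s]
    mode: str = NO_TARGET
    timer: float = 0.0      # time spent in "lost target" outside curves [s]

    def step(self, target_ahead_visible: bool, yaw_rate: float) -> str:
        in_curve = abs(yaw_rate) > self.yaw_rate_limit
        if target_ahead_visible:
            # The radar reports a target: follow it.
            self.mode, self.timer = FOLLOW, 0.0
        elif self.mode in (FOLLOW, LOST_TARGET, CURVE):
            if in_curve:
                # The timer is reset in every sample where a curve is reported.
                self.mode, self.timer = CURVE, 0.0
            else:
                self.mode = LOST_TARGET
                self.timer += self.ts
                if self.timer >= self.sim_time_limit:
                    # No curve detected in time: the vehicle is assumed to have turned off.
                    self.mode, self.timer = NO_TARGET, 0.0
        else:
            self.mode = NO_TARGET
        return self.mode

    @property
    def target_ahead(self) -> bool:
        # True for a visible target as well as for a simulated one.
        return self.mode != NO_TARGET

    @property
    def distance_integration(self) -> bool:
        # Distance error is integrated whenever a target is followed or simulated;
        # otherwise the velocity error is integrated instead.
        return self.target_ahead
```

For example, with `ModeLogic(sim_time_limit=1.0, yaw_rate_limit=0.02)` a lost target is simulated for two samples outside curves before the “no target ahead mode” is entered.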
Logic is needed to handle the setting of the distance state if a target appears after a period with no visible targets ahead. When this happens, the distance state is initialised to the first value reported by the radar after the new vehicle has been detected. At the same time, the distance integration is enabled and the velocity integration is disabled. The distance state is also set if a new target is detected, i.e. if a rapid change in the reported distance is detected. This could have been handled by the observer itself if the observer gain for the distance state had been set to a high level. However, if the observer gain is chosen high for a state, noise can easily affect this state. With the structure used, noise is prevented from affecting the distance state and, at the same time, the state updates quickly if the target changes.

6.7 Switching Between Actuators

One of the main problems to be solved in this thesis was the switching between the use of engine and brakes. In the hybrid controller solution, this problem was solved by defining switch conditions that had to be fulfilled before an actuator switch could occur. In the MPC-part of this thesis, this task was supposed to be handled automatically by the controller. The main idea is that the MPC-controller should “understand” that it is not optimal to use the engine and the brakes at the same time. Due to the constraints on the control signals, the MPC-controller is aware that the brakes cannot be used to apply a positive torque to the truck. It similarly knows that it is impossible to inject a negative fuel amount into the engine. In this case, zero injected fuel amount means engine brake, which is a small negative torque (approximately 90 Nm). Thus, the controller knows that a small braking action can be performed by using only the engine. Only when the desired brake torque exceeds what the engine brake can deliver is it necessary to use the retarder.
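Why a cost that penalises both control signals avoids simultaneous use can be illustrated with a static, two-input toy problem. This is a sketch under assumptions, not the MPC-problem of the thesis: there is one time step, forces replace torques, the upper bounds are ignored, and every name and number (`k_engine`, `f_friction`, the weights) is hypothetical. The optimum is found exactly by enumerating which of the bounds u ≥ 0 are active.

```python
def allocate(f_desired, k_engine=10.0, f_friction=500.0,
             q=1.0, r_fuel=1e-3, r_ret=1e-6):
    """Minimise q*e**2 + r_fuel*u_fuel**2 + r_ret*u_ret**2 with u_fuel, u_ret >= 0.

    The force error is e = k_engine*u_fuel - f_friction - u_ret - f_desired,
    i.e. zero fuel gives the engine brake force -f_friction and the retarder
    can only brake. All parameter values are made up for illustration.
    """
    c = f_desired + f_friction  # e = k_engine*u_fuel - u_ret - c

    def cost(u_fuel, u_ret):
        e = k_engine * u_fuel - u_ret - c
        return q * e * e + r_fuel * u_fuel ** 2 + r_ret * u_ret ** 2

    # Both inputs at their lower bound.
    candidates = [(0.0, 0.0)]
    # Engine only (u_ret = 0): one-dimensional minimiser, clipped at zero fuel.
    candidates.append((max(q * k_engine * c / (q * k_engine ** 2 + r_fuel), 0.0), 0.0))
    # Retarder only (u_fuel = 0): one-dimensional minimiser, clipped at zero.
    candidates.append((0.0, max(-q * c / (q + r_ret), 0.0)))
    # Both free: stationary point of the unconstrained problem. Stationarity
    # requires u_fuel = -q*k_engine*e/r_fuel and u_ret = q*e/r_ret, which have
    # opposite signs, so this candidate is never strictly positive in both.
    e = -c / (1.0 + q * k_engine ** 2 / r_fuel + q / r_ret)
    u_fuel, u_ret = -q * k_engine * e / r_fuel, q * e / r_ret
    if u_fuel > 0.0 and u_ret > 0.0:
        candidates.append((u_fuel, u_ret))
    return min(candidates, key=lambda u: cost(*u))
```

In this toy problem a mild braking demand such as `allocate(-300.0)` is met by reducing the fuel only, while `allocate(-5000.0)` gives zero fuel and a positive retarder force, which mirrors the behaviour described above.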
A simplified model of the system is found in figure 6.2.

Figure 6.2: A simplified model of the truck. A mass m is acted on by the engine force Fengine ≥ −Fengine friction and the retarder force Fretarder ≥ 0, giving Ftotal = Fengine − Fretarder.

In this thesis, tests have been performed that show that it is possible to solve the actuator switch problem by using the QP-problem that occurs in the ordinary MPC-framework. The reason why this works is that it is not optimal, according to the cost function (3.26) with both control signals penalised, to use the engine and the brakes at the same time. Unfortunately, the method used to provide integration by penalising changes in the control signals can no longer be used if the controller is supposed to handle the switching as well (see section 6.4). When changes in the control signals are penalised, it no longer costs anything to use the actuators at the same time. In fact, due to different dynamics in the actuators, it can be cheaper to increase the torque from the engine than to decrease the torque from the brakes. This problem makes it necessary to use explicit integrator states, as in ordinary LQ, when the controller is supposed to handle the actuator switching as well.

Another possibility is to use so-called mixed integer programming. In this method, boolean variables can be used to prevent simultaneous use of the actuators. With these variables included, integration can be achieved by penalising changes in the control signals. More about this can be found in section 3.4.11.

6.8 State Constraints

When the truck is following a vehicle, the controller might want to reach a very high velocity. The maximal velocity allowed must however be restricted, partly because of legal demands and partly because the driver does not want the truck to go too fast.
This has been done by adding an upper velocity constraint according to

$$v \le v_{\max} \tag{6.27}$$

where

$$v_{\max} = \min\left(v_{\max,\text{legal demand}},\; v_{\text{ref},ACC}\right)$$

Since the velocity is the second state in the model, the constraint can be expressed as

$$z(k) \le b_z \tag{6.28}$$

where

$$z(k) = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \end{pmatrix} x(k) = M_{\text{constraints}}\, x(k), \qquad b_z = v_{\max}$$

This constraint is supposed to hold for each sample in the prediction and can therefore be written in vector form as

$$Z_1 \le \tilde{b}_{z,1} \tag{6.29}$$

where

$$Z_1(k) = \begin{pmatrix} z(k+1) \\ z(k+2) \\ \vdots \\ z(k+H_p) \end{pmatrix} = \begin{pmatrix} M_{\text{constraints}} & 0 & \cdots & 0 \\ 0 & M_{\text{constraints}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & M_{\text{constraints}} \end{pmatrix} \begin{pmatrix} x(k+1) \\ x(k+2) \\ \vdots \\ x(k+H_p) \end{pmatrix} = \tilde{M}_{\text{constraints}}\, X(k), \qquad \tilde{b}_{z,1} = \begin{pmatrix} v_{\max} \\ v_{\max} \\ \vdots \\ v_{\max} \end{pmatrix}$$

The maximal and minimal acceleration should also be limited. Legal demands, given in [20], define that the lower limit of the acceleration, amin, shall be −3 m/s² and the upper limit, amax, shall be 2 m/s². However, for comfort reasons the upper limit in this thesis has been chosen as 1 m/s² instead. These bounds on the acceleration can be expressed as

$$a_{\min} \le a(k) \le a_{\max} \tag{6.30}$$

However, in the standard MPC-framework only “less than”-constraints are allowed. The expression above must therefore be formulated as

$$\begin{aligned} a(k) &\le a_{\max} \\ -a(k) &\le -a_{\min} \end{aligned} \tag{6.31}$$

The framework also demands that the constraints are expressed in terms of the states.
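Since the acceleration is the derivative of the velocity, this can be done as

$$a(k+1) = \frac{v(k+1) - v(k)}{T_s} = \frac{M_{\text{constraints}}\, x(k+1) - M_{\text{constraints}}\, x(k)}{T_s} \tag{6.32}$$

The last equation is then written in vector form as (compare with the introduction of ΩU in section 3.4.1)

$$\frac{Z_2(k)}{T_s} = \frac{Z_1(k) - Z_1(k-1)}{T_s} = \frac{1}{T_s} \begin{pmatrix} M_{\text{constraints}} & 0 & \cdots & 0 \\ -M_{\text{constraints}} & M_{\text{constraints}} & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -M_{\text{constraints}} & M_{\text{constraints}} \end{pmatrix} X(k) - \frac{1}{T_s} \begin{pmatrix} M_{\text{constraints}}\, x(k) \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \frac{\Omega X(k) - \delta}{T_s} \tag{6.33}$$

In vector form, the constraints (6.31) become

$$\begin{pmatrix} Z_2(k) \\ -Z_2(k) \end{pmatrix} \le T_s \begin{pmatrix} \tilde{b}_{z,2} \\ \tilde{b}_{z,3} \end{pmatrix} \tag{6.34}$$

where $\tilde{b}_{z,2}$ and $\tilde{b}_{z,3}$ look like $\tilde{b}_{z,1}$, but with vmax replaced by amax and −amin respectively.

Since the model used by the controller is a linearisation of the real model, variables relative to the linearisation point, (x0, u0, d0, α0), must be used in the prediction of future output signals. These variables are denoted ∆Z and ∆U (see section 3.4.1). Therefore, the constraints must also be formulated in these variables. The constraints can then be formulated as

$$Z_0 + \Delta Z \le \tilde{b}_z \tag{6.35}$$

where

$$Z_0 = \begin{pmatrix} \tilde{M}_{\text{constraints}} \\ \Omega \\ -\Omega \end{pmatrix} X_0 + \begin{pmatrix} 0 \\ -\delta_0 \\ \delta_0 \end{pmatrix}, \qquad X_0 = \begin{pmatrix} x_0 \\ x_0 \\ \vdots \\ x_0 \end{pmatrix}, \qquad \delta_0 = \begin{pmatrix} M_{\text{constraints}}\, x_0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$

$$\Delta Z = \begin{pmatrix} \tilde{M}_{\text{constraints}} \\ \Omega \\ -\Omega \end{pmatrix} \Delta X + \begin{pmatrix} 0 \\ -\delta \\ \delta \end{pmatrix} = \tilde{M} \Delta X + \tilde{\delta}, \qquad \tilde{b}_z = \begin{pmatrix} \tilde{b}_{z,1} \\ T_s\, \tilde{b}_{z,2} \\ T_s\, \tilde{b}_{z,3} \end{pmatrix}$$

and ∆X is defined by (3.52) as

$$\Delta X = H \Delta x(k) + S \Delta U + P \Delta D + T f_{0d} \tag{6.36}$$

The expression (6.35) must be transformed into the standard form, given by (3.45), before it can be used in the optimisation. Inserting (6.36) into (6.35) and collecting the terms in ∆U on the left-hand side gives

$$\tilde{M} S \Delta U \le \tilde{b}_z - \tilde{M} H \Delta x(k) - \tilde{M} P \Delta D - \tilde{M} T f_{0d} - \tilde{\delta} - Z_0 \tag{6.37}$$

in which Az and bz can be identified as

$$\begin{aligned} A_z &= \tilde{M} S \\ b_z &= \tilde{b}_z - \tilde{M} H \Delta x(k) - \tilde{M} P \Delta D - \tilde{M} T f_{0d} - \tilde{\delta} - Z_0 \end{aligned} \tag{6.38}$$

Here, everything in bz is known before the prediction.

As was said in section 3.4.6, problems might occur with ‘hard’ constraints, i.e. constraints that are never allowed to be broken.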
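The stacked velocity and acceleration constraints of (6.29) and (6.34) can be checked numerically with a small sketch. It is a simplification made for illustration: it works directly on a list of predicted velocities instead of on M_constraints applied to the state vector, and the function names are hypothetical.

```python
def stacked_constraints(v_now, v_pred, v_max, a_min, a_max, ts):
    """Return (lhs, rhs) such that the constraints read lhs[i] <= rhs[i].

    v_pred holds the predicted velocities v(k+1), ..., v(k+Hp).
    """
    n = len(v_pred)
    z1 = list(v_pred)                            # velocities, as in (6.29)
    previous = [v_now] + list(v_pred[:-1])
    z2 = [v - p for v, p in zip(v_pred, previous)]  # velocity differences, as in (6.33)
    lhs = z1 + z2 + [-d for d in z2]
    # Acceleration limits are scaled by Ts because z2 holds differences, as in (6.34).
    rhs = [v_max] * n + [ts * a_max] * n + [-ts * a_min] * n
    return lhs, rhs


def violated(lhs, rhs, tol=1e-9):
    """Indices of the rows that break lhs <= rhs."""
    return [i for i, (left, right) in enumerate(zip(lhs, rhs)) if left > right + tol]
```

With `v_now = 22.0`, `v_pred = [22.4, 22.8, 23.6]`, `v_max = 23.0`, `a_min = -3.0`, `a_max = 1.0` and `ts = 0.5`, the last predicted sample breaks both the velocity limit and the upper acceleration limit.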
The figures below have been generated for a case where the initial velocity exceeds the velocity constraint. Figure 6.3 shows the flag that indicates whether the optimisation solver found a feasible solution or not. A zero means that no feasible solution was found and a one that a feasible solution was found.

Figure 6.3: The flag, which shows whether a feasible solution was found or not, when no slack variable is introduced. Zero corresponds to no feasible solution and one to a feasible solution. (Axes: feasible solution-flag against time [s].)

As can be seen in figure 6.3, no feasible solution was found for t = 0 s and t = 1 s. In figure 6.4 and figure 6.5, the calculated control signals can be found.

Figure 6.4: The desired retarder torque, Tret,desired [Nm], against time [s] when no slack variable is introduced.

Figure 6.5: The desired injected fuel amount, mfi,desired [mg/stroke], against time [s] when no slack variable is introduced.

When infeasibility occurs, it can be seen that Tret,desired becomes larger than 3000 Nm and that mfi becomes smaller than 0 mg/stroke. Hence, the optimisation solver has given control signals outside their allowed intervals. The reason is that when one constraint is broken, in this case the velocity constraint, the solver makes no attempt to fulfil any of the constraints. However, it is interesting to note that mfi is exactly as much smaller than 0 mg/stroke as Tret,desired is larger than 3000 Nm. This is an implementation choice in the Matlab command quadprog, which is the optimisation solver used in this thesis. For solvers in general, nothing can be said about how they handle infeasibility.

To solve the infeasibility problem, the constraints were ‘softened’. In this thesis, three different variants of doing this have been tested.
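The effect of softening can be made concrete with a one-step, scalar toy problem. It is only a sketch under assumptions: the velocity after one sample is chosen directly, the acceleration limits are treated as hard bounds, the cost is a single quadratic tracking term plus a linear slack penalty, and all names and numbers are hypothetical. The problem is small enough to be solved exactly by comparing the best point with and without violation.

```python
def soft_velocity_step(v0, v_ref, v_max, rho, a_min=-3.0, a_max=1.0, ts=0.5):
    """Minimise (v1 - v_ref)**2 + rho*eps  s.t.  v1 <= v_max + eps, eps >= 0,
    with v1 restricted to what the acceleration limits allow in one sample.

    Returns the chosen velocity v1 and the slack eps that was needed.
    """
    lo, hi = v0 + a_min * ts, v0 + a_max * ts   # reachable velocities

    def clamp(x, a, b):
        return min(max(x, a), b)

    def cost(v):
        return (v - v_ref) ** 2 + rho * max(0.0, v - v_max)

    candidates = []
    if lo <= v_max:
        # Best point that respects the velocity limit (eps = 0).
        candidates.append(clamp(v_ref, lo, min(hi, v_max)))
    if hi >= v_max:
        # Best point that violates the limit (eps = v1 - v_max).
        candidates.append(clamp(v_ref - rho / 2.0, max(lo, v_max), hi))
    v1 = min(candidates, key=cost)
    return v1, max(0.0, v1 - v_max)
```

When the start velocity is so high that `v0 + a_min*ts > v_max`, the hard-constrained problem has no solution, whereas the softened problem returns the hardest allowed braking together with a non-zero slack. When the limit can be met, a large `rho` leaves the slack at zero.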
In the first variant, one slack variable was added to all constraints in all sample times and it was linearly penalised in the cost function. The optimisation problem then looks like (3.67). When the same simulation as earlier was performed, only feasible solutions were found, as can be seen in figure 6.6.

Figure 6.6: The flag, which shows whether a feasible solution was found or not, for the case when linear penalty is used. Zero corresponds to no feasible solution and one to a feasible solution. (Axes: feasible solution-flag against time [s].)

In figure 6.7 the values of the slack variable have been plotted.

Figure 6.7: The slack variable, ε, against time [s] when linear penalty is used.

As can be seen, the slack variable becomes non-zero for some samples. One could expect that it would become non-zero exactly when infeasibility occurs (compare figure 6.3 with figure 6.7). This does however not happen, because the solution changes when other control signals are applied. Thus, this variant of ‘softening’ worked as expected. To get a good result it was necessary to choose a large slack variable weight, ρ, as was mentioned in section 3.4.6. In this thesis ρ was chosen as 10^15. Note that this value of ρ solves the infeasibility for the problem in this thesis. In general, however, problems with numerical noise might occur if ρ is chosen this large.

Another variant that was tested is to still have just one slack variable but to penalise it quadratically instead. The cost function in (3.67) is then replaced by (3.63). In figure 6.8, it can be seen that the infeasibility problems are eliminated as expected.

Figure 6.8: The flag, which shows whether a feasible solution was found or not, for the case when quadratic penalty is used.
Zero corresponds to no feasible solution and one to a feasible solution. (Axes: feasible solution-flag against time [s].)

In the theory presented in section 3.4.6, it was mentioned that a problem which might occur for this variant is that the slack variable becomes non-zero more often than when linear penalty is used. However, in this thesis, where ρ has been chosen large, this does not happen, as can be seen in figure 6.9.

Figure 6.9: The slack variable, ε, against time [s] when quadratic penalty is used.

The last variant was to have one slack variable for each sample time, as can be seen in (3.61), and to penalise these slack variables linearly. The cost function is then described by (3.64). This variant also solved the problem with infeasibility, but the solution time for the optimisation increased greatly. The solution times for one sample, when Hp = 50 and Hu = 10, have been measured and are presented in table 6.2.

Table 6.2: The solution times for the different variants of slack variable introduction.

Variant                                                             Solution time [s] (1)
No slack variable                                                   1.76
One slack variable, linear penalty                                  1.71
One slack variable, quadratic penalty                               1.62
One slack variable in each time sample, i.e. 50, linear penalty     4.45

The final choice in this thesis was to use the first variant. It gives a short solution time due to few optimisation variables and still eliminates the problem. Moreover, it gives exactly the same solution as the ‘hard’ constrained problem would have done, as long as the slack variable is not used (see section 3.4.6).

6.9 Choice of Prediction Horizon

To get nice and smooth settlings for the system, the prediction horizon, Hp, should according to [2] be chosen so that it covers a normal settling. This means that

$$H_p T_s \ge T_{\text{settling}} \tag{6.39}$$

where Tsettling is the settling time for the system. The settling time is normally determined by applying steps to the inputs of the system.
Since the system is non-linear, different step sizes give different settling times. Therefore, it is necessary to investigate the worst case. Because of disturbances and bounds on the input signals, the system will in the worst case never reach stationarity, e.g. in a steep downhill where the truck does not manage to brake. Hence, an appropriate Tsettling had to be determined by testing. The settlings for the closed-loop system became smooth, and it seemed as if the controller ‘understood’ how the open-loop system worked, if Tsettling was chosen to be around 25 s.

To get HpTs sufficiently long, either Hp or Ts can be modified. Most often, Hp is chosen as the design variable, but Ts can also be chosen if it is not given by some external circumstances. The drawback of choosing Hp large is that the complexity of both the prediction and the optimisation increases. This is due to the creation of the large prediction matrices, for example H and S, and to the numerous optimisation variables and constraints (see section 3.4.1). As is described in section 3.4.5, the number of optimisation variables, nu ⋅ Hu, can be decreased through a variable change

$$U = \Omega_M U_M \tag{6.40}$$

Moreover, it is possible to replace the constraints on U with constraints on UM, if ΩM is chosen in a way that makes U piecewise constant or piecewise linear. The number of state constraints, i.e.

$$Y \le b_y \tag{6.41}$$

is however only dependent on Hp and is not affected by the variable change.

(1) See appendix B.2 for more information about the test procedure.

In table 6.3 the solution times for one sample, tcalc, are shown for a number of different Hp and Hu. The optimisation solver was Matlab’s quadprog, which uses an ‘active set’-algorithm.

Table 6.3: The solution time, tcalc, for different Hp and Hu.
Hp     Hu     tcalc [s] (2)
3      3      0.35
20     10     0.66
30     10     0.89
40     10     1.3
50     10     2.0
60     10     2.9
50     25     2.7
50     50     6.2

Table 6.3 shows that the solution time increases faster than linearly with Hp or Hu. This means that if Hp or Hu is increased by a factor of two, the solution time increases by a factor larger than two, at least for values of Hp larger than 20. This makes it even more desirable to choose Hp and Hu small, to get a fast algorithm. An optimisation solver based on an ‘interior point’-algorithm, which is less dependent on the number of active constraints, might have reduced the solution time [9].

The advantage of a large Hp is that Ts can be small. Since the sample time is the time between two consecutive readings of the inputs, disturbances will be detected earlier and can therefore be corrected for earlier. If Hp is small, on the other hand, a fast problem is obtained, but Ts must be chosen large to fulfil (6.39). The disturbances can then affect the system for a longer time before their influence is eliminated. This can have large consequences. Assume that a car with a velocity of 50 km/h turns in ahead of the truck, which travels at a speed of 90 km/h. This corresponds to a speed difference of 11 m/s. If Ts is chosen as 2 s, the distance between the two will, in the worst case, decrease by 22 m before the car is detected. If the car turns in at a distance of 20 m in front of the truck, problems may then occur.

To simultaneously avoid situations like the one mentioned above, fulfil (6.39) and still have a rather short solution time, Hp, Hu and Ts were chosen as

Hp = 50, Hu = 10, Ts = 0.5 s

or

Hp = 46, Hu = 10, Ts = 0.5 s

Which set of constants is chosen depends on which ΩM is used. The first set corresponds to the case where linear interpolation between the control signals is not used, and the second set to the case where it is.
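The two numerical arguments above, the horizon condition (6.39) and the worst-case distance lost before a new vehicle is detected, can be reproduced with a short sketch. The function names are hypothetical and the calculation only restates the arithmetic of the text.

```python
import math


def min_prediction_horizon(t_settling, ts):
    """Smallest integer Hp that fulfils Hp * Ts >= T_settling, as in (6.39)."""
    return math.ceil(t_settling / ts)


def worst_case_closing_distance(v_truck_kmh, v_car_kmh, ts):
    """Distance [m] by which the gap can shrink during one sample interval."""
    relative_speed = (v_truck_kmh - v_car_kmh) / 3.6   # km/h to m/s
    return relative_speed * ts
```

With a settling time of 25 s and Ts = 0.5 s the first function gives Hp = 50. For the cut-in example, the second function gives roughly 22 m for Ts = 2 s and roughly 5.6 m for Ts = 0.5 s.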
One further reason to choose a prediction horizon that fulfils (6.39) is to reduce the influence of certain disturbances on the settling, for example uphill or downhill slopes.

(2) See appendix B.3 for more information about the test procedure.

6.10 Smooth Introduction of New State Constraints

If a new state constraint is introduced it may lead to an infeasible solution, as described in section 3.4.6 about soft constraints. Infeasibility is avoided by using soft constraints instead of hard constraints. Soft constraints can be interpreted as if the penalty were raised extremely high when the limit on the state is crossed. The controller then tries to escape from this region as fast as possible. If a new constraint is introduced while the system is already in the forbidden region, the controller tries to move the system to a state that is feasible. This action is normally very abrupt, because the constraint penalty must be set very high to prevent the system from entering the region in normal operation. To be able to have this distinct limit and at the same time have newly introduced constraints fulfilled gently, new constraints are introduced slowly.

In this thesis, no constraint on the velocity is used in the “no target ahead mode”. When the controller switches to “follow mode”, a velocity constraint is introduced. This prevents the velocity of the truck from exceeding vref,ACC. For example, if the truck is travelling in the “no target ahead mode” with the velocity vno target and a mode switch is made to “follow mode”, an upper velocity constraint is to be introduced at vconstraint. If vno target > vconstraint, the truck might brake very hard to fulfil the new constraint. To avoid this, the new constraint is initially set to vno target. The constraint is then lowered towards the desired constraint at vconstraint. How fast this is performed is a design parameter.
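A minimal sketch of such a fade-in is given below. It assumes a linear ramp with a hypothetical design parameter `rate` (in m/s per second) and returns one velocity limit per predicted sample, so that the falling limit is visible over the whole prediction horizon. The function name and signature are illustrative only.

```python
def constraint_profile(k, k_init, v_start, v_constraint, rate, ts, horizon):
    """Velocity limits for the predicted samples k+1, ..., k+horizon.

    k_init is the sample at which the constraint was introduced and v_start
    the truck velocity at that sample. The limit starts at v_start and is
    lowered linearly until it reaches v_constraint, where it stays.
    """
    start = max(v_start, v_constraint)   # no ramp needed if already below the limit
    limits = []
    for j in range(1, horizon + 1):
        elapsed = max(0, k + j - k_init) * ts
        limits.append(max(v_constraint, start - rate * elapsed))
    return limits
```

For example, a truck entering “follow mode” at 25 m/s with a desired limit of 22 m/s, `rate = 1.0` and `ts = 0.5` gets limits that fall by 0.5 m/s per sample and reach 22 m/s after six samples.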
To notify the MPC-controller about the lowering of the constraint, the constraint fade-out is also used in the simulation performed in each sample. This makes the controller aware that the velocity constraint is falling, and the new constraint is therefore fulfilled without any spikes.

The method is illustrated in figure 6.10. At the time tinit, the truck enters the state “follow mode” at a velocity that is higher than the desired limit vconstraint. The constraint is then lowered linearly until it reaches the actual limit vconstraint at the time tfinal. When this desired limit is met, the smoothing procedure is finished and the constraint is fixed.

Figure 6.10: A new constraint is introduced smoothly at tinit. At tfinal the constraint has reached its intended level vconstraint. (Axes: velocity against time.)

7 Comparison between hybrid controller and MPC

7.1 Complexity

Hybrid controller

To begin with, it is easier to understand how the hybrid controller works. The flowcharts in Stateflow make it possible to visualise the structure in a good way, and the controllers are just ordinary PID-controllers. This also makes it easy to design ad hoc solutions for different ideas. However, since many different driving situations have to be considered in order to obtain a good result, the flowcharts may still become slightly complicated.

An advantage of using ordinary PID-controllers is that it is rather simple to get a feeling for how the parameters in the controllers affect the settling of the system. Therefore a good result may be obtained without too much work.

One further advantage of the hybrid controller is that it does not need a mathematical model of the system. This holds as long as no time delay has to be compensated for.

MPC-controller

MPC is on the other hand rather hard to overview.
This is both because the state machine and everything else is written in plain code and because the mathematics is more complicated. Especially the optimisation solver, but also the linearisation, the observer and the linear algebra, can be rather hard to understand. The optimisation also makes it harder to get a feeling for how the settling of the system is affected by different design parameters. However, if the MPC-controller is general enough, it might be possible to design a block in Simulink, which a user can add to Simulink-schemes. In that case, even engineers who are not so familiar with the mathematics behind MPC would be able to use it.

The state machine in the MPC-controller is smaller than the one in the hybrid controller and there are fewer switch criteria. The smaller number of switch criteria is due both to the state machine containing fewer states and to no switch criteria between the engine and the retarder being needed. Further, the switch criteria are simpler in the MPC-controller.

The MPC-controller needs a linear, mathematical model to be able to predict future states. This implies that all the different parts of the truck must be modelled and that the non-linearities in the resulting model must be linearised, which increases the work effort needed. Furthermore, the state feedback law demands that all the states in this model are measurable. Since this is not the case for the truck model, an observer is needed.

The synchronisation of the different parts of the MPC-controller, for example the state machine and the observer, is one further complicating factor. It can easily happen that the state machine changes mode one sample before the observer is changed, or vice versa.

7.2 Robustness in practice

When comparing the two methods it is interesting to see if there are any differences in the sensitivity to model errors. Some parameters are of special interest for this application.
For example, the retarder model used is very uncertain and the actual mass of the truck is not known, but estimated.

In this section, tests are performed to show what happens if the truck has an actual mass, mactual, that is different from the one used in the controller, mcontroller. In particular, it is interesting to see how the switch strategy in the hybrid solution is affected and how the MPC-controller handles this model error. In practice, an error in the estimated mass means that a specified actuator torque gives a different truck acceleration than expected. In the hybrid solution, no model is used in the linear controllers. As mentioned in section 5.4, the only explicit mass dependence in this solution is found in the calculation of the switch point aenvironment. Because of this, the primary interest for this case is to find out if excessive mode switches might occur. In the MPC-solution, the mass appears explicitly in the model. If this mass differs from the actual one, the simulation that the MPC-controller performs in each sample will be incorrect.

The driving case used to illustrate the behaviour when the mass used in the controller is incorrect is a settling where a new lead vehicle appears in front of the truck at an initial distance of 100 m. The situation is repeated three times at the road slopes −4°, 0° and 4°.

To test how the behaviour of the controllers is affected when the estimated mass is incorrect, three tests with different actual truck masses have been performed:

Test 1: mcontroller = 25 000 kg and mactual = 20 000 kg
Test 2: mcontroller = 25 000 kg and mactual = 25 000 kg (reference case)
Test 3: mcontroller = 25 000 kg and mactual = 40 000 kg

Test 1

In this test, the mass of the truck is smaller than the controller expects. An applied torque will in this case give the truck a higher acceleration than expected.
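The size of this effect can be estimated with a rough calculation. The sketch below assumes that the acceleration is simply the net force divided by the mass, and it ignores rotating inertia and the mass dependence of rolling resistance and road slope forces. The function name is hypothetical.

```python
def acceleration_ratio(m_controller, m_actual):
    """Actual acceleration divided by the acceleration the controller expects,
    for the same net force, under the simplification a = F / m."""
    return m_controller / m_actual
```

Under this simplification, the truck in test 1 accelerates 1.25 times as much as the controller expects, and in test 3 only 0.625 times as much.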
The plots corresponding to the hybrid solution are found in appendix A.3 and those corresponding to the MPC-solution are found in appendix A.4. As can be seen in these plots, the effect of the incorrect mass is negligible. In the hybrid solution some unmotivated mode switches occur around the time 197 s. At this time instant, the controller is in the “no target ahead mode”. Because of the incorrect mass, the controller cannot decide for a while whether to use the cruise controller or the downhill cruise controller. After approximately one second, the switches have stopped.

Test 2

This is the reference case, where the controller is using the correct mass of the truck. The plots from this test are presented in appendices A.1 and A.2.

Test 3

In the last test, the actual mass of the truck is larger than the one known to the controller. This means that a certain torque does not give an acceleration as high as expected. Because the larger mass gives slower dynamics of the system, this case is easier to handle than the one in test 1. As can be seen in appendices A.5 and A.6, no controller-related problems occur in either of the two solutions. After 225 s, the uphill slope is too steep. In this case, the engine is not strong enough to keep the velocity in the uphill where the road slope is 4°.

It is interesting to note that although the MPC is highly dependent on the model, it handles the model error quite well.

7.3 Hardware requirements

For an ACC, it is necessary to have a rather short sample time (see section 6.9). For the hybrid controller this is no problem, since rather few calculations have to be performed in each sample time. For the MPC, however, many different and heavy calculations have to be done. The major part is the optimisation, but the generation of the different matrices might also become very demanding if Hp and Hu are chosen large.
As can be seen in section 6.9, the solution time becomes long even on an ordinary PC, and in the truck it will become even longer, since the onboard computer is less powerful. The solution time can probably be shortened if at least some parts of the MPC are implemented directly in C-code instead of Matlab-code. However, the only realistic way to use MPC in the ACC-application is probably to use so-called explicit MPC (see section 3.4.1).

7.4 Handling time delays in the retarder

The time delays were handled in rather different ways in the two controllers. In the hybrid controller no compensation for the time delay was done, because the result in simulation was satisfactory. However, if it turns out that compensation is needed in the real case, it might be a little complicated because it has to be done externally, i.e. by an Otto-Smith loop. One drawback of an Otto-Smith loop is that the controller becomes model dependent, as can be seen in [16].

In the MPC, the time delay was omitted from the model to decrease the number of states that had to be included in the simulation (see section 6.2). This is not so important, since in MPC time delays can be handled very simply, as can be seen in section 3.4.7.

7.5 Switching between the actuators

In the hybrid controller, most switches between the actuators are performed when

$$a_{\text{environment}} = a_{\text{desired}} \tag{7.1}$$

as can be seen in section 5.3. This has solved many problems. For example, no bumps appear and it is rather simple to introduce hysteresis to avoid chattering. However, for some cases this sort of switch criterion was not appropriate or not sufficient. In these cases, other switch criteria had to be designed, which takes time. Furthermore, the large number of switch criteria means that it is easy to make mistakes.

In the MPC-controller, the switches between the actuators are solved by the optimisation (see section 6.7), according to the model. This means that no switch criteria have to be designed. This is a major advantage of MPC.
In the MPC-controller, the switches are moreover performed completely bumplessly.

7.6 Integral action

In the hybrid controller, integral action was included in the lowest layer (see section 5.4), simply by including an integral part in the acceleration controllers. Since separate controllers control the engine and the retarder, there is no risk that these actuators are used at the same time. The problem with integrator windup was solved by using a differential PI-controller (see section 3.2) instead of an ordinary PI-controller.

For the MPC, on the other hand, the introduction of integral action is quite hard. In this thesis, two different ways of doing this were tested. One method was to penalise changes in the control signals (see sections 3.4.1 and 6.4), rather than their absolute values. The advantage of this method was that no integrator windup would occur, and the drawback was that the actuators might be used at the same time. This could be avoided either by penalising the cross terms between the control signals (see section 6.4), or by including extra constraints on them according to (3.86). Regardless of which solution was chosen, the optimisation problem could, or even would, become non-convex. The best way is still the latter, i.e. to include extra constraints, since it guarantees that the control signals are never used simultaneously. However, the optimisation of this problem took much more time than an ordinary QP, as described in section 3.4.11.

Another method was to add extra integrator states. Simultaneous use of the control signals could then be avoided, but instead problems arose with windup and with how to enable and disable the integrator states when the driving situation changes (see section 6.4).

7.7 Constraints on the signals and states

In the hybrid controller, constraints on the control signals are introduced by using min/max-functions, for example as u = min(u_calculated, u_max).
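A minimal sketch of how min/max-limiting of the control signal can be combined with a PI-controller so that no integrator windup occurs is given below. It assumes the incremental (velocity) form of the PI-controller, which may differ in detail from the differential PI-controller of section 3.2. The class name, gains and limits are hypothetical and chosen only for the example.

```python
def clamp(u, u_min, u_max):
    """Limit a control signal with min/max-functions."""
    return max(u_min, min(u, u_max))


class IncrementalPI:
    """PI-controller in incremental form (an assumed formulation).

    Only the change of the control signal is computed in each sample and it
    is added to the previous, already limited, output. No separate integral
    state exists that could grow while the output is saturated.
    """

    def __init__(self, kp, ki, dt, u_min, u_max, u0=0.0):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.u_min, self.u_max = u_min, u_max
        self.u = clamp(u0, u_min, u_max)
        self.prev_error = 0.0

    def step(self, error):
        du = self.kp * (error - self.prev_error) + self.ki * self.dt * error
        self.prev_error = error
        self.u = clamp(self.u + du, self.u_min, self.u_max)
        return self.u


pi = IncrementalPI(kp=1.0, ki=1.0, dt=0.5, u_min=0.0, u_max=2.0)
saturated = [pi.step(1.0) for _ in range(5)]  # output reaches u_max and stays
released = pi.step(-1.0)                      # error changes sign
print(saturated, released)
```

In this sketch the output leaves the upper limit in the same sample as the error changes sign, which is the behaviour expected from a controller without windup.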
Limiting a state, however, can be difficult in the hybrid controller. The reason is that an ordinary PID-controller does not consider the limits when calculating the control signal. In some cases this can be solved by using, for example, the min-max-structure described in [16].

In an MPC-controller, both sorts of constraints are rather easily included. The optimisation knows the intervals within which the control signals and the states should be kept. Furthermore, it knows how the states react to different values of the control signals, since a model of the system is included. In that way, control signals that keep the system within the given boundaries can be calculated. However, small problems might still occur if ‘hard’ constraints on the states are used, as can be seen in section 6.8. This can easily be solved by introducing ‘soft’ constraints.

8 Simulations

The figures for the first scenario can be found in appendices A.1 and A.2, and the figures for the second scenario in appendices A.7 and A.8.

8.1 Description of Scenario 1

In the first scenario, referred to as “long test”, three identical appearances of a lead vehicle have been simulated for different road slopes, as can be seen in the height profile figure. In the first part of the scenario the road goes downhill, in the middle part it is flat and in the last part it goes uphill. When no lead vehicle is present, the distance and the lead vehicle velocity equal zero. At the time when the lead vehicle appears, the distance and the lead vehicle velocity abruptly change to non-zero values.

In this scenario, the desired velocity of the truck is set to 80 km/h and the desired time gap is set to 3 s. The initial distance is chosen as 100 m and the lead vehicle is assumed to hold a constant velocity of 70 km/h. This implies that the truck is supposed to settle at a distance of 58.3 m.
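The settling distance of 58.3 m follows from the desired time gap and the velocity that the truck must hold behind the lead vehicle. The short Python check below reproduces the number; the function names are introduced only for this example.

```python
def kmh_to_ms(v_kmh):
    """Convert a velocity from km/h to m/s."""
    return v_kmh / 3.6


def desired_distance(time_gap_s, v_kmh):
    """Steady-state distance [m] = time gap [s] * velocity [m/s]."""
    return time_gap_s * kmh_to_ms(v_kmh)


d_settle = desired_distance(3.0, 70.0)
print(round(d_settle, 1))          # 58.3
print(round(100.0 - d_settle, 1))  # 41.7 m to close from the initial 100 m
```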
8.1.1 Comments to simulation results

Hybrid controller

The scenario starts with downhill and no lead vehicle. In this case, the ECC, which is a combination of the CC and the DC, is the effectuating controller (see section 5.5.1). In the velocity plot, small oscillations can be seen in the beginning. This is both because it takes some time before the retarder delivers the requested torque and because the DC is not parameterised well enough for this model. After a short while the oscillations have stopped and the desired velocity is obtained.

At the time instant 50 s, a lead vehicle appears. Since the initial distance is longer than 1.2 times the desired distance, the “long distance”-controller takes over (see section 5.5.2). A small dip in the desired retarder torque can be found when the switch is performed. The reason is that it is impossible to read the control signal from the DC, and therefore a bumpless transfer is not obtainable. Since the lead vehicle velocity is lower than the truck velocity, it is possible to calculate a trajectory that gives a perfect settling (see section 5.5.2). This trajectory implies that a constant retardation is held, which can be verified in the truck velocity, since it decreases linearly. After about 25 s, the desired distance and velocity are reached without any overshoot at all. At that moment the “normal distance”-controller takes over, and no bump occurs in the desired retarder torque. The reason is that, in contrast to the switch between the DC and the “long distance”-controller, it is possible to read the control signal from the “long distance”-controller. Therefore, the integrator part in the “normal distance”-controller can be set to a value that gives the same desired torque.

At the time instant 90 s, the lead vehicle is simulated as lost by setting the distance and the lead vehicle velocity equal to zero. However, the truck velocity is unaffected for 4 s.
The reason is that the “lost target”-controller takes over and waits to see if a curve is detected (see section 5.5.4). Since no curve is detected, the desired velocity is to be resumed. As in the “no target”-mode, the ECC is the active controller in the “lost target”-mode. Although the road goes downhill, the CC, which controls the engine via the desired injected fuel amount, is used for some seconds before the DC takes over.

At the time instant 125 s, the downhill ends and the road becomes flat. A small dip in the velocity can then be seen. However, in the real world the change from downhill to flat is never this abrupt, and the dip will then probably not occur.

The same procedure as for downhill, i.e. a lead vehicle that appears and disappears, is repeated for flat road and for uphill. It can be seen that the settlings of the distance and the velocity look the same in all three cases. The difference between them is which of the actuators, i.e. the engine or the retarder, is used the most. In the downhill, the retarder is of course the most used actuator, but this role is taken over by the engine when the road becomes flat or goes uphill.

MPC-controller

For the MPC-controller, the “no target”-mode is active in the beginning of the scenario. In this mode, only the truck velocity is controlled (see section 6.6). A small peak in the velocity can be seen at start-up, but it is corrected very quickly. After approximately 10 s, the controller has reached the retarder torque needed to keep the desired velocity. It then holds this torque very steady until the lead vehicle appears.

When the lead vehicle appears at the time instant 50 s, the controller changes mode to the “target ahead”-mode. In this mode, both the distance and the velocity are controlled. The effectuated retardation during the settling behind the lead vehicle is not constant. Therefore, the velocity does not decrease linearly as in the hybrid-controller case.
In the distance and velocity figures, it can be seen that a longer time elapses before these variables reach their desired values when the MPC-controller is used than when the hybrid controller is used. The reason is that the MPC-controller uses somewhat smaller control signals. These variables could of course have been penalised harder in the cost function, but then the safety margin against simultaneous use of the actuators would have decreased.

When the lead vehicle disappears at the time instant 90 s, the mode in the MPC-controller is changed to the “lost target”-mode. As for the hybrid controller, this mode waits for 4 s to see if a curve is detected. In this case no curve appears, and the “no target”-mode is entered. This mode resumes the desired velocity. It can be seen that the engine is active for a few seconds, as for the hybrid controller, to obtain the desired velocity. This is a critical point for the MPC-controller: when the desired velocity is to be resumed, simultaneous use of the control signals can easily occur. As can be seen in the plots for the desired retarder torque and the injected fuel amount, this does not happen. It can also be seen that a switch from the engine to the retarder is performed already before the desired velocity is reached. The reason is that the MPC-controller predicts future states, and hence knows when to switch between the control signals in order to avoid an overshoot.

At the time instant 125 s, the downhill ends and the road becomes flat. A small dip then occurs, but as was mentioned in the hybrid controller case, the slope never changes this abruptly in the real world.

The same procedure as for downhill is repeated for flat road and for uphill. The difference is that the most used control signal changes from the retarder torque to the injected fuel amount. The settlings for the three different road slopes look very much the same.
However, small differences can be found, especially in the distance plot. The reason is that no explicit action is taken to achieve equal settlings, as in the hybrid controller, where a reference trajectory was calculated. The long prediction horizon will nevertheless give settlings that are more equal.

8.2 Description of Scenario 2

The second scenario is referred to as “cut-in”. In this scenario, the truck follows a vehicle at the desired distance. Suddenly, a new vehicle with high velocity turns in ahead of the truck at a much shorter distance than the desired one. This vehicle then brakes hard in order to keep its distance to the vehicle that was ahead of the truck before the new one appeared. The manoeuvre is supposed to simulate that the truck is overtaken while following a car.

The road is an ordinary flat road, i.e. there is no uphill or downhill. The desired truck velocity is 80 km/h and the desired time gap is set to 3 s. The first lead vehicle travels at 70 km/h, which together with the desired time gap gives a desired distance of 58.3 m. The velocity of the new lead vehicle is initially 90 km/h and then decreases linearly to 70 km/h in 10 s. This vehicle appears at a distance of 20 m.

In this scenario, the main problem is the short distance to the new lead vehicle, which makes the control system want to brake. However, braking may not be necessary if the overtaking car recedes from the truck fast enough. Then the distance will increase even if the truck keeps its velocity. If, on the other hand, the relative velocity between the truck and the new lead vehicle is low, it might be unsafe not to brake (see section 5.5.2).

8.2.1 Comments to simulation results

Hybrid controller

In the beginning of the scenario, the truck follows the first lead vehicle. The truck is initiated at the correct desired distance with the same velocity as the lead vehicle.
Since the distance and the truck velocity have the correct values, the first mode is the “normal distance”-mode (see section 5.5.2). In this mode the ECC, which is a combination of the CC and the DC, is used. Since the road is flat, the CC is the effectuating controller in the ECC. The integrator in the CC is zero at the beginning, which leads to a small dip in the truck velocity. After a few seconds, the integrator has increased and an overshoot of about 0.1 km/h can be seen in the truck velocity. The overshoot is not eliminated. Furthermore, it can be seen that the distance is initially 58.3 m, but decreases towards 55 m because the truck velocity is slightly too high. This behaviour may seem strange, but it is caused by a mode switch to the “hold”-mode. This mode is used if the distance and the truck velocity are close enough to their desired values (see section 5.5.2). In this mode, the controller just holds the velocity constant. If the road conditions are steady, as in this case, holding the velocity implies that the control signals also become constant, as can be seen in the figure for the desired injected fuel amount.

At the time instant 70 s, the new lead vehicle appears. Since the new distance is shorter than before and the new lead vehicle velocity is higher than the earlier one, a switch to the “cut-in”-mode is performed. In this mode, it is checked whether the relative velocity is larger than 5 km/h. If so, the truck is supposed to keep its velocity. As can be seen by comparing the truck velocity and the lead vehicle velocity, the relative velocity is larger than 5 km/h during the first 7.5 s. In the same time interval, the truck velocity is unaffected, as expected. When the relative velocity falls below 5 km/h, a timer is started. As long as the relative velocity is larger than zero, the truck maintains its velocity for 5 s before braking is allowed. This is the case in the time interval between 77.5 s and 80 s.
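The time instants 77.5 s and 80 s can be checked against the scenario definition in section 8.2. The Python sketch below assumes that the truck holds exactly 70 km/h during the cut-in (in the simulation it is about 0.1 km/h faster) and that the lead vehicle velocity decreases linearly from 90 km/h to 70 km/h during the 10 s following its appearance at 70 s. The function names are introduced only for this example.

```python
T_CUT_IN = 70.0   # lead vehicle appears [s]
RAMP = 10.0       # duration of the linear velocity decrease [s]
V_START = 90.0    # initial lead vehicle velocity [km/h]
V_END = 70.0      # final lead vehicle velocity [km/h]
V_TRUCK = 70.0    # assumed constant truck velocity [km/h]
D_START = 20.0    # distance when the lead vehicle appears [m]


def crossing_time(threshold_kmh):
    """Time at which the relative velocity has dropped to the threshold."""
    return T_CUT_IN + RAMP * (V_START - V_TRUCK - threshold_kmh) / (V_START - V_END)


def distance_at(t):
    """Distance [m] during the ramp, by integrating the relative velocity."""
    dt = t - T_CUT_IN
    rel_kmh_s = (V_START - V_TRUCK) * dt - 0.5 * (V_START - V_END) / RAMP * dt * dt
    return D_START + rel_kmh_s / 3.6


t_timer = crossing_time(5.0)   # relative velocity falls below 5 km/h
t_zero = crossing_time(0.0)    # relative velocity reaches zero
print(t_timer, t_zero, round(distance_at(t_zero), 1))
```

Under these assumptions the distance has grown from 20 m to roughly 48 m when the relative velocity reaches zero, although the truck has not braked at all.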
At the time instant 80 s, the relative velocity equals zero. A switch to the corresponding distance controller is then performed, even though the timer has not yet reached 5 s. At this time, the distance is approximately 47 m, which implies that the “normal distance”-controller is to be used. This controller brakes somewhat, which results in a truck velocity of approximately 68.2 km/h. In the figures for the control signals, it can be seen that the braking consists of little more than releasing the accelerator. The retarder torque is only 120 Nm. At the time instant 140 s, the distance and the truck velocity are again close enough to the desired values to cause a mode switch to the “hold”-mode.

MPC-controller

The truck is initiated at the correct desired distance with the same velocity as the lead vehicle. Since the truck follows another vehicle, the first mode is the “target ahead”-mode (see section 6.6). In contrast to the hybrid controller, there is no “hold”-mode in the MPC-controller. The control signals are still rather constant, since the lead vehicle holds a constant velocity. However, small peaks and dips occur in the injected fuel amount. The reason is that the velocity oscillates due to quantisation.

At the time instant 70 s, the new lead vehicle appears. A small dip in the truck velocity can then be seen, because the injected fuel amount becomes zero. This is caused by the short distance to the new lead vehicle. In this case, the distance error, i.e. the difference between the desired and the actual distance, dominates the cost function. However, the relative velocity is large, which implies that the distance increases fast. After a short while, when the distance has increased enough, the velocity error is instead the dominating term in the cost function. Since the lead vehicle velocity is higher than the truck velocity, the truck must increase its velocity to decrease the velocity error.
This can be seen as the large peak in the truck velocity figure. However, after 5 s the lead vehicle velocity has decreased enough to make the distance error the dominating error again. In order to decrease the truck velocity enough to obtain the desired distance, the engine is shut off and the retarder is activated, as can be seen in the figures for the control signals. However, both the distance error and the velocity error are relatively small at this time, which gives a small desired retarder torque. Moreover, the retarder is activated for only 4 s. Thereafter the engine is activated again. The desired distance and truck velocity are achieved without any notable overshoot. As in the beginning of the scenario, small peaks and dips can be seen in the velocity and the injected fuel amount. The reason is the same, i.e. the quantisation of the truck velocity.

9 Conclusions

In this thesis, control strategies for an adaptive cruise controller have been developed. Two different solutions have been tested: first a hybrid controller and second an MPC-controller. One of the main objectives of the thesis was to present strategies that solve the problem of switching between the use of engine and brakes. Both solutions solve this problem, but in different ways.

The hybrid solution was found to be a rather nice solution with low mathematical complexity. If a graphical programming interface like Stateflow is used, the structure of the modes in the controller can be visualised, which makes it easier to understand the purpose of each mode and how the modes interact. The controllers used in the different modes are ordinary PID-controllers. This makes it easy to understand and change the parameters, which in turn implies that most engineers who have been in contact with control theory can handle the controller and choose the parameters.
In the hybrid controller solution, a switch strategy involving a dynamic switch point called a_environment was used. The variable a_environment predicts the acceleration of the truck if neither the engine nor the brakes are used. The controller structure used is a so-called cascade structure. The controller in the outer loop returns a desired acceleration based on the relative velocity and the distance to the lead vehicle. If the desired acceleration exceeds the switch point a_environment, it is forwarded to the engine controller in the inner loop, and if it falls below a_environment, the retarder controller is used instead. Because of the dynamic behaviour of the switch point, it is automatically adjusted for variations in road slope and other external load disturbances.

In contrast to the hybrid solution, the MPC-solution has rather high mathematical complexity. This complexity may be reduced for the ACC-engineer if a standard MPC-function is developed first. The MPC may then be handled even with only basic knowledge of control theory, because of the intuitive function of the MPC combined with the similarities to an ordinary LQ-controller. Although the interface between the controller and the engineer may be simplified as described above, it will probably always be more complicated than that of the hybrid solution. An advantage of an MPC-controller is that it looks forward into the future and therefore has the possibility to use the actuators in another way if a future control signal saturation is predicted. Another advantage of MPC is that it handles time delays in the system in a structured way.

The MPC-solution solves the switch problem in either of two ways. The first way is to use the fact that the MPC-controller knows the model of the truck, including the operational limits of the control signals.
If this knowledge is combined with a penalty on the use of the control signals, the MPC-controller will normally not use the engine and the brakes simultaneously, because it is not optimal. The second way to solve the switch problem when an MPC-controller is used is to replace the ordinary QP-problem with an MIQP-problem. In the latter, binary variables are also allowed, and it is therefore possible to explicitly prohibit simultaneous use of engine and brakes.

In the MPC-part of this thesis, it is shown that an extended Kalman filter can be used to extend linear MPC to also handle models that are slightly nonlinear. Further, so-called “soft constraints” have been evaluated and found to solve the problem that infeasible solutions might occur due to constraints on signals in the system.

10 Propositions to Future Work

10.1 Explicit MPC

As has been mentioned previously in the thesis, one of the main problems with MPC is the time it takes to solve the optimisation problem in each sample. Unfortunately, the long solution time today makes it impossible to use the strategy in practice in applications where a short sample time is necessary. To be able to run MPC at high sampling rates on slow hardware, research is being carried out on a method called explicit MPC. The gain with this method is that the hard optimisation work is moved from the slow hardware used on-line to more powerful computers off-line. Because of this, it is sometimes said that the solution is precalculated. The result of the optimisation is stored in a table, and the on-line calculations are reduced to tree searches in this table. An extension of this thesis is to use explicit MPC and test the result in a real truck. More information about the method can be found in [10] and [11].

10.2 Interior point solver

In this thesis, a so-called active-set optimisation solver has been used. There exists another type of solver, called interior-point solver.
The interior-point solver has the advantage over the active-set solver that its computational complexity is no worse than some polynomial function of parameters such as the number of constraints or the number of variables [9]. This can be compared with the active-set method, which in the worst case can have a complexity that is exponential in these parameters. The difference in performance appears especially when there are many active constraints. Normally, there is at least one active constraint in the ACC-application. The reason is that the actuators do not operate simultaneously, i.e. there is always at least one actuator with the lower control signal constraint active. An extension of the work already done in this thesis is to see how much the computational time can be decreased if an interior-point solver is used.

10.3 More advanced reference signal pre-filtering

The ability of the MPC-framework to preview future reference signals has not been fully used in this thesis. One way to make the MPC-controller act smarter is to make “smart guesses” of how the reference signals will change in the future. In the ACC-application, the reference distance and the reference velocity are based on how the lead vehicle acts. It is obviously hard to make a completely reliable guess. However, because the controller re-evaluates its control signal calculations in every sample, a slightly wrong guess in one sample is not crucial. If the reference signals are filtered in a smart way, it is perhaps possible to make the controller understand, for example, that it is not necessary to lower the speed if a cut-in situation occurs.

11 Bibliography

[1] Torkel Glad and Lennart Ljung (1989). Reglerteknik Grundläggande teori. Studentlitteratur, Lund. ISBN 91-44-17892-1.
[2] Torkel Glad and Lennart Ljung (1997). Reglerteori, Flervariabla och olinjära metoder. Studentlitteratur, Lund. ISBN 91-44-00472-9.
[3] David Björkdahl (2002). Control principles for adaptive cruise control. Thesis work, Scania CV AB, Södertälje.
[4] Tomas Selling (2001). A modular truck model for real time simulations. Thesis work, Scania CV AB, Södertälje.
[5] Gianantonio Bortolin (2000). Brake Blending for improved dynamic response of secondary brake performance. Thesis work, Scania CV AB, Södertälje.
[6] Fredrik Eriksson and Leif Pudas (2002). Adaptive cruise control using rapid prototyping. Thesis work, Scania CV AB, Södertälje.
[7] Stefan Pettersson (1999). Analysis and design of hybrid systems. Dissertation, Chalmers University of Technology, Göteborg, Sweden.
[8] Johan Löfberg (2001). Linear Model Predictive Control – Stability and Robustness. Licentiate thesis, Linköping University, Linköping, Sweden.
[9] Jan Marian Maciejowski (2002). Predictive Control with Constraints. Addison-Wesley Pub Co, 1st edition.
[10] Tor A. Johansen and Alexandra Grancharova (2002). Approximate explicit constrained linear model predictive control via orthogonal search tree.
[11] Tor A. Johansen. Approximate Explicit Receding Horizon Control of constrained Nonlinear Systems.
[12] Current position and development trends in air-brake systems for Mercedes-Benz commercial vehicles.
[13] LU Jiang et al. (2000). “New Adaptive Cruise Control Method”. Journal of Beijing Institute of Technology, Vol. 9, No. 4, Beijing, p. 428-433.
[14] Jerry D. Woll (1997). Monopulse Radar for Intelligent Cruise Control. SAE Technical Paper Number 972669, SAE Future Transportation Technology Conference, San Diego, California, USA, 1997 Aug 06 – 1997 Aug 08.
[15] Scania (2003). Scania data sheets. http://www.scania.se/products/trucks/datasheet/pages/eng/eng480sp.html. (Acc. 2003-01-17)
[16] Torkel Glad et al. (2002). Digital Styrning Kurskompendium. Course literature, ISY, LiTH, Linköping.
[17] Per-Erik Danielsson et al. (2002). Bilder och Grafik 2002. Course literature, ISY, LiTH, Linköping.
[18] Willibald Prestl et al. (2000). The BMW Active Cruise Control ACC. SAE Technical Paper Number 2000-01-0344, SAE 2000 World Congress, Detroit, Michigan, USA, 2000 March 06 – 2000 March 09.
[19] Thomas Kailath et al. (2000). Linear Estimation. Prentice Hall, New Jersey, USA. ISBN 0-13-022464-2.
[20] International Organisation for Standardisation (2000). Road Vehicles – Adaptive Cruise Control Systems – Performance requirements and test procedures.

Appendix A Simulation results

A.1 Hybrid controller plot
Hybrid solution: m_actual = m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: five panels plotted against time [s]: distance [m]; truck velocity v and lead vehicle velocity v_lead [km/h]; injected fuel amount [mg/stroke]; retarder torque [Nm]; road slope, shown as height [m].]

A.2 MPC-controller plot
MPC solution: m_actual = m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: same five panels as in A.1.]

A.3 Robustness test
Hybrid solution: m_actual = 20 000 kg, m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: same five panels as in A.1.]

A.4 Robustness test
MPC solution: m_actual = 20 000 kg, m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: same five panels as in A.1.]

A.5 Robustness test
Hybrid solution: m_actual = 40 000 kg, m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: same five panels as in A.1.]

A.6 Robustness test
MPC solution: m_actual = 40 000 kg, m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: same five panels as in A.1.]

A.7 Hybrid controller plot: Cut-in situation
Hybrid solution: m_actual = m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: five panels plotted against time [s]: distance [m]; truck velocity [km/h]; lead vehicle velocity [km/h]; injected fuel amount [mg/stroke]; retarder torque [Nm].]

A.8 MPC-controller plot: Cut-in situation
MPC solution: m_actual = m_controller = 25 000 kg, v_ref,ACC = 80 km/h, t_hw = 3 s
[Figure: same five panels as in A.7.]

Appendix B Test Procedures

This appendix describes the hardware, the software and the driving situations used in the different measurements of the solution time for one sample.
B.1 Relinearisation

In the test that compares the solution time with and without relinearisation, the following computer and program were used:

Processor: Pentium III, 1 GHz
RAM: 128 MB
OS: Windows NT 4
Matlab: 6.1.0.450 Release 12.1

In this test, the truck had no lead vehicle. The truck velocity was initiated to 80 km/h, which was below the velocity constraint. For each set of prediction horizon and control horizon, the test was performed twice with relinearisation and twice without. The value presented in the table is the mean value of the solution times for the two tests in each case.

In order to eliminate the influence of the initiation of the simulation, the time measurement was not begun until the simulation time was equal to 10 seconds. At that moment, a clock was started that measured the time it took to simulate 90 seconds. This time is referred to as the solution time. The sample time used was 0.5 seconds, which implies that 181 samples were performed. By dividing the solution time by the number of samples, the solution time per sample was obtained.

Between two measurement simulations, a reference simulation was performed in order to verify that the time had not changed, i.e. that the conditions had not changed. The reference simulation was chosen as the case where the prediction horizon and the control horizon were both equal to 3 and relinearisation was performed in each sample.

B.2 Number of Slack Variables and their respective Penalty

In the test that compares the solution times for different numbers of slack variables and different ways of penalising the slack variables, the following computer and program were used:

Processor: Pentium III, 1 GHz
RAM: 128 MB
OS: Windows NT 4
Matlab: 6.1.0.450 Release 12.1

In this test, the truck was supposed to follow a car with a velocity of 80 km/h. The truck was initiated at the correct distance but with a velocity of 85 km/h.
In order to provoke infeasibility, the upper limit for the velocity was set to 75 km/h.

The test procedure was similar to the one described above. In this case, however, the number of slack variables and the slack variable penalties were varied instead of the control horizon, the prediction horizon and whether relinearisation was performed. The reference simulation was the case where no slack variable was used.

B.3 Prediction Horizon and Control Horizon

The test showing how the solution time changes with the prediction horizon and the control horizon used the following computer and software:

Processor: Pentium III, 600 MHz
RAM: 128 MB
OS: Windows 98
Matlab: 6.1.0.450 Release 12.1

In this test, the truck was supposed to follow a car travelling at 80 km/h. The truck was initialised at the correct distance and with the correct velocity. The upper limit for the velocity was set to 85 km/h in order to avoid infeasibility.

The test procedure was similar to the one in case 1 (Section B.1). However, relinearisation at every sample was enabled, and only the prediction horizon and the control horizon were varied. The reference simulation was the same as in case 1.

På svenska

Detta dokument hålls tillgängligt på Internet – eller dess framtida ersättare – under en längre tid från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår. Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för ickekommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns det lösningar av teknisk och administrativ art.
Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart.

För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/

In English

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Daniel Axehill and Johan Sjöberg
