# Inertial Navigation and Mapping for Autonomous Vehicles

Martin Skoglund

Linköping studies in science and technology. Dissertations. No. 1623

Division of Automatic Control
Department of Electrical Engineering
Linköping University, SE–581 83 Linköping, Sweden
Linköping 2014

[email protected]
www.control.isy.liu.se

ISBN 978-91-7519-233-8
ISSN 0345-7524

Copyright © 2014 Martin Skoglund. Printed by LiU-Tryck, Linköping, Sweden 2014.

Cover illustration: The cover illustration is left intentionally black!

*To Maria*

## Abstract

Navigation and mapping in unknown environments is an important building block for increased autonomy of unmanned vehicles, since external positioning systems can be susceptible to interference or simply inaccessible. Navigation and mapping require signal processing of vehicle sensor data to estimate motion relative to the surrounding environment and to simultaneously estimate various properties of that environment. Physical models of sensors, vehicle motion and external influences are used in conjunction with statistically motivated methods to solve these problems. This thesis mainly addresses the three navigation and mapping problems described below.

First, we study how a vessel with known magnetic signature and a sensor network with magnetometers can be used to determine the sensor positions and simultaneously determine the vessel's route in an extended Kalman filter (EKF). This is a so-called simultaneous localisation and mapping (SLAM) problem with a reversed measurement relationship.

Second, previously determined hydrodynamic models for a remotely operated vehicle (ROV) are used together with the vessel's sensors to improve the navigation performance using an EKF.
Data from sea trials is used to evaluate the system, and the results show that especially the linear velocity relative to the water can be accurately determined.

The third problem addressed is SLAM with inertial sensors (accelerometers and gyroscopes) and an optical camera contained in a single sensor unit. This problem spans three publications. We study how a SLAM estimate, consisting of a point-cloud map, the sensor unit's three-dimensional trajectory and velocity as well as its orientation, can be improved by solving a nonlinear least-squares (NLS) problem. The NLS problem minimises the predicted motion error and the predicted point-cloud coordinate errors given all camera measurements, and is initialised using EKF-SLAM. We show how NLS-SLAM can be initialised as a sequence of almost uncoupled problems with simple, and often linear, solutions; this also scales much better to large data sets than EKF-SLAM. The results obtained using NLS-SLAM are significantly better with the proposed initialisation method than if started from arbitrary points. Finally, a SLAM formulation using the expectation maximisation (EM) algorithm is proposed. EM splits the original problem into two simpler problems and solves them iteratively; here the platform motion is one problem and the landmark map is the other. The first problem is solved using an extended Rauch-Tung-Striebel smoother, while the second problem is solved with a quasi-Newton method. The results using EM-SLAM are better than those of NLS-SLAM both in terms of accuracy and complexity.

## Popular Science Summary (Populärvetenskaplig sammanfattning)

We humans carry out a great number of navigation and mapping routines every day without even reflecting on them, from travelling between home and work to discovering new, exciting environments in an unfamiliar city. For navigation in familiar surroundings we use our senses and the mental map of the world built up from previous experiences.

Navigating in unknown environments is harder, since we must structure a new mental map from all our impressions and at the same time perceive where in that map we are. Fortunately, there is a wide range of aids, such as maps, compasses and satellite positioning systems, that make this easier for us. Navigation and mapping in unknown environments is also an important building block for the increased autonomy of unmanned vehicles, since external positioning systems can be sensitive to interference or simply unavailable. Navigation and mapping require, often extensive, signal processing of the vehicle's sensor data to estimate how it moves relative to its surroundings and to simultaneously estimate various properties of those surroundings. Physical models of sensors, vehicle motion and external influences are used together with statistically motivated methods to solve these problems as well as possible. This thesis mainly addresses three navigation and mapping problems, described below.

Security-critical maritime environments, such as harbours, cooling-water installations at nuclear power plants and other protected sites, require constant surveillance. To detect and track foreign ships and underwater vehicles in these environments, sensors placed in a network on the seabed can be used. Such networks only work reliably if the position and orientation of the sensors on the seabed are known. We study how a ship with known magnetic signature and a sensor network with three-axis magnetometers can be used to determine the sensor positions and simultaneously determine the ship's route with a so-called extended Kalman filter (EKF). This is a simultaneous localisation and mapping (SLAM) problem, but with a reversed measurement relationship. We present an analysis of how accurately the sensor positions can be determined for a given route, as well as the converse: how accurately a route can be determined when the sensor positions are known. Furthermore, a sensitivity analysis shows that if the ship is equipped with a positioning system, errors in sensor orientation and in the magnetic signature can be suppressed and the position accuracy of the sensors improved. The system is evaluated using simulations.

The second problem addressed is the modelling of a remotely operated vehicle (ROV) and sensor fusion. ROVs are used in a range of demanding underwater applications, such as tunnel and hull inspection, welding on oil rigs at depths too great for ordinary divers, and archaeological expeditions. ROVs have limited space for navigation and mapping sensors, and these must also be relatively cheap. Moreover, external positioning systems are often unavailable under water, so independent, robust navigation methods are required. We study how previously determined hydrodynamic submodels for an ROV can be used together with the vehicle's own sensors to improve navigation performance. The hydrodynamic models describe the vehicle's motion through the surrounding water, where the motion is generated by the five thrusters. We present models for all sensors, expressed in a body-fixed frame, and these are used together with the hydrodynamic models in an EKF. Experimental data from trials in lake Vättern are used to evaluate the system, and the results show that, above all, the linear velocity relative to the water can be determined accurately.

The third problem addressed is SLAM with inertial sensors (accelerometers and gyroscopes) and an optical camera in a combined sensor unit. This problem spans three publications. In the first publication we study how a SLAM estimate, consisting of a point-cloud map with correct scale, the sensor unit's three-dimensional trajectory and velocity as well as its orientation, can be improved by solving a nonlinear least-squares problem. The initial SLAM estimate is obtained with EKF-SLAM, and the improved estimate is obtained by minimising the predicted motion error over the whole trajectory and the predicted point-cloud coordinates given all camera measurements, with the inertial sensors treated as a known input. The results are evaluated against highly reliable reference data, where the sensor unit is mounted at the tool position of an industrial robot and the environment is measured by hand. In the second publication we show how the nonlinear least-squares problem for SLAM can be initialised with a sequence of almost decoupled problems with simple, and usually linear, solutions. The advantage of this method over EKF-SLAM is that it can be applied to much larger data sets. We also show that the result of the nonlinear least-squares problem is better with the proposed initialisation method than if it is started from arbitrary points. Several steps of the method are evaluated with Monte Carlo simulations and experimental data. In the third publication an alternative formulation of the nonlinear least-squares problem is studied, based on the expectation maximisation (EM) algorithm. EM handles complex problems by splitting them into two simpler problems and solving these iteratively; in the proposed method the sensor unit's trajectory is one problem and the point-cloud map is the other. The EM method is initialised with the method described above, and results on experimental and simulated data show smaller mean errors than the nonlinear least-squares method, while also being computationally more efficient.

## Acknowledgments

The quote "I travel light. But not at the same speed." – Jarod Kintz, perfectly summarises that my long journey has finally reached its end. This would of course not have been possible without the support and encouragement from colleagues, friends and family.
My first acknowledgement is directed to my supervisor, Prof Fredrik Gustafsson: your inspiring tempo, professional sharpness and relaxed attitude to the never-ceasing flood of tight deadlines are impressive. Thanks for dragging me along this fascinating ride. Thanks!

I would also like to thank my co-supervisor, Prof Thomas B Schön. Thomas, you have a very professional attitude and are an excellent researcher and reviewer with a bunch of cool estimation skills up your sleeve. Thanks! Various parts of this thesis have also been proof-read by Dr Zoran Sjanic and Dr Gustaf Hendeby. Thanks for all the comments and for letting me borrow your time! All remaining eRrors are mine! Thanks to Dr Gustaf Hendeby and Dr Henrik Tidefelt for help with figures in Shapes, for LaTeX questions and for swiftly improving the thesis template.

Working at the Division of Automatic Control has been a great experience. Thanks to all the nice colleagues and the curious students! Prof Svante Gunnarsson responsively manages the division, creating a nice atmosphere. Thanks! Our secretary Ninna Stensgård and her two predecessors, Åsa Karmelind and Ulla Salaneck, make sure that administrative and practical matters work flawlessly. Thank you! Thanks to the Industry Excellence Center LINK-SIC, funded by Vinnova, for financial support. A special thanks to Dr Martin Enqvist for managing everything related to LINK-SIC smoothly.

Dr Zoran Sjanic, it's been great working together and I hope we can continue on this quest! Besides being a clever guy, you are also a good friend with a funky sense of humor that I enjoy. Thanks! Dr Magnus Axholt, your super-positive attitude was vital when we worked with HMD calibration. I also really enjoyed learning more about augmented reality and optical systems. I am really impressed by your entrepreneurial skills! Thanks! Dr Jonas Callmer and I have shared many hours of studies and a bizarre amount of coffee whilst quoting South Park.
Unfortunately we don't meet as often since you joined the dark side outside university. That will hopefully change when you sell SenionLab for a trillion balubas! Dr Karl Granström, master of beer-induced discussions, and I shared an office for the first years of our PhD studies. Thanks for all the good times we have had at RT and outside work! Dr David Törnqvist gave me a smooth start in the group when supervising my Master's Thesis. You have been my next-door neighbour in the corridor, fellow musician, dive mate and a good friend. Thanks David! Thanks to Dr Karl-Johan Thore for being an excellent study companion and friend. I'm happy that you finally moved to the right side of the city! Johan Reunanen, you inspired me to move to Linköping to study, and I also joined your vaniljekrem-gang in Oslo. We got to know each other when studying a preparatory year of Natural Sciences in our hometown Sundsvall, and I hope we get to meet more often. Johan Bjugert experimentally proved that conserving eggs in the freezer is a bad idea! It was fun living together in Ryd on BG's second floor. I hope we get to meet more often now that you are back in Sweden. Do you still prepare an extra serving of Pasta Carbonara for your trumpet? Thanks to my old friend Jens struuut! Södervall for all these years! Thanks to Per Buffy Keller for all the fun in Dublin and elsewhere! Clas Veibäck, it's nice to have an office mate again, especially one as eager as I am to put on a Maalouf album on our blazing sound system! Thanks to Henrik Tidefelt and Nina Fjällström for all the fun we have together! Congratulations on your marriage, and thanks for letting us be a part of your day! These years have been accompanied by a lot of great music.
Thanks to Miles Davis, Håkan Hardenberger, Ibrahim Maalouf, Wynton Marsalis, Esbjörn Svensson Trio, Goran Kajfeš, Wolfgang Amadeus and Leopold Mozart, Igor Stravinsky, Brad Mehldau, Al Di Meola, Avishai Cohen, Pat Metheny, Keith Jarrett and the great Ludwig van.

Over the years I have made many friends: Dr André Carvalho Bittencourt, Lic Sina Khoshfetrat Pakazad, Dr Patrik Axelsson, Lic Ylva Ljung, Dr Christian Lundquist, Dr Per Skoglar, Dr Daniel Peterson, Dr Christian Lyzell, Dr Johan Sjöberg, Dr Jeroen Hol, Dr Gustaf Hendeby, Lic Manon Kok, Dr Emre Özkan, Dr Saikat Saha, Dr Daniel Axehill, Jonas Linder, Dr Fredrik Lindsten, Lic Michael Roth and possibly more people. Thank you for making my life much more enjoyable!! Outside work I have had particularly many great laughs together with Dr Hanna Fager, Dr Oskar Leufvén, Dr Jonas Callmer and Dr Karl Granström. Thanks! Thanks to the Falkenberg people for all the fun we have together!

Thanks to my late mother Helena for all the love, care and the somewhat useful stubbornness I inherited. My brother John and father Bernt, I hope you feel as excited and proud as I do! Thanks for everything! Finally, thanks and a thousand kisses to Maria, Stella and the unborn lill-räpen for all your love and support!

Linköping, September 2014
Martin Skoglund

## Contents

- Notation
- **Part I: Background**
- 1 Introduction
  - 1.1 Motivation
  - 1.2 Contributions
    - 1.2.1 Included Contributions
    - 1.2.2 Additional Publications
    - 1.2.3 Optical See-Through Head Mounted Display Calibration
  - 1.3 Thesis Outline
- 2 Models
  - 2.1 Kinematics
    - 2.1.1 Translational Kinematics
    - 2.1.2 Rotational Kinematics
    - 2.1.3 Rigid Body Kinematics
  - 2.2 Rigid Body Dynamics
  - 2.3 Inertial Sensors
    - 2.3.1 Gyroscopes
    - 2.3.2 Accelerometers
  - 2.4 Magnetometers
  - 2.5 Vision Sensors
    - 2.5.1 The Pinhole Camera
  - 2.6 Camera Measurement Models
    - 2.6.1 Direct Parametrisation
    - 2.6.2 Inverse Depth
    - 2.6.3 Epipolar Geometry
  - 2.7 Computer Vision
- 3 Estimation
  - 3.1 Sensor Fusion
    - 3.1.1 Smoothing and Filtering
  - 3.2 Optimisation
    - 3.2.1 Iterated Extended Kalman Filter
    - 3.2.2 Nonlinear Least Squares
  - 3.3 Problem Structure
- 4 SLAM
  - 4.1 Introduction
    - 4.1.1 Probabilistic Models
  - 4.2 EKF-SLAM
  - 4.3 Batch SLAM
    - 4.3.1 Graphs
    - 4.3.2 Initialisation
    - 4.3.3 NLS-SLAM
    - 4.3.4 EM-SLAM
  - 4.4 IMU and Camera
    - 4.4.1 Linear Triangulation
- 5 Concluding remarks
  - 5.1 Conclusions
  - 5.2 Future Work
- Bibliography
- **Part II: Publications**
- A Silent Localization of Underwater Sensors using Magnetometers
  - 1 Introduction
  - 2 Methodology
    - 2.1 System Description
    - 2.2 State Estimation
    - 2.3 Cramér-Rao Lower Bound
  - 3 Simulation Results
    - 3.1 Magnetometers Only
    - 3.2 Magnetometers and GNSS
    - 3.3 Trajectory Evaluation using CRLB
    - 3.4 Sensitivity Analysis, Magnetic Dipole
    - 3.5 Sensitivity Analysis, Sensor Orientation
  - 4 Conclusions
  - Bibliography
- B A Nonlinear Least-Squares Approach to the SLAM Problem
  - 1 Introduction
  - 2 Problem Formulation
  - 3 Models
    - 3.1 Dynamics
    - 3.2 Landmark State Parametrisation
    - 3.3 Camera Measurements
  - 4 Solution
    - 4.1 Initialisation
    - 4.2 Nonlinear Least-Squares Smoothing
  - 5 Experiments
    - 5.1 Experimental Setup
    - 5.2 Results
  - 6 Conclusions and Future Work
  - Bibliography
- C Modeling and Sensor Fusion of a Remotely Operated Underwater Vehicle
  - 1 Introduction
  - 2 Related Work
  - 3 System Overview
    - 3.1 Kinematics
  - 4 ROV Modeling
    - 4.1 Hydrodynamic Models
    - 4.2 Discretization
    - 4.3 Kinematic model
  - 5 Sensor Models
    - 5.1 Inertial Measurement Unit and Magnetometer
    - 5.2 Doppler Velocity Log
  - 6 Estimation
    - 6.1 Filtering
    - 6.2 Synchronization
  - 7 Results
    - 7.1 Velocity Model
    - 7.2 Angular Velocity Model
    - 7.3 Position Estimates
  - 8 Conclusion
  - Bibliography
- D Initialisation and Estimation Methods for Batch Optimisation of Inertial/Visual SLAM
  - 1 Introduction
  - 2 Models
    - 2.1 Position and Orientation
    - 2.2 IMU Measurements
    - 2.3 Camera Measurements
  - 3 SLAM Initialisation
    - 3.1 Feature Tracks
    - 3.2 Track Clustering
    - 3.3 Rotation Initialisation
    - 3.4 Linear SLAM
    - 3.5 Iterative Outlier Removal
  - 4 Nonlinear Least-Squares SLAM
  - 5 Heuristic Motivation of the Linear Initialisation
  - 6 Monte Carlo Simulations
    - 6.1 Efficiency of the Linear Initialisation
    - 6.2 Sensitivity to Initial Rotation Errors
    - 6.3 Iterative Outlier Removal
  - 7 Real Data Experiments
    - 7.1 Clustering Results
  - 8 Conclusions and Future Work
  - Bibliography
- E EM-SLAM with Inertial/Visual Applications
  - 1 Introduction
  - 2 Expectation Maximisation
  - 3 EM-SLAM
    - 3.1 E-step
    - 3.2 M-step
  - 4 Models
    - 4.1 IMU Parametrisation
    - 4.2 Camera Measurements
  - 5 Nonlinear Least-Squares
  - 6 Computation Complexity
  - 7 Obtaining an Initial Estimate
  - 8 Results
    - 8.1 Simulations
    - 8.2 Real Data Experiments
  - 9 Conclusions and Future Work
  - Bibliography

## Notation

### Abbreviations

| Abbreviation | Meaning |
| --- | --- |
| BA | bundle adjustment |
| CRLB | Cramér-Rao lower bound |
| CCD | charge coupled device |
| CMOS | complementary metal oxide semiconductor |
| DOF | degrees of freedom |
| DVL | Doppler velocity log |
| EKF | extended Kalman filter |
| EM | expectation maximisation |
| E-RTS | extended Rauch-Tung-Striebel smoother |
| GNSS | global navigation satellite system |
| GPS | global positioning system |
| IMU | inertial measurement unit |
| IDP | inverse depth parametrisation |
| KF | Kalman filter |
| LiDAR | light detection and ranging |
| MAP | maximum a posteriori |
| MEMS | micro electrical mechanical systems |
| ML | maximum likelihood |
| NLS | nonlinear least squares |
| RADAR | radio detection and ranging |
| RANSAC | random sampling consensus |
| ROV | remotely operated vehicle |
| SLAM | simultaneous localisation and mapping |
| SFM | structure from motion |
| SIFT | scale-invariant feature transform |
| SONAR | sound navigation and ranging |
| UAV | unmanned aerial vehicle |
| UUV | unmanned underwater vehicle |

### Symbols and Operators

| Notation | Meaning |
| --- | --- |
| x_t | State vector |
| x_{0:t} | State trajectory up to time t |
| y_t | Measurement vector |
| y_{1:N} | Data batch of length N, also denoted y_{1:N} = Y = {y_1, ..., y_N} |
| u_t | Input vector |
| θ | Parameter vector |
| θ̂ | Estimate of θ |
| e_t | Measurement noise |
| w_t | Process noise |
| Q | Process noise variance |
| R | Measurement noise variance |
| f(x_t) | Process model |
| h(x_t) | Measurement model |
| N(µ, Σ) | Gaussian distribution with mean µ and variance Σ |
| p(Y\|θ), p_θ(Y) | Data likelihood |
| p(θ\|Y) | Posterior density |
| R | Rotation matrix |
| ω | Angular velocity |
| p | Position |
| v | Velocity |
| a | Acceleration |
| R^T | Transpose of matrix or vector |
| q | Unit quaternion q = [q_0, q_1, q_2, q_3]^T |
| det | Determinant |
| × | Cross product |
| ∇ | Gradient operator |
| ℝ | Set of real numbers |
| P([X, Y, Z]^T) | Projection operator P([X, Y, Z]^T) = [X/Z, Y/Z]^T |
| ‖x‖_{P⁻¹} = √(xᵀP⁻¹x) | P-weighted norm of vector x |
| SO(3) | Special Orthogonal Group |
| E | Essential matrix |
| arg max_θ | Maximising argument with respect to θ |
| arg min_θ | Minimising argument with respect to θ |
| ∼ | Distributed according to; or, up to an unknown scale λ, a ∼ b ⟺ a = λb |

# Part I: Background

## 1 Introduction

This thesis is about navigation, mapping and modeling for mobile robotics. The specific applications studied are sensor network localisation, underwater vehicle modeling, and batch simultaneous localisation and mapping (SLAM) initialisation and estimation methods for inertial sensors and monocular cameras. This chapter introduces and motivates the research topics in this thesis. It also summarises the research contributions by the author in the appended publications and other publications.

### 1.1 Motivation

Autonomous systems are becoming increasingly important in today's society. The main driver for industrial automatisation is increased efficiency, leading to a larger competitive advantage. In our homes we enjoy the benefits of many automatised processes such as dishwashers, heat exchangers and robotic lawnmowers. We use these systems because they allow us to do things that are more rewarding, such as building autonomous systems. The design, construction and evaluation of these systems is an iterative process which may involve machines, computers, physical models, mathematical models and a fair bit of experience.
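Two entries from the Notation section, the projection operator P(·) and the P-weighted norm, recur throughout the thesis. As a small illustrative sketch only (Python/NumPy, not code from the thesis):

```python
import numpy as np

def project(p):
    """Projection operator P([X, Y, Z]^T) = [X/Z, Y/Z]^T
    (normalised pinhole-camera image coordinates)."""
    X, Y, Z = p
    return np.array([X / Z, Y / Z])

def weighted_norm(x, P):
    """P-weighted norm ||x||_{P^{-1}} = sqrt(x^T P^{-1} x),
    e.g. a residual weighted by an inverse covariance."""
    return float(np.sqrt(x @ np.linalg.solve(P, x)))

print(project(np.array([2.0, 1.0, 4.0])))   # [0.5  0.25]
print(weighted_norm(np.array([2.0, 1.0]),
                    np.diag([4.0, 1.0])))   # sqrt(2) ~ 1.414
```

The weighted norm is what makes a least-squares cost statistically meaningful: residuals with small measurement variance are trusted more.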
Today, autonomous systems are not restricted to industries and our homes but are becoming integrated into our everyday lives. For instance, yesterday's fantasies about self-driving cars are now becoming reality (see http://googleblog.blogspot.se/2014/04/the-latest-chapter-for-self-driving-car.html). Autonomous robotics, see Figure 1.1, is a special branch of mobile robotics in which the focus is to automatise as many subsystems of the robot as possible.

*Figure 1.1: Parts of autonomous robotic systems (a word cloud including physics, mathematics, estimation, SLAM, disturbances, sensors, models, statistics, programming, planning, dynamics, algorithms, optimisation, obstacle avoidance, communication, exploration, tracking, errors, safety, control, decisions, artificial intelligence, design, stabilisation, robustness, feedback and actuators).*

*Figure 1.2: Left: Combined camera and inertial sensors in a single unit. Prototype made by XSens. Right: The Saab V-200 Skeldar helicopter. By courtesy of Saab AB.*

A research-intensive part of this subsystem is onboard sensor data interpretation, which is a cornerstone for making truly autonomous robots without the need for external support systems. The processed sensor data is used for various purposes such as navigation, mapping, control and exploration. SLAM emerged in the field of mobile robotics as a tool to enhance navigation and to infer properties of the surrounding environment by means of the onboard sensors. As the name suggests, SLAM is the joint problem of localisation and mapping which, on their own, are active research fields. In Figure 1.2 a multisensor unit and an unmanned autonomous vehicle are shown.

Localisation, or navigation, problems assume that the surrounding environment is, to some extent, known a priori. Such information can for instance be topographic maps, radio beacons, star constellations, magnetic fields, or global navigation satellite system (GNSS) satellite ephemeris, which are integrated directly on moving platforms. If localisation of moving objects, i.e., tracking, is done using external systems such as ground-based radar, closed-circuit television, cellular networks and other types of sensor networks, the location of the sensor nodes needs to be accurately known. In Figure 1.3 a navigation example using a remotely operated vehicle (ROV) is shown.

*Figure 1.3: Left: The Saab Seaeye Falcon ROV. By courtesy of Saab AB. Right: Position trajectory based on fusion of the ROV hydrodynamic model and magnetometer (magenta); the cyan curve marks the cable which the ROV roughly followed.*

Mapping problems consider estimation of the surrounding environment, with examples such as topographic maps, bathymetric maps, magnetic fields and object shape. It is assumed that all necessary information about the moving platforms is available, such that the uncertainty with respect to the estimated maps is negligible. This also covers cases when moving platforms are measured by sensor networks and the node locations need to be calibrated. In Figure 1.4 a small experimental setup and the corresponding SLAM estimate are shown.

*Figure 1.4: Left: Image of an environment used for SLAM experiments using camera and inertial sensors. Right: SLAM results using EKF and (smoothed) nonlinear least-squares, together with ground truth and landmarks.*

SLAM research initially spent a lot of effort on indoor platforms typically equipped with wheel encoders for odometry and line-sweeping laser range scanners, and these are still common. Today, this field of research is huge and heterogeneous, with all kinds of platform and sensor combinations. A small table with some sensors and platforms that have been used in SLAM applications is shown in Table 1.1.

*Table 1.1: Some SLAM publications using various platforms and sensors.*

| Sensor \ Platform | Flying | Ground | Pedestrian | Vessel | Submarine |
| --- | --- | --- | --- | --- | --- |
| Camera | 1 | 2 | 3 | 4 | 5 |
| Stereo Camera | 6 | 7 | 8 | – | 9 |
| RADAR | 10 | 11 | – | 12 | – |
| SONAR | – | 13 | – | 14 | 15 |
| IMU | 16 | 17 | 18 | 19 | 20 |
| LiDAR | 21 | 22 | – | 23 | – |

1 (Karlsson et al., 2008; Caballero et al., 2009; Bryson and Sukkarieh, 2009; Lupton and Sukkarieh, 2009, 2008; Bryson and Sukkarieh, 2008), 2 (Wang and Dissanayake, 2012; Thrun and Montemerlo, 2006; Callmer et al., 2008), 3 (Davison et al., 2007; Davison, 2003; Eade, 2008; Klein and Murray, 2007), 4 (Callmer et al., 2011), 5 (Kim, 2013), 6 (Jung and Lacroix, 2003), 7 (Konolige and Agrawal, 2008; Cummins and Newman, 2010), 8 (Lupton and Sukkarieh, 2012; Karlsson and Bjärkefur, 2010; Strasdat et al., 2011; Jung and Taylor, 2001), 9 (Eustice et al., 2006; Mahon et al., 2008), 10 (Sjanic and Gustafsson, 2010), 11 (Marck et al.; Gerossier et al., 2009), 12 (Callmer et al., 2011; Mullane et al., 2010), 13 (Choi et al., 2005), 14 (Wrobel, 2014), 15 (Newman et al., 2003; Ribas et al., 2006), 16 (Bryson et al., 2010), 17 (Wijesoma et al., 2006), 18 (Lupton and Sukkarieh, 2012), 19 (Han and Kim, 2013), 20 (Kim and Eustice, 2009), 21 (Fossel et al., 2013), 22 (Bosse and Zlot, 2008), 23 (Han and Kim, 2013).

### 1.2 Contributions

Published and submitted publications, which constitute the second part of this thesis, are listed below in chronological order together with a description and the contribution of each paper.

#### 1.2.1 Included Contributions

**Silent Localization of Underwater Sensors Using Magnetometers**

J. Callmer, M. Skoglund, and F. Gustafsson. Silent localization of underwater sensors using magnetometers. EURASIP Journal on Advances in Signal Processing, vol. 2010, 2010. doi: 10.1155/2010/709318. URL http://dx.doi.org/10.1155/2010/709318. Article ID 709318.

Paper A presents a method for localisation of underwater sensors equipped with triaxial magnetometers, using a friendly vessel with known magnetic characteristics.
Underwater sensor networks can be used to detect and track surface vessels and, more importantly, underwater vessels which may pose threats in security-critical maritime environments. For these networks to function properly, the location of each sensor needs to be accurately known, which can be difficult to obtain in fast deployment scenarios. We here demonstrate how the sensor positions and the vessel trajectory can be estimated simultaneously in an extended Kalman filter (EKF) using only magnetometer measurements from the sensors. Cramér-Rao lower bound (CRLB) analysis shows the attainable node localisation accuracy for a given trajectory and the attainable tracking performance for a given network. The CRLB analysis could thus serve as a guide for sensor network design. Sensitivity analysis indicates that when using GNSS measurements of the vessel trajectory, errors in sensor orientation and magnetic dipole strength are suppressed and the localisation accuracy is enhanced. The results are evaluated using simulated data. This is also the only publication in this thesis without real experimental data. This inexpensive solution to the difficult sensor localisation problem may also be used for re-localisation of sensors if conditions have changed. This is joint work primarily between the first and the second author, who produced the ideas, theory, implementation and most of the writing.

A Nonlinear Least Squares Approach to the SLAM Problem

Z. Sjanic, M. A. Skoglund, F. Gustafsson, and T. B. Schön. A nonlinear least squares approach to the SLAM problem. In Proceedings of the IFAC World Congress, volume 18, Milan, Italy, 28-2 Aug./Sept. 2011.

Paper B presents an algorithm for visual/inertial SLAM based on the maximum a posteriori (MAP) estimate of the whole 6 degrees-of-freedom (DOF) trajectory (including velocities) and 3D map, solved using nonlinear least-squares (NLS) optimisation.
An initial estimate is acquired using EKF-SLAM with inertial data as input to the motion model. Scale-invariant feature transform (SIFT) image features are used for tracking and data association, and the inverse depth parametrisation (IDP) is used for landmark parametrisation since it is known to handle depth uncertainty, due to EKF linearisation errors, better than the direct 3D parametrisation. The inherent sparsity structure of the full SLAM problem is efficiently utilised in the NLS solver, and the result is a metrically correct position, orientation, velocity and landmark map estimate. The proposed algorithm is evaluated on experimental data using a sensor platform mounted on an industrial robot, enabling an accurate ground truth reference. The results from this algorithm can, for instance, be used on unmanned aerial vehicles to compute detailed terrain maps necessary for safe landing, or in underwater localisation and mapping and other GNSS-denied environments. This is joint work primarily between the first and the second author, who produced the ideas, theory, experiments, implementation and most of the writing. Parts of this work have also been used in Nilsson (2010); Nilsson et al. (2011), where the sensor is a forward-looking infrared (IR) camera attached to a car, and in Karlsson and Bjärkefur (2010), where the authors use stereo camera and laser camera in an indoor environment.

Modeling and Sensor Fusion of a Remotely Operated Underwater Vehicle

M. A. Skoglund, K. Jönsson, and F. Gustafsson. Modeling and sensor fusion of a remotely operated underwater vehicle. In Proceedings of the 15th International Conference on Information Fusion (FUSION), Singapore, 9-12 July 2012, pages 947–954. IEEE, 2012.

Paper C presents how a complex hydrodynamic model for a remotely operated vehicle (ROV) can be used to robustify navigation based on onboard sensors. ROVs are small and relatively cheap unmanned underwater vehicles (UUVs).
They are used in situations such as tunnel or hull inspections, deep-sea missions beyond human diver depths, and mine hunting and mine disposal. Due to their limited payload capacity and lack of external underwater localisation support, the onboard navigation systems have to be robust. The paper shows how previously estimated model structures are put together in a large model describing the full 6-DOF motion of the ROV, including angular and linear velocities. The model potentially provides an independent source of vehicle speed and angular rate. We also provide models for all the onboard sensors, expressed in the body frame of the ROV, which are fused with the hydrodynamic model in an EKF. The results are based on data from the field tests performed in the Master's thesis work of Jönsson (2010), which the first author supervised. We show that, in particular, the vehicle speed can be accurately predicted as compared with the Doppler speedometer. This is joint work primarily between the first and the second author. The second author provided initial models and experiments. The first author did most of the implementation and most of the writing.

Initialisation and Estimation Methods for Batch Optimisation of Inertial/Visual SLAM

M. A. Skoglund, Z. Sjanic, and F. Gustafsson. Initialisation and estimation methods for batch optimisation of inertial/visual SLAM. Submitted to IEEE Transactions on Aerospace and Electronic Systems, June 2014.

Paper D presents a complete initialisation method for inertial/visual batch SLAM which can be used to obtain metrically correct map and navigation state estimates. Batch formulations of inertial/visual SLAM are nonlinear and nonconvex problems which need a good initial estimate to converge to the global optimum. In Paper B the full SLAM problem was initialised using EKF-SLAM, which is not a feasible solution for large data sets.
In this paper we present a multi-step algorithm that solves a series of almost uncoupled problems. The combination of rotation estimation and appearance-based data association using only vision, together with visual/inertial methods, leads to almost linear formulations which can be solved easily. The initialisation method is demonstrated on simulated data, and a small feasibility study on experimental data using an industrial robot, which provides access to ground truth, is also performed. This is joint work primarily between the first and the second author, who produced the ideas, theory, experiments, implementation and most of the writing.

EM-SLAM with Inertial/Visual Applications

Z. Sjanic, M. A. Skoglund, and F. Gustafsson. EM-SLAM with inertial/visual applications. Submitted to IEEE Transactions on Aerospace and Electronic Systems, June 2014.

Paper E presents an expectation-maximisation (EM) approach to a maximum likelihood (ML) batch formulation of inertial/visual SLAM. The EM algorithm introduces a set of so-called latent, or hidden, variables. By doing so, the problem can be split into two, hopefully simpler, problems. The first problem is the expectation, the so-called E-step, with respect to the conditional density of the hidden variables. The second problem, known as the M-step, is to maximise the result obtained in the E-step with respect to the unknown parameters. These two problems are solved iteratively until some convergence criterion is met. The EM-SLAM algorithm proposed here considers the platform motion as hidden variables and the landmark map as the unknown parameters. The E-step is solved using an extended Rauch-Tung-Striebel smoother (E-RTS) with a constant-size state vector, which is computationally cheap. The M-step is solved efficiently with a quasi-Newton method having fewer variables than the full NLS formulation, in which both the state sequence and the map are seen as parameters in an ML fashion.
EM-SLAM is compared with NLS-SLAM both in terms of performance and complexity. The proposed method is evaluated in real experiments, and also in simulations, on a platform with a monocular camera attached to an inertial measurement unit. The initial estimate for both the parameters and the hidden variables is obtained using the method in Paper D. It is demonstrated to produce a lower root mean square error (RMSE) than a standard Levenberg-Marquardt solver of the NLS problem, at a computational cost that increases considerably more slowly. This is joint work primarily between the first and the second author, who produced the ideas, theory, experiments, implementation and most of the writing.

1.2.2 Additional Publications

This section contains published work not included in this thesis.

1.2.3 Optical See-Through Head Mounted Display Calibration

Augmented reality (AR) is used to denote visual information overlaid, or augmented, on human vision. Optical see-through head mounted displays (OSTHMD) are wearable AR systems, which means that the displayed graphics move with the user, see Figure 1.5. To augment the real world with graphics in such a system, three main problems need to be solved. First, the pose (position and orientation) of the OSTHMD with respect to the inertial frame needs to be known. Second, the calibration problem, which consists of the static transformations of the relative position and rotation of the semi-transparent graphics screen with respect to the user's eye and an inertial frame, must be solved. Finally, models of the environment where the graphics should be superimposed have to be created. In the two articles, the first two of the three problems are addressed. The pose of the user's head, and consequently the OSTHMD, is provided using a visual tracking system along with a visual landmark coordinate. The calibration problem is solved using the data from the tracking system and measurements acquired by the subject through a bore-sighting exercise.
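The E- and M-steps described above can be illustrated on a toy linear-Gaussian problem, which is only a stand-in for the SLAM formulation (not the algorithm from the paper): here the hidden variables x_i play the role of the platform motion and the scalar theta the role of the map parameters.

```python
import numpy as np

# Toy EM sketch (illustrative only): y_i = theta + x_i + e_i with
# hidden x_i ~ N(0, q), noise e_i ~ N(0, r); theta is the unknown parameter.
rng = np.random.default_rng(0)
q, r, theta_true = 1.0, 1.0, 3.0
x = rng.normal(0.0, np.sqrt(q), 200)
y = theta_true + x + rng.normal(0.0, np.sqrt(r), 200)

theta = 0.0  # initial parameter estimate
for _ in range(50):
    # E-step: posterior mean of the hidden variables given the current theta
    x_hat = q / (q + r) * (y - theta)
    # M-step: maximise the expected complete-data log-likelihood over theta
    theta = np.mean(y - x_hat)

# The alternation converges to the ML estimate, here the sample mean of y
```

Each iteration contracts the error in theta by the factor q/(q+r), mirroring how the E-step (a smoother in EM-SLAM) and the M-step (a map optimisation) trade work between hidden states and parameters.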
The user aligns graphics (a crosshair) displayed on the screen with a measured point (a diode) in the inertial frame, see Figure 1.5, and several such alignments are collected from different screen points and subject locations in order to excite the parameter space. We adapt the theoretical framework for camera calibration, founded in the computer vision and photogrammetry domains, to OSTHMD calibration. The calibration problem itself is rather ill-posed since the measurements are few compared to the parameter space and the signal-to-noise ratio (SNR) is low. The work in these two publications reflects some labour-intensive engineering/research where several months were spent on the AR system, which consists of several hardware components interacting through several software layers. This is joint work primarily between the first and the second author, who produced the ideas, theory, experiments and implementation, while the first author did most of the writing.

Figure 1.5: Left: A Kaiser ProView 50ST optical see-through head mounted display equipped with IR diodes for the PhaseSpace IMPULSE motion tracking system placed on a mannequin. Right: Example of a bore-sighting exercise with the reindeer-looking OSTHMD system. Note the diode in the far back of the room which the subject (myself) aims at.

Optical See-Through Head Mounted Display Direct Linear Transformation Calibration Robustness in the Presence of User Alignment Noise

M. Axholt, M. Skoglund, S. D. Peterson, M. D. Cooper, T. B. Schön, F. Gustafsson, A. Ynnerman, and S. R. Ellis. Optical see-through head mounted display direct linear transformation calibration robustness in the presence of user alignment noise. In Proceedings of the 54th Annual Meeting of the Human Factors and Ergonomics Society, volume 54, pages 2427–2431, San Francisco, CA, USA, 27-1 Sept./Oct. 2010. Human Factors and Ergonomics Society.

The abstract from the paper is included below.
The correct spatial registration between virtual and real objects in optical see-through augmented reality implies accurate estimates of the user's eyepoint relative to the location and orientation of the display surface. A common approach is to estimate the display parameters through a calibration procedure involving a subjective alignment exercise. Human postural sway and targeting precision contribute to imprecise alignments, which in turn adversely affect the display parameter estimation, resulting in registration errors between virtual and real objects. The technique commonly used has its origin in computer vision, and calibrates stationary cameras using hundreds of correspondence points collected instantaneously in one video frame where precision is limited only by pixel quantization and image blur. Subsequently, the input noise level is several orders of magnitude greater when a human operator manually collects correspondence points one by one. This paper investigates the effect of human alignment noise on view parameter estimation in an optical see-through head mounted display to determine how well a standard camera calibration method performs at greater noise levels than documented in the computer vision literature. Through Monte-Carlo simulations we show that it is particularly difficult to estimate the user's eyepoint in depth, but that a greater distribution of correspondence points in depth helps mitigate the effects of human alignment noise.

Parameter Estimation Variance of the Single Point Active Alignment Method in Optical See-Through Head Mounted Display Calibration

M. Axholt, M. A. Skoglund, S. D. O'Connell, M. D. Cooper, S. R. Ellis, and A. Ynnerman. Parameter estimation variance of the single point active alignment method in optical see-through head mounted display calibration. In Proceedings of the IEEE Virtual Reality Conference, pages 27–34, Singapore, Republic of Singapore, Mar. 2011.

The abstract from the paper is included below.
The parameter estimation variance of the single point active alignment method (SPAAM) is studied through an experiment where 11 subjects are instructed to create alignments using an optical see-through head mounted display (OSTHMD) such that three separate correspondence point distributions are acquired. Modeling the OSTHMD and the subject's dominant eye as a pinhole camera, findings show that a correspondence point distribution well distributed along the user's line of sight yields less variant parameter estimates. The estimated eye point location is studied in particular detail. The findings of the experiment are complemented with simulated data which show that image plane orientation is sensitive to the number of correspondence points. The simulated data also illustrate some interesting properties of the numerical stability of the calibration problem as a function of alignment noise, number of correspondence points, and correspondence point distribution.

Insights from Implementing a System for Peer-Review

Finally, the following paper about undergraduate teaching and peer-review has been published.

C. Lundquist, M. A. Skoglund, K. Granström, and T. Glad. Insights from implementing a system for peer-review. IEEE Transactions on Education, 56(3):261–267, 2013.

1.3 Thesis Outline

The thesis is divided into two parts, with background material in the first part and edited versions of published papers in the second part. The first part consists of material introducing and explaining the background to the publications. The publications on their own include detailed background material. The first part of the thesis is organised as follows. Chapter 2 presents model structures for motion, sensors and computer vision. Chapter 3 introduces the sensor fusion concept, which is used for both filtering and smoothing. The connection to optimisation is also explained.
Chapter 4 gives a brief overview of SLAM estimation methods and explains the concepts used in the publications. The first part of the thesis ends with Chapter 5, which presents conclusions and discusses future work. The second part of the thesis consists of edited versions of the papers. The papers are only edited to comply with the layout of the thesis template.

2 Models

In this chapter the most important model structures used in the publications in Part II are outlined. The models describe rigid body kinematics and dynamics, the pinhole camera, inertial measurements, magnetometers and three different landmark parametrisations for camera measurement models. This is the basis for inference in state-space, and other parametric, models which are described in the next chapter. In a navigation context the state vector may include:

• Position p, velocity v, acceleration a and possibly higher order derivatives.
• Unit quaternion q parametrising orientation, angular velocity ω and higher order derivatives.

For localisation and tracking problems the state vector may include:

• Stationary point targets as landmark coordinates m.
• Stationary sensors p, which may also include sensor orientation q and other quantities.
• Non-stationary targets, possibly constrained to different motion models.
• Extended targets, stationary or non-stationary.
• Binary correspondence variables c relating measurements to target identities and possibly other metadata.

In case navigation and localisation are estimated jointly, it is natural to include the appropriate quantities mentioned above in a large state vector.

2.1 Kinematics

Kinematics describes the motion of bodies without considering the forces causing the motion. Kinematic transformations between coordinate frames consist of length-preserving translations and rotations, as illustrated in Figure 2.1.
[Figure 2.1 axis labels omitted.] Figure 2.1: Rigid body transformations. (a) A rotation is a displacement of the axes while the origin is kept fixed. (b) A translation is a displacement of the origin while the axes are kept aligned.

2.1.1 Translational Kinematics

A translation is a displacement of the body origin p while the axes are kept aligned. The translation is a vector, p^w, from the origin of the frame w to the coordinate p expressed in the w-frame, and it is also called the position. The translational motion equations can be derived from the time derivatives of the position,

  \ddot{p} = \dot{v} = a,   (2.1)

where v is the velocity and a is the acceleration. It is of course possible, and sometimes necessary, to introduce higher order terms such as \dot{a}, which is known as jerk, but most often it can be considered to be noise, i.e., \dot{a} = w_a. In (2.1) there is no indication of the dimension in which the motion occurs. This is however no restriction since the motion is independent in each dimension. The corresponding matrix form is

  \begin{bmatrix} \dot{p} \\ \dot{v} \\ \dot{a} \end{bmatrix} =
  \begin{bmatrix} 0 & I & 0 \\ 0 & 0 & I \\ 0 & 0 & 0 \end{bmatrix}
  \begin{bmatrix} p \\ v \\ a \end{bmatrix} +
  \begin{bmatrix} 0 \\ 0 \\ I \end{bmatrix} w_a   (2.2a)

  \iff \dot{x} = Ax + Bw_a,   (2.2b)

where the matrices 0 and I are of appropriate dimensions. Assuming that the input w_a is constant over the sampling interval T, the ordinary differential equation (2.2) is solved as

  x_{t+1} = e^{AT} x_t + \int_0^T e^{A\tau}\,d\tau\, B w_a,   (2.3)

which results in

  \begin{bmatrix} p_{t+1} \\ v_{t+1} \\ a_{t+1} \end{bmatrix} =
  \begin{bmatrix} I & TI & \frac{T^2}{2}I \\ 0 & I & TI \\ 0 & 0 & I \end{bmatrix}
  \begin{bmatrix} p_t \\ v_t \\ a_t \end{bmatrix} +
  \begin{bmatrix} \frac{T^3}{6}I \\ \frac{T^2}{2}I \\ TI \end{bmatrix} w_a.   (2.4)

2.1.2 Rotational Kinematics

Rotational kinematics is more complicated than its translational counterpart, due to the fact that rotation representations are nonlinear. This means that some ordinary operations defined on vector spaces, such as addition, are not defined here. Proper rotations are linear maps that preserve length and the handedness of the basis. There are many parametrisation alternatives to rotation matrices, and perhaps the most common ones are Euler angles and the unit quaternion.
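The discretisation step from (2.2) to (2.4) can be checked numerically. The sketch below (a minimal illustration, not code from the thesis) exploits the fact that A is nilpotent, so the matrix exponential series terminates after the quadratic term:

```python
import numpy as np

T = 0.1  # sampling interval
I3, Z3 = np.eye(3), np.zeros((3, 3))

# Continuous-time constant-acceleration model (2.2): x = [p, v, a]
A = np.block([[Z3, I3, Z3],
              [Z3, Z3, I3],
              [Z3, Z3, Z3]])
B = np.vstack([Z3, Z3, I3])

# A is nilpotent (A^3 = 0), so e^{AT} = I + AT + (AT)^2/2 exactly
F = np.eye(9) + A * T + (A @ A) * T**2 / 2
# The input integral likewise truncates: int_0^T e^{A tau} dtau B
G = (np.eye(9) * T + A * T**2 / 2 + A @ A * T**3 / 6) @ B

# The closed form given in (2.4)
F_ref = np.block([[I3, T * I3, T**2 / 2 * I3],
                  [Z3, I3, T * I3],
                  [Z3, Z3, I3]])
G_ref = np.vstack([T**3 / 6 * I3, T**2 / 2 * I3, T * I3])
```

The same nilpotency argument is why sampled constant-acceleration models can be written down in closed form without a numerical matrix exponential.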
For a thorough overview of rotation parameterisations see e.g., Shuster (1993). In this thesis, rotation in 3D is the typical case considered, and a common denominator of all parameterisations is that they can be transformed into a corresponding rotation matrix R ∈ SO(3).

Definition 2.1 (Special Orthogonal Group SO(3)). A matrix R ∈ R^{3×3} belongs to the special orthogonal group SO(3) if and only if

  R^T R = I,   (2.5a)
  \det R = 1.   (2.5b)

Figure 2.1a describes a rotation from the w frame to the c frame. This can be parametrised by the rotation matrix R^{cw}. The inverse rotation direction using (2.5a) is (R^{cw})^{-1} = (R^{cw})^T, hence (R^{cw})^T = R^{wc}. Spatial rotations intuitively only have three degrees of freedom, yet they still need at least five parameters (Hopf, 1940) in order to represent a global description; see Stuelpnagel (1964) for a discussion. Rotation matrices are impractical for several reasons, e.g., when rotations are estimated, since it is difficult to design estimators directly on SO(3) and it is costly to use nine parameters. Luckily there are many alternatives for rotation parametrisation, and an appealing option is the unit quaternion. Quaternions were invented by Hamilton (1844) as a tool to extend the imaginary numbers. The unit-length quaternion is widely used in the aerospace industry, mechanics and computer graphics, among other fields, since it allows a computationally efficient and singularity-free rotation representation. The unit quaternion is a vector of real numbers q ∈ R^4. A definition is given in Kuipers (2002, page 104), who splits the vector into a scalar part, q_0, and a vector part, \bar{q} = e_1 q_1 + e_2 q_2 + e_3 q_3, where e_1, e_2, e_3 is the standard orthonormal basis in R^3. With Kuipers' definition the full vector representation is

  q = \begin{bmatrix} q_0 \\ q_1 \\ q_2 \\ q_3 \end{bmatrix}
    = \begin{bmatrix} q_0 \\ \bar{q} \end{bmatrix}.   (2.6)

Quaternions can be used to represent a rotation in R^3, but in order to do so they must be constrained to the unit sphere q ∈ S^3, i.e., q^T q = 1, hence the name unit quaternion.
Let

  q = \begin{bmatrix} \cos\delta \\ \sin\delta\, n \end{bmatrix}   (2.7)

be a unit quaternion describing a rotation of an angle 2δ around the unit vector n ∈ R^3. Then a rotation using a unit quaternion is

  \tilde{x}^b = q^{ba} \odot \tilde{x}^a \odot q^{ab},   (2.8)

where \tilde{x}^\star = [0, x^\star] are the vectors' quaternion equivalents and ⊙ denotes quaternion multiplication, see e.g., Törnqvist (2008); Hol (2011). The unit quaternion can also be used to construct a rotation matrix, see for instance Shuster (1993, page 462), where the superscript ab is omitted for the sake of readability,

  R(q) = \begin{bmatrix}
    q_0^2+q_1^2-q_2^2-q_3^2 & 2(q_1q_2+q_0q_3) & 2(q_1q_3-q_0q_2) \\
    2(q_1q_2-q_0q_3) & q_0^2-q_1^2+q_2^2-q_3^2 & 2(q_2q_3+q_0q_1) \\
    2(q_1q_3+q_0q_2) & 2(q_2q_3-q_0q_1) & q_0^2-q_1^2-q_2^2+q_3^2
  \end{bmatrix}   (2.9)

  = \begin{bmatrix}
    2q_0^2+2q_1^2-1 & 2(q_1q_2+q_0q_3) & 2(q_1q_3-q_0q_2) \\
    2(q_1q_2-q_0q_3) & 2q_0^2+2q_2^2-1 & 2(q_2q_3+q_0q_1) \\
    2(q_1q_3+q_0q_2) & 2(q_2q_3-q_0q_1) & 2q_0^2+2q_3^2-1
  \end{bmatrix},   (2.10)

where the last equality is established using q_0^2+q_1^2+q_2^2+q_3^2 = 1. It should be noted that the unit quaternion is not a global parametrisation, because q and −q describe the same rotation, i.e., R(q) = R(−q), which is easily verified from (2.9). This is seldom a practical problem since a solution with, e.g., positive q_0 component can be chosen. A local parametrisation which is popular within the computer vision community is the so-called exponential coordinates, see e.g., Ma et al. (2003). It uses a three-dimensional vector ω ∈ R^3 such that for any R ∈ SO(3) there exists R = e^{[ω]_×}, where

  [ω]_× = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix}   (2.11)

is the matrix form of the cross product. It should be noted that any vector of the form ω + 2πk ω/‖ω‖, with k an integer, yields the same rotation matrix, which also means that for a given rotation matrix there are infinitely many exponential coordinates.
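As a sanity check of (2.9), the short sketch below (illustrative code, not from the thesis) builds R(q) and verifies the SO(3) properties (2.5) as well as the sign ambiguity R(q) = R(−q):

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a unit quaternion q = [q0, q1, q2, q3], cf. (2.9)."""
    q0, q1, q2, q3 = q
    return np.array([
        [q0**2 + q1**2 - q2**2 - q3**2, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0**2 - q1**2 + q2**2 - q3**2, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0**2 - q1**2 - q2**2 + q3**2]])

# delta = 45 degrees about the z axis, i.e. a 90 degree rotation, cf. (2.7)
q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
R = quat_to_rot(q)
```

With this convention the resulting matrix is [[0, 1, 0], [-1, 0, 0], [0, 0, 1]], i.e. the frame-rotation (passive) form implied by (2.9).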
The rotation matrix can be efficiently computed from the exponential parameters using Rodrigues' formula

  R = e^{[ω]_×} = I + [ω]_× \frac{\sin(\|ω\|)}{\|ω\|} + [ω]_×^2 \frac{1-\cos(\|ω\|)}{\|ω\|^2},   (2.12)

which follows from the Taylor expansion of the exponential function, see Ma et al. (2003, Theorem 2.9) for a proof. A differential equation for rotations can be established by using some properties of rotation matrices. Using the definition of SO(3) and taking the time derivative of both sides of (2.5a) gives

  \frac{d}{dt}(R R^T) = \frac{d}{dt} I,   (2.13)
  \iff \dot{R} R^T + R \dot{R}^T = 0,   (2.14)

which implies that the matrix product is skew-symmetric,

  \dot{R} R^T = -(\dot{R} R^T)^T.   (2.15)

We can thus define

  \dot{R} R^T = [ω]_×,   (2.16)

and since R^T R = I we have that

  \dot{R} = [ω]_× R.   (2.17)

By solving the differential equation (2.17) with ω constant,

  R(t) = e^{[ω]_× t} R(0),   (2.18)

the exponential coordinates are obtained. Similarly, the differential form of the unit quaternion parametrisation is given by

  \dot{q}_t = \frac{1}{2}
  \underbrace{\begin{bmatrix} 0 & -\omega_x & -\omega_y & -\omega_z \\ \omega_x & 0 & \omega_z & -\omega_y \\ \omega_y & -\omega_z & 0 & \omega_x \\ \omega_z & \omega_y & -\omega_x & 0 \end{bmatrix}}_{S(\omega)} q
  = \frac{1}{2}
  \underbrace{\begin{bmatrix} -q_1 & -q_2 & -q_3 \\ q_0 & -q_3 & q_2 \\ q_3 & q_0 & -q_1 \\ -q_2 & q_1 & q_0 \end{bmatrix}}_{\tilde{S}(q)}
  \begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix},   (2.19)

which has the nice property of being bilinear. Note that due to the algebraic constraint, the differential form of the unit quaternion is a differential algebraic equation (DAE). For a complete derivation of these relations see e.g., Törnqvist (2008); Hol (2011). The unit quaternion dynamics

  \dot{q} = \frac{1}{2} S(\omega) q   (2.20)

is thus solved, assuming the angular velocity is constant, as

  q(t) = e^{\frac{1}{2} S(\omega) t} q(0).   (2.21)

Defining the skew-symmetric matrix S = \frac{1}{2} S(\omega) t, q_1 = q(t) and q_0 = q(0), the quaternion length remains unchanged,

  \|q_1\|_2^2 = \|e^S q_0\|_2^2 = q_0^T (e^S)^T e^S q_0 = q_0^T e^{S^T+S} q_0 = q_0^T q_0 = \|q_0\|_2^2,   (2.22)

since S = -S^T. The matrix exponential of the orientation dynamics can be further simplified:
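Rodrigues' formula (2.12) and the periodicity of the exponential coordinates can be illustrated as follows (a hedged sketch; the small-angle branch is an implementation detail not discussed in the text):

```python
import numpy as np

def skew(w):
    """[w]_x, the matrix form of the cross product, cf. (2.11)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rodrigues(w):
    """R = exp([w]_x) via Rodrigues' formula (2.12); w is axis times angle."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:            # small-angle limit: R ~ I + [w]_x
        return np.eye(3) + skew(w)
    K = skew(w)
    return (np.eye(3)
            + K * np.sin(theta) / theta
            + K @ K * (1 - np.cos(theta)) / theta**2)
```

Rotating about z by pi/2 gives the familiar planar rotation, and adding 2*pi along the same axis leaves the matrix unchanged, which is exactly the many-to-one property of the exponential coordinates noted above.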
  q_t = e^{\frac{1}{2} S(\omega) T_s} q_0
      = \left( \cos\left(\frac{\|\omega\| T_s}{2}\right) I + \frac{\sin\left(\frac{\|\omega\| T_s}{2}\right)}{\|\omega\|} S(\omega) \right) q_0
      \approx \left( I + \frac{T_s}{2} S(\omega) \right) q_0,   (2.23)

where the approximation is based on the small-angle approximation, which is good if the sampling rate is high, i.e., the sampling interval T_s is short. It is now simple to introduce noise and bias as

  q_t = \left( I + \frac{T_s}{2} S(\omega + b_\omega + w_\omega) \right) q_0
      = \left( I + \frac{T_s}{2} S(\omega + b_\omega) \right) q_0 + \frac{T_s}{2} \tilde{S}(q_0) w_\omega,   (2.24)

using the relation (2.19). It is important to note that the nonlinear constraint q^T q = 1 needs to be handled correctly in any estimator. In the case of filtering there are mainly two options. The first is to introduce a fictitious measurement y = 1 - q^T q, which should be zero, and the second option is to normalise the updated quaternion as q := q/\|q\|, which is a projection onto the unit sphere. Using unit quaternions for filtering may turn out problematic since the process noise covariance and the true state covariance are singular due to the over-parametrisation. Also, if a smoothed estimate is sought, problems are unavoidable using unconstrained formulations, see Shuster (2003). Thus, there is room for improving the smoothing approach using unit quaternions as in Paper B. A practical option when quaternions are simulated in continuous time is to introduce a feedback

  \dot{q} = \frac{1}{2} S(\omega) q + \frac{\gamma}{2} (1 - q^T q) q,   (2.25)

which drives the quaternion to unit length; here γ is a proportional positive gain. This approach was used in the hydrodynamic model in Paper C.

2.1.3 Rigid Body Kinematics

Combining translational and rotational kinematics can be done in several ways and is very much a design task that depends on the system at hand. A straightforward model is given by the combination of the constant acceleration model (2.2) and the unit quaternion kinematics (2.19) as

  \begin{bmatrix} \dot{p} \\ \dot{v} \\ \dot{a} \\ \dot{q} \end{bmatrix} =
  \begin{bmatrix} 0 & I & 0 & 0 \\ 0 & 0 & I & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} S(\omega) \end{bmatrix}
  \begin{bmatrix} p \\ v \\ a \\ q \end{bmatrix} +
  \begin{bmatrix} 0 \\ 0 \\ I \\ 0 \end{bmatrix} w_a,   (2.26)

and it is also quite common to extend the model with states for angular acceleration, and possibly bias states for gyroscopes and accelerometers, as is done in Paper D.
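The exact exponential in (2.23) and its small-angle approximation with renormalisation can be compared numerically; the sketch below is illustrative only:

```python
import numpy as np

def S(w):
    """The 4x4 skew matrix S(omega) from (2.19); note S(w)^2 = -||w||^2 I."""
    wx, wy, wz = w
    return np.array([[0.0, -wx, -wy, -wz],
                     [wx, 0.0, wz, -wy],
                     [wy, -wz, 0.0, wx],
                     [wz, wy, -wx, 0.0]])

def propagate_exact(q, w, T):
    """q_t = exp(S(w) T / 2) q_0, closed form as in (2.23)."""
    n = np.linalg.norm(w)
    if n < 1e-12:
        return q
    return (np.cos(n * T / 2) * np.eye(4) + np.sin(n * T / 2) / n * S(w)) @ q

def propagate_approx(q, w, T):
    """Small-angle step (I + T/2 S(w)) q, projected back onto the unit sphere."""
    q = (np.eye(4) + T / 2 * S(w)) @ q
    return q / np.linalg.norm(q)

q0 = np.array([1.0, 0.0, 0.0, 0.0])
w = np.array([0.0, 0.0, np.pi / 2])      # constant angular velocity about z
q_exact = propagate_exact(q0, w, 1.0)    # one exact step over 1 s
q_steps = q0.copy()
for _ in range(1000):                    # many short approximate steps
    q_steps = propagate_approx(q_steps, w, 0.001)
```

The many-small-steps result stays on the unit sphere by construction (the normalisation is the projection mentioned above) and agrees closely with the closed-form solution.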
Another model is obtained by considering acceleration and angular rates to be input to the system:

  \begin{bmatrix} \dot{p} \\ \dot{v} \\ \dot{q} \end{bmatrix} =
  \begin{bmatrix} 0 & I & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
  \begin{bmatrix} p \\ v \\ q \end{bmatrix} +
  \begin{bmatrix} 0 \\ a + b_a \\ \frac{1}{2} S(\omega + b_\omega) q \end{bmatrix} +
  \begin{bmatrix} 0 & 0 \\ I & 0 \\ 0 & \frac{1}{2} \tilde{S}(q) \end{bmatrix}
  \begin{bmatrix} w_a \\ w_\omega \end{bmatrix}.   (2.27)

This significantly reduces the state space at the cost of limited noise modeling and fault detection capabilities. It should be noted that the biases are not observable using only accelerometers and gyroscopes here, but must be inferred from other observations, e.g., camera measurements. Models with inertial sensors as input signals are used in Paper B and Paper E. In Paper A the kinematics of a vessel is described by a coordinated turn model, which essentially constrains the object to follow a circular path. Such models are rather flexible and may be used to simulate more advanced models. It may be convenient to express some parts of a motion model in the body-fixed frame, since forces and torques are often naturally represented in this frame. This is the case in Paper C, where modelling of an ROV is considered, since the expressions for linear and rotational velocities due to external forces become much simpler.

2.2 Rigid Body Dynamics

Rigid body dynamics in classical mechanics investigates the motion of objects as caused by forces and torques. The fundamental equations describing these relations are given by Euler's axioms of motion. The first axiom is

  \left( \frac{d}{dt}(m\dot{p}) \right)_i = \sum_{j=1}^{N} F_j,   (2.28)

which relates a body's acceleration \ddot{p} in an inertial frame i, with mass m, to the external forces F_j, where p is the centre of mass. This is the straightforward extension of Newton's second law of motion for particles to rigid bodies, expressed as conservation of linear momentum. The corresponding law for angular momentum, known as Euler's second law, states that the change of angular momentum of the body is equal to the sum of all external torques about the origin:
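A minimal dead-reckoning sketch in the spirit of (2.27), with accelerometer and gyroscope signals as inputs (the gravity value, rotation convention and step size here are assumptions for illustration, not the thesis implementation):

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # navigation-frame gravity (assumed value)

def S(w):
    """4x4 quaternion-rate matrix S(omega), cf. (2.19)."""
    wx, wy, wz = w
    return np.array([[0.0, -wx, -wy, -wz],
                     [wx, 0.0, wz, -wy],
                     [wy, -wz, 0.0, wx],
                     [wz, wy, -wx, 0.0]])

def quat_to_rot(q):
    """Navigation-to-body rotation matrix from a unit quaternion, cf. (2.9)."""
    q0, q1, q2, q3 = q
    return np.array([
        [q0**2 + q1**2 - q2**2 - q3**2, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0**2 - q1**2 + q2**2 - q3**2, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0**2 - q1**2 - q2**2 + q3**2]])

def imu_step(p, v, q, y_acc, y_gyr, T):
    """One Euler step with IMU signals as inputs: rotate the specific force
    to the navigation frame, remove gravity, then integrate (2.27)."""
    a = quat_to_rot(q).T @ y_acc + GRAVITY
    p = p + T * v
    v = v + T * a
    q = q + T / 2 * S(y_gyr) @ q
    return p, v, q / np.linalg.norm(q)

# A stationary, level IMU measures only the reaction to gravity, so the
# dead-reckoned position and velocity should stay at the origin
p, v, q = np.zeros(3), np.zeros(3), np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(100):
    p, v, q = imu_step(p, v, q, np.array([0.0, 0.0, 9.81]), np.zeros(3), 0.01)
```

With noisy or biased inputs the same loop exhibits the integration drift discussed in Section 2.3, which is why the biases must be inferred from other observations.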
  \left( \frac{d}{dt}(J\omega) \right)_i = \sum_{j=1}^{N} r_j \times F_j,   (2.29)

where J is the inertia matrix, ω is the angular velocity of the rigid body, and r_j are vectors that point from the centre of rotation to the points where the external forces F_j are applied. It is useful to express Euler's laws in body-referenced velocities since the mass and inertia are then constant. This can be done by considering the derivative of vectors in a rotating reference frame, b, using the relation

  \left( \frac{dr}{dt} \right)_i = \left( \frac{dr}{dt} \right)_b + \omega \times r,   (2.30)

in which the body-referenced derivative is the same if the inertial frame is not rotating, i.e., ω = 0. Applying (2.30) to (2.28) results in

  m\ddot{p}^b = \sum_{j=1}^{N} F_j - \omega \times m\dot{p}^b,   (2.31)

where it is assumed that the mass is constant. Similarly, the expression for angular velocity (2.29) becomes

  J\dot{\omega}^b = \sum_{j=1}^{N} r_j \times F_j - \omega^b \times J\omega^b.   (2.32)

For a complete derivation, see e.g., Fossen (2011).

2.3 Inertial Sensors

Inertial sensors measure specific force and angular velocity. They are of great importance for many navigation systems since they provide an independent source for computing the position and orientation of a moving platform relative to some initial pose, by integrating the accelerations twice and the angular velocity once, respectively. They may sometimes be the only means of navigation when other systems are unavailable. For instance, GNSS is unavailable in many places, such as underwater environments or indoors, and is also subject to jamming and spoofing. Magnetometers are passive sensors that can be used to calculate the orientation w.r.t. the local magnetic field. However, this field is rather weak and easily disturbed by ferrous objects. Inertial sensors are not subject to such external errors, but rather errors due to the internal workings of the sensors. These contribute to the integration drift, which means that the computed pose of the sensor will deteriorate when the measurements are integrated. This drift is roughly inversely proportional to the price of the sensors.
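Equation (2.32) for torque-free motion can be integrated numerically to check that the rotational kinetic energy and the magnitude of the body-frame angular momentum are conserved; the inertia matrix below is an arbitrary example value:

```python
import numpy as np

J = np.diag([1.0, 2.0, 3.0])   # example (assumed) inertia matrix
J_inv = np.linalg.inv(J)

def omega_step(w, dt, torque=np.zeros(3)):
    """Forward-Euler step of (2.32): J dw/dt = sum of torques - w x (J w)."""
    return w + dt * J_inv @ (torque - np.cross(w, J @ w))

# Torque-free motion: energy 0.5 w'Jw and ||Jw|| should be (nearly) conserved
w = np.array([1.0, 0.1, 0.1])
E0, L0 = 0.5 * w @ J @ w, np.linalg.norm(J @ w)
for _ in range(1000):
    w = omega_step(w, 1e-4)
```

The small residual drift is an artefact of the forward-Euler integrator, not of the dynamics; a higher-order scheme would reduce it further.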
In this work we will only consider micro-electro-mechanical systems (MEMS) type IMUs, due to their relatively low cost and small size. There is a rich body of literature on navigation in general and modeling of IMUs in particular, see for instance Titterton and Weston (1997); Britting (1971). In all applications here the IMUs are rigidly mounted to the moving platform, in a so-called strap-down configuration, and hence sensor readings are naturally referenced in this frame.

2.3.1 Gyroscopes

Commonly, MEMS gyroscopes are based on measuring the rotation-induced Coriolis force acting on a vibrating structure. A gyroscope mounted on a moving body outputs gyroscope signals

  y_{\omega,t} = \omega_t + e_{\omega,t},   (2.33)

where ω_t is the angular velocity of the body w.r.t. an inertial frame and e_{ω,t} is noise. Due to unmodeled effects in the sensors, a bias, b_{ω,t}, could be added to (2.33), giving

  y_{\omega,t} = \omega_t + b_{\omega,t} + e_{\omega,t},   (2.34)

and b_{ω,t} could also be included in the state vector with slowly varying dynamics. It is here assumed that the local, flat, navigation frame is inertial, since otherwise (2.33) has to include another two components for the earth rotation rate and the turn rate of the navigation frame w.r.t. the earth. This assumption is good only if navigation is not performed over large distances over the earth, and the earth rotation rate, ω_{ie} = 2π (24 h)^{-1} ≈ 7.27 · 10^{-5} rad/s, is small compared to the sensor noise. Furthermore, sensor alignment, orthogonality of sensor axes, sensor gain and other parameters are assumed known and accounted for by a factory calibration.

2.3.2 Accelerometers

Accelerometers using MEMS technology are based upon measuring deflections of a cantilever beam to which a so-called proof mass is attached. The mass deflects the beam when external acceleration along the sensitivity axis is applied to the sensor. Accelerometers do not only measure acceleration when used on earth, since they are also subject to the gravitational field.
Hence, accelerometers measure the specific force, which is the sum of the free acceleration and the gravity field. A measurement model is then
\[
y_a = a^b_t - g^b + e_{a,t} = R^{be}_t (a^e_t - g^e) + e_{a,t}, \qquad (2.35)
\]
where $a^e_t$ is the acceleration in the navigation frame, $g^e$ is the local gravity vector and $e_{a,t}$ is measurement noise. Since the local navigation frame is considered to be inertial, Coriolis and centripetal acceleration can be neglected. It is clear that the measured specific force depends on the attitude $R^{be}_t$ of the platform and, as a consequence, errors in attitude will introduce so-called gravity leakage into the measurements. In case the body frame does not coincide with the sensor frame, an additional acceleration term $\omega \times (\omega \times r^{IMU,b})$ has to be added to (2.35), as done in Paper C, where $r^{IMU,b}$ is the offset vector, and the result must also be rotated by the relative rotation between these two frames. In the papers considering IMU and monocular vision the body centre is in the origin of the IMU, thus the offset and rotation between the frames are needed in the camera measurement equation.

2.4 Magnetometers

A simple magnetometer measurement model is
\[
y_m = R^{be}_t m^e + e_{m,t}, \qquad (2.36)
\]
where $m^e$ is the local earth magnetic field vector and $e_{m,t}$ is the measurement noise. This model can be used for computing the magnetic north if $m^e$ is known. In Paper A magnetometers are used slightly differently. Here the objective is to track a magnetic object in order to determine the locations of the sensor nodes containing magnetometers in a sensor network. In this case the object is a vessel with known magnetic signature, and it is modelled as a single dipole $m = R^{eb} m^b$, giving rise to the magnetic flux density
\[
y_m = \frac{\mu_0}{4\pi |r|^5} \left( 3 r (m^T r) - |r|^2 m \right), \qquad (2.37)
\]
where $\mu_0$ is the permeability of the medium and $r$ is the vector from the sensor to the vessel. The magnetic flux due to the dipole decays cubically with the distance.
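The dipole model (2.37) is straightforward to evaluate numerically. The following sketch, with a made-up dipole moment and sensor-to-vessel vector, also illustrates the cubic decay: doubling the distance scales the field by 1/8.

```python
import numpy as np

MU0 = 4 * np.pi * 1e-7   # permeability of free space [T m/A]

def dipole_field(m, r):
    """Magnetic flux density of a single dipole, eq. (2.37):
    y = mu0 / (4 pi |r|^5) * (3 r (m^T r) - |r|^2 m)."""
    rn = np.linalg.norm(r)
    return MU0 / (4 * np.pi * rn**5) * (3 * r * (m @ r) - rn**2 * m)

# Made-up dipole moment and sensor-to-vessel vector for illustration.
m = np.array([0.0, 30.0, 50.0])   # dipole moment [A m^2]
r = np.array([10.0, 4.0, 0.0])    # sensor-to-vessel vector [m]
B1 = dipole_field(m, r)
B2 = dipole_field(m, 2 * r)       # at twice the distance: field drops by 1/8
```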
It is here assumed that the effect of the local magnetic field is measured and accounted for before the vessel enters the survey area. For a thorough description of tracking of metallic objects, Wahlström (2013) is recommended.

2.5 Vision Sensors

Digital vision sensors capture incident rays of light from the scene onto a sensor array through an optical system. Light reflected by objects in a scene can be mathematically described by bidirectional reflectance distribution functions (BRDF). The physical nature of light and image formation is a complex topic which is well beyond the scope of this thesis. There are mainly two manufacturing techniques for digital vision sensors. The first is the semiconductor charge coupled device (CCD), in which each analogue photo sensor (pixel) produces an electrical charge from the incoming light. The second is the complementary metal oxide semiconductor (CMOS) sensor, which directly converts light into voltage. CCD sensors are usually of global shutter type, meaning that the whole image is sampled at one time instant. In contrast, CMOS sensors usually sample row by row in a so-called rolling shutter fashion. Rolling shutters are far more complicated to use in geometrical vision problems when either the camera or the scene is moving, and we will therefore only consider global shutter cameras.

2.5.1 The Pinhole Camera

By far the most commonly used camera model is the pinhole camera. It is a mathematical model describing how points in $R^3$ relate to points in $R^2$ on the image plane through a central projection. The model has five intrinsic, or internal, parameters, and for most systems it is an approximation. Additionally, the position and orientation of the camera centre with respect to some other reference frame are also needed if the camera is part of a system, e.g., a stereo rig. Furthermore, most lenses introduce artifacts such as radial and tangential distortion which are important to compensate for. Figure 2.2 illustrates a frontal pinhole projection.
In the pinhole camera model the coordinate frames are:

• c – Camera frame with the optical centre as the origin, where its z-axis coincides with the optical axis, also known as the principal axis.

• w – World frame.

• i – Image frame. The image plane is orthogonal to the optical axis and has the principal point h as its origin. The distance between the c and the i frame is called the focal length f.

• n – Normalised image frame.

• p – Pixel frame with centre in the upper left corner as viewed from the optical centre c.

Figure 2.2: Pinhole projection with the image plane placed in front of the camera centre.

A point $m^c = [x^c, y^c, z^c]^T$ in the c-frame relates to a point $m^n = [x^n, y^n]^T$ in the n-frame as
\[
\begin{bmatrix} x^n \\ y^n \end{bmatrix} = \frac{f}{z^c} \begin{bmatrix} x^c \\ y^c \end{bmatrix}. \qquad (2.38)
\]
It is convenient to write (2.38) as a linear system, which can be done by appending unit elements to the vectors, giving the homogeneous coordinate representation
\[
\begin{bmatrix} x^n \\ y^n \\ 1 \end{bmatrix} \propto z^c \begin{bmatrix} x^n \\ y^n \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}}_{\Pi_0} \begin{bmatrix} x^c \\ y^c \\ z^c \\ 1 \end{bmatrix}. \qquad (2.39)
\]
With focal length $f = 1$ the normalised image frame n and the image frame i coincide. The projection (2.39) can be expressed as
\[
\lambda m^n = \Pi_0 m^c, \qquad (2.40)
\]
where $\lambda \in R_+$ is an arbitrary scale factor. It is also convenient to define the projection as a function $P([X, Y, Z]^T) = [X/Z, Y/Z]^T$, which is a map in Euclidean space $P : R^3 \to R^2$. A projection model which is suitable for omnidirectional cameras is the spherical perspective projection, which has a scale described by $\lambda = \sqrt{X^2 + Y^2 + Z^2}$, whereas in a planar perspective projection $\lambda = Z$. This means that the same equations can be used for both cases and only the depth is described differently. More advanced models for omnidirectional cameras are accounted for in Scaramuzza et al.
(2006).

The digitalisation of the image plane i into the pixel plane p is
\[
\begin{bmatrix} x^p \\ y^p \end{bmatrix} = \begin{bmatrix} f_x & s_\alpha \\ 0 & f_y \end{bmatrix} \begin{bmatrix} x^i \\ y^i \end{bmatrix} + \begin{bmatrix} h_x \\ h_y \end{bmatrix}, \qquad (2.41)
\]
or in homogeneous coordinates
\[
\begin{bmatrix} x^p \\ y^p \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} f_x & s_\alpha & h_x \\ 0 & f_y & h_y \\ 0 & 0 & 1 \end{bmatrix}}_{K} \begin{bmatrix} x^i \\ y^i \\ 1 \end{bmatrix}, \qquad (2.42)
\]
where $K$ is referred to as the intrinsic parameter matrix or simply the calibration matrix. The parameters of the calibration matrix are the focal lengths in pixel units, $f_x = s_x/f$ and $f_y = s_y/f$, where $f$ is the focal length expressed in metres per pixel and $s_x, s_y$ are the pixel sizes in metric units. The centre of the image plane, $h = [h_x \; h_y]^T$, is the principal point coordinate, which is shifted w.r.t., the pixel frame, which has its origin in the upper left corner, see Figure 2.2. The location of the optical centre is not needed for pure computer vision problems, but it is important when vision sensors are combined with, for instance, inertial sensors. The skew parameter $s_\alpha = f_x \tan\alpha$ can safely be assumed $s_\alpha \approx 0$ (Hartley and Zisserman, 2004) in most cameras. Cameras, especially in the lower price range, suffer from non-negligible distortion due to the optical lenses involved and the mounting of the sensor. This can to some extent be compensated for by tangential, radial and pincushion distortion models (Ma et al., 2003). Camera matrices and distortion models can be estimated in standard camera calibration software, see e.g., Bouguet (2010); Zhang (2000).

Figure 2.3: Camera to world transformation.

Points in another frame, say w, can be expressed in the camera frame c. In the pinhole camera model such a transformation is called extrinsic, since it does not depend on the intrinsic camera calibration matrix $K$. Figure 2.3 describes the relation between a point $m^w$ in world coordinates w expressed in camera coordinates c,
\[
m^c = R^{cw}(m^w - c^w) = R^{cw} m^w - R^{cw} c^w, \qquad (2.43)
\]
which can be written in homogeneous coordinates as
\[
\begin{bmatrix} x^c \\ y^c \\ z^c \\ 1 \end{bmatrix} = \begin{bmatrix} R^{cw} & -R^{cw} c^w \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x^w \\ y^w \\ z^w \\ 1 \end{bmatrix}. \qquad (2.44)
\]
Combining the extrinsic and intrinsic parameters, coordinates in the world frame can be expressed in pixel coordinates as
\[
\begin{bmatrix} x^p \\ y^p \\ 1 \end{bmatrix} \propto z^c \begin{bmatrix} x^p \\ y^p \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & s_\alpha & h_x \\ 0 & f_y & h_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R^{cw} & -R^{cw} c^w \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x^w \\ y^w \\ z^w \\ 1 \end{bmatrix}. \qquad (2.45)
\]
In compact notation (2.45) can be written as
\[
x^p = P x^w, \qquad (2.46)
\]
and the matrix $P$ is often just called the camera. In the problems studied, the calibration matrix and distortion models are known, which allows working directly on normalised camera measurements $m^n$, and thus the pinhole camera acts as a projective map in Euclidean space.

2.6 Camera Measurement Models

Parametrisation of camera measurements can be done in several ways, and the most suitable choice typically depends on the application at hand. For instance, in a filtering context with explicit landmark coordinates included in the state vector, measurement models are naturally expressed using the state. Geometrical properties, such as the epipolar constraint, can be used to define implicit measurement models. In this section three different parametrisations that are used in the publications are described.

2.6.1 Direct Parametrisation

From the rigid body transformation in (2.43), the projection of a single measurement of a landmark $m^w$ onto the normalised image plane is of the form
\[
y^m = P(m^c_t) + e^m_t = P(R^{cw}_t (m^w - c^w_t)) + e^m_t, \qquad (2.47)
\]
where $c^w_t$ is the position of the moving camera in the world frame and $e^m_t$ is the so-called re-projection error. This is the preferred error to be minimised, since it has a sound interpretation and it is straightforward to include intrinsic and extrinsic parameters. We call this the direct parametrisation, and it generalises to n landmark measurements as
\[
\begin{bmatrix} y^{m_1} \\ \vdots \\ y^{m_n} \end{bmatrix} = \begin{bmatrix} P(R^{cw}_t (m^w_1 - c^w_t)) \\ \vdots \\ P(R^{cw}_t (m^w_n - c^w_t)) \end{bmatrix} + e^{m_{1:n}}_t, \qquad (2.48)
\]
which is the common situation in most practical applications. Depending on which parameters are sought, the model can be used in different ways.
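As an illustration of the chain from world point to pixel, a minimal sketch of the rigid body transformation followed by perspective division and the calibration matrix, using made-up calibration and pose values; a point on the optical axis should project to the principal point.

```python
import numpy as np

def project(m_w, R_cw, c_w, K):
    """Pinhole projection of a world point into pixel coordinates:
    world -> camera frame, perspective division to the normalised
    image plane, then the intrinsic calibration matrix K."""
    m_c = R_cw @ (m_w - c_w)                   # world -> camera frame
    m_n = m_c[:2] / m_c[2]                     # normalised image coordinates
    u = K @ np.array([m_n[0], m_n[1], 1.0])    # pixel coordinates (homogeneous)
    return u[:2]

# Made-up calibration and pose for illustration.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 780.0, 240.0],
              [  0.0,   0.0,   1.0]])
R_cw = np.eye(3)        # camera axes aligned with the world axes
c_w = np.zeros(3)       # camera at the world origin
# A point on the optical axis projects to the principal point (hx, hy).
uv = project(np.array([0.0, 0.0, 5.0]), R_cw, c_w, K)
```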
If, for instance, only the landmarks are unknown, then the problem is referred to as mapping or reconstruction, and if only the camera pose is unknown the corresponding problem is called localisation or navigation.

2.6.2 Inverse Depth

In (2.47) the measured world point is represented naturally by its Euclidean coordinate. This may however be a poor choice if some points are at a much greater distance from the camera than other points in the scene. This is because the re-projection error cost function becomes very flat in the XYZ-space (Torr and Zisserman, 2000), since the XY coordinates are divided by a large number. This is partly the motivation for using so-called key-frames, where only images with sufficient baseline are used for triangulation, see e.g., (Nistér, 2000) for a discussion. The relation to stereo cameras is that range resolution is inversely proportional to the baseline between the cameras and the corresponding change in disparity (image difference for a stereo pair). This phenomenon has a severe effect in the context of filtering, since linearisation errors will be large and may cause the filter to diverge. Bryson et al. (2010) tackle this by tracking landmark coordinates to find the two views with maximal angular separation, parallax, for each landmark and then triangulating the landmark depth based on these two observations alone, known as delayed initialisation. A slightly different approach was proposed by Montiel and Davison (2006), who introduced the Inverse Depth Parametrisation (IDP). It uses six parameters, where the first three are the coordinates $c^w_j$ of the camera from which the landmark was first observed, and the remaining three describe the vector from the camera to the landmark, encoded by two angles $\varphi^w, \theta^w$ and the inverse depth $\rho^w$:
\[
m^w = c^w_j + \frac{1}{\rho^w} d(\varphi^w, \theta^w), \qquad (2.49a)
\]
\[
d(\varphi^w, \theta^w) = \begin{bmatrix} \cos\varphi^w \sin\theta^w \\ \sin\varphi^w \sin\theta^w \\ \cos\theta^w \end{bmatrix}. \qquad (2.49b)
\]
The angles are computed from the normalised pixel coordinates as
\[
g^w = R^{wc}_t m^n, \qquad (2.50)
\]
\[
\varphi^w = \arctan2(g^w_y, g^w_x), \qquad (2.51)
\]
\[
\theta^w = \arctan2\left(\|[g^w_x \; g^w_y]^T\|_2, g^w_z\right), \qquad (2.52)
\]
where arctan2 is the four-quadrant arctangent function, and the inverse depth is initiated with an educated guess. IDP has a small linearisation error even for large uncertainty in depth, and it is easy to represent the range of depth uncertainty, including infinity, in a confidence region. Also, with IDP delayed landmark initialisation is no longer necessary. Obviously more states are needed; however, as soon as the depth uncertainty is small enough the IDP landmarks can be converted into standard Euclidean coordinates. The corresponding measurement equation at time t for a landmark initiated at time j in the camera frame is
\[
m^c_t = R^{cw}_t \left( \rho^w_t (c^w_j - c^w_t) + d(\varphi^w_t, \theta^w_t) \right), \qquad (2.53a)
\]
\[
y^c_t = P(m^c_t). \qquad (2.53b)
\]
The IDP model, including the sensor frame offset, was used for fusion with IMU measurements in Paper B. The same sensor setup was used in (Pinies et al., 2007), showing that feature initialisation and prediction in difficult cases, such as forward motion, can be handled better using IDP with the support of an IMU.

2.6.3 Epipolar Geometry

An important concept in computer vision is how to relate the relative pose of two cameras through point observations. This is called the epipolar constraint and is usually credited to the publication of Longuet-Higgins (1981); it is described in most books on computer vision, see e.g., Ma et al. (2003) or Hartley and Zisserman (2004). Consider two images acquired from different vantage points, and let $R$ and $c$ denote the rotation and translation from the first to the second camera, respectively. Without loss of generality the first camera is located at the origin with no rotation. Thus, a world coordinate expressed in the first camera is simply $m^w = m^c_1$.
The same point in the second camera is then related by a rigid body transformation
\[
m^c_2 = R\, m^c_1 + c. \qquad (2.54)
\]
Since camera coordinates are related to homogeneous image coordinates through their unknown depths $\lambda$ as $m^c = \lambda m^n$, we have
\[
\lambda_2 m^n_2 = R \lambda_1 m^n_1 + c. \qquad (2.55)
\]
The intuition is that the three vectors connecting the two cameras and the point all lie in the same plane, and therefore the triple product of the vectors is zero. The triple product is the inner product of one vector with the cross product of the other two. In this case it can be constructed by first eliminating the translation from the right hand side of (2.55), multiplying with the cross product of $c$ from the left and using
\[
c \times c = 0. \qquad (2.56)
\]
Then multiply from the left with $(m^n_2)^T$ to construct the two triple products
\[
(m^n_2)^T \left( c \times \lambda_2 m^n_2 \right) = (m^n_2)^T \left( c \times R \lambda_1 m^n_1 \right), \qquad (2.57)
\]
in which the left hand side is zero, since $c \times \lambda_2 m^n_2$ is perpendicular to $m^n_2$. Now the epipolar (or bilinear) constraint is obtained,
\[
(m^n_2)^T \left( c \times R \lambda_1 m^n_1 \right) = 0 \implies (m^n_2)^T \left( c \times R\, m^n_1 \right) = 0, \qquad (2.58)
\]
since $\lambda_1$ is non-zero. The product of the translation and the rotation is called the essential matrix,
\[
E = c \times R = [c]_\times R, \qquad (2.59)
\]
where $[c]_\times$ is the matrix form of the cross product,
\[
[c]_\times = \begin{bmatrix} 0 & -c_z & c_y \\ c_z & 0 & -c_x \\ -c_y & c_x & 0 \end{bmatrix}. \qquad (2.60)
\]
This means that each correspondence gives an equation
\[
m_2^T E\, m_1 = 0, \qquad (2.61)
\]
where the n superscript has been dropped for notational convenience. Note that $E$ is homogeneous, meaning that it is only defined up to an unknown scale. It has 5 or 6 degrees of freedom, depending on how it is constructed. If the translation is only known up to an unknown positive scale, i.e., it is also a homogeneous quantity, then $E$ will have 5 DOF. This is typically the case when $E$ is estimated from image data, since the absolute scale of the scene is unknown without other information. If the scaling is known then $E$ has 6 DOF.
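The construction (2.59)–(2.61) can be verified numerically: for noise-free correspondences generated from a known relative pose, the epipolar residuals $m_2^T E m_1$ vanish. A minimal sketch with a made-up rotation and translation:

```python
import numpy as np

def skew(c):
    """[c]x, the matrix form of the cross product, eq. (2.60)."""
    return np.array([[0.0, -c[2], c[1]],
                     [c[2], 0.0, -c[0]],
                     [-c[1], c[0], 0.0]])

rng = np.random.default_rng(0)
# Made-up relative pose: small rotation about z and a sideways translation.
a = 0.1
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
c = np.array([1.0, 0.2, 0.0])
E = skew(c) @ R                      # essential matrix, eq. (2.59)

# Random points in front of the first camera; m2 = R m1 + c, eq. (2.54).
m1 = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(10, 3))
m2 = m1 @ R.T + c
# Homogeneous normalised image coordinates (divide by depth).
m1n = np.c_[m1[:, :2] / m1[:, 2:], np.ones(10)]
m2n = np.c_[m2[:, :2] / m2[:, 2:], np.ones(10)]
residuals = np.einsum('ij,jk,ik->i', m2n, E, m1n)   # m2^T E m1 per point
```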
In case the calibration of the cameras is unknown, the normalised image coordinates will also be unknown, and the epipolar constraint then encodes the fundamental matrix $F = K_2^{-T} E K_1^{-1}$, where $K_1$ and $K_2$ denote the calibration of the cameras at the two views. The epipolar constraint is a generic property which holds for any two vantage points as long as there are correspondences. It can therefore be used to compactly represent information in terms of a rigid body transformation up to a scale of the translation. Epipolar geometry can also be used for simplifying correspondence search and verification. If the relative pose is known, a point in the first frame corresponds to a line in the other one, and vice versa, according to
\[
l_2 \sim E m_1, \qquad (2.62a)
\]
\[
l_1 \sim E^T m_2, \qquad (2.62b)
\]
where $\sim$ accounts for the unknown scale. This also highlights that cameras are bearing sensors. Thus, one may search for the correspondence along a line in the other image, see Figure 2.4. With uncertainty in the relative pose, a band is the typical search region. Correspondence candidates may also be deemed correct or false by evaluating a cost defined by the point's distance to the line, $\|l_2^T m_2\|$, where $l_2$ is normalised. This is also the basis for efficient correspondence search using stereo cameras.

Figure 2.4: The two top figures show five of the SIFT matches between the two frames which were used to compute the essential matrix. The bottom left image shows the matches in the first frame and the bottom right figure shows the epipolar lines with matches in the second frame.

Essential Matrix Estimation

Essential matrix estimation is used to find the relative pose between cameras from image correspondences such that the epipolar constraint (2.61) is satisfied for all correspondences. The linear eight-point algorithm, see e.g., (Ma et al., 2003)[p. 121], is a simple essential matrix estimation algorithm.
The algorithm computes an estimate of the essential matrix from a minimum of eight correspondences by computing a singular value decomposition (SVD). It starts from noting that the epipolar constraint can be transformed as
\[
m_2^T E m_1 = 0 \iff (m_1 \otimes m_2)^T E^s = 0, \qquad (2.63)
\]
where $E^s \in R^{9 \times 1}$ is the vector of stacked columns of $E$ and $\otimes$ denotes the Kronecker product. Using n correspondences results in
\[
\underbrace{\begin{bmatrix} (m_1^1 \otimes m_2^1)^T \\ \vdots \\ (m_1^n \otimes m_2^n)^T \end{bmatrix}}_{A} E^s = 0, \qquad (2.64)
\]
which is linear in the unknown $E^s$. An approximation of the unit length $E^s$ solving (2.64) is given by minimising $\|A E^s\|$. This is done by computing the SVD of $A$ and defining $E^s$ as the singular vector corresponding to the smallest singular value. The required number of correspondences is eight, since $E$ has nine entries which are only defined up to scale, which means that there are only eight unknowns. Due to noise, the recovered matrix will often not lie in the essential space; a solution is to project the matrix onto this space. To further suppress noise the number of correspondences should be more than eight. There are, however, situations where the solution of the algorithm is not well defined, e.g., if all correspondence points lie in the same plane or the baseline $c \approx 0$. This is partly because the actual degrees of freedom are not taken into account in the algorithm, since it considers eight unknowns. Another weakness is the sensitivity to outliers, since there is no mechanism for handling these directly. Despite these limitations, the eight-point algorithm was successfully used in Paper D and Paper E for initialisation of rotations without considering potential outliers. However, outliers were later removed in an iterative fashion using IMU data. The result from eight-point-like methods can be improved by minimising the re-projection error (2.47) using nonlinear optimisation. This requires the landmark coordinates to be known.
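A minimal sketch of the linear eight-point step described above, tested on noise-free synthetic correspondences from a made-up pose; the estimate is compared with the true essential matrix up to scale and sign.

```python
import numpy as np

def skew(c):
    """[c]x, the matrix form of the cross product, eq. (2.60)."""
    return np.array([[0.0, -c[2], c[1]],
                     [c[2], 0.0, -c[0]],
                     [-c[1], c[0], 0.0]])

def eight_point(m1n, m2n):
    """Linear eight-point estimate of E from n >= 8 correspondences in
    homogeneous normalised coordinates (rows of m1n and m2n).
    Builds A as in (2.64) and takes the right singular vector of A with
    the smallest singular value as the unit-norm E^s."""
    A = np.array([np.kron(p1, p2) for p1, p2 in zip(m1n, m2n)])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3, order='F')   # undo the column stacking

# Noise-free synthetic data from a made-up relative pose (R, c).
rng = np.random.default_rng(1)
a = 0.1
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
c = np.array([1.0, 0.2, 0.0])
m1 = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
m2 = m1 @ R.T + c
m1n = np.c_[m1[:, :2] / m1[:, 2:], np.ones(12)]
m2n = np.c_[m2[:, :2] / m2[:, 2:], np.ones(12)]

E_hat = eight_point(m1n, m2n)
E_true = skew(c) @ R
E_true = E_true / np.linalg.norm(E_true)   # compare up to scale and sign
if np.sum(E_hat * E_true) < 0:
    E_hat = -E_hat
```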
Instead, a suitable error which is only parametrised by the essential matrix is the so-called Sampson error. For n correspondences $\{m_1^i \leftrightarrow m_2^i\}_{i=1}^{n}$ it reads
\[
V(E) = \sum_{i=1}^{n} \frac{\left( (m_2^i)^T E m_1^i \right)^2}{(E m_1^i)_1^2 + (E m_1^i)_2^2 + (E^T m_2^i)_1^2 + (E^T m_2^i)_2^2}, \qquad (2.65)
\]
where $(Em)_i$ picks out the i-th entry of the epipolar line $Em$. The Sampson error is a first order approximation of the re-projection error, see e.g., Hartley and Zisserman (2004), and it is extensively used in geometric computer vision problems. An approach which exploits the 5 DOF imposed by the construction of the essential matrix is the five-point algorithm proposed by Nistér (2004). The idea Nistér used was that the essential matrix has two equal non-zero singular values, and this gives rise to a cubic matrix constraint. The algorithm computes the coefficients of a tenth degree polynomial and finds its roots. A perhaps simpler implementation can be found in (Li and Hartley, 2006), which solves the tenth degree polynomial using a hidden variable resultant. An important observation is that the five-point method seems less sensitive than the eight-point algorithm to cases where, e.g., all points lie on a plane. It also attains higher accuracy, because a minimal solver may better exploit the geometric constraints (Li and Hartley, 2006).

RANSAC

Efficient parametrisation which allows fast and/or closed form solutions plays an important role in computer vision algorithms. This is because the set of correspondences between any two images will often contain gross outliers due to association errors, which need to be handled. This is usually done by evaluating several candidate models in a RANSAC loop (Fischler and Bolles, 1981). In practice, higher accuracy is obtained by evaluating as many candidates as possible, and this is why fast solutions are important: they allow more trials given the computational resources.
The acronym RANSAC comes from RANdom SAmple Consensus (Fischler and Bolles, 1981), and its paradigm has had a remarkable impact in computer vision. It is an iterative method which aims at finding a model such that the number of inliers is maximised. All that is needed in its basic form is a set of correspondence candidates and some error metric to evaluate the model. The RANSAC algorithm starts by randomly selecting the minimal set needed to compute a model, and then all the other correspondences are evaluated by the error metric. A correspondence pair is labelled as an inlier if its error falls below a predefined threshold, and is otherwise considered to be an outlier. For each model the number of inliers is saved; this is often referred to as scoring. The Sampson error (2.65) is a popular candidate for scoring the essential matrices within the RANSAC loop. The model with the maximum number of inliers, i.e., the largest consensus set, is considered to be the best. Optionally, a final step is to estimate the model again using all the inliers, e.g., by minimising (2.65). The greatest advantage of RANSAC is the ability to robustly estimate the parameters of a model when the measurements are contaminated with outliers, while simultaneously finding an inlier set. The number of samples needed for a robust estimate also depends on the number of correspondences needed to estimate the model. It is therefore desirable to have a minimal parametrisation, since it is then more likely that a sample contains only inliers. RANSAC can also be used purely as a sampling based method for outlier rejection. A clear limitation is that there is no bound on how many times the loop has to be run in order to meet some stopping criterion, such as a specific inlier/outlier ratio. RANSAC typically works poorly if less than fifty percent of the correspondences are inliers. Despite (or due to) these limitations there are probably hundreds of RANSAC versions published which have different characteristics.
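The basic loop just described is model-agnostic. The following toy sketch applies it to a simple line-fitting problem with made-up data and thresholds, including the optional least-squares re-estimation on the final consensus set:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy data: points on the line y = 2x + 1 plus roughly 30% gross outliers.
n = 100
x = rng.uniform(-5, 5, n)
y = 2 * x + 1 + rng.normal(0, 0.05, n)
out = rng.random(n) < 0.3
y[out] += rng.uniform(-20, 20, out.sum())

best_inliers, best_model = None, None
for _ in range(200):                      # fixed number of trials
    i, j = rng.choice(n, size=2, replace=False)   # minimal sample: 2 points
    if x[i] == x[j]:
        continue
    k = (y[j] - y[i]) / (x[j] - x[i])     # candidate model from the sample
    m = y[i] - k * x[i]
    err = np.abs(y - (k * x + m))         # error metric for scoring
    inliers = err < 0.2                   # threshold labels the inliers
    if best_inliers is None or inliers.sum() > best_inliers.sum():
        best_inliers, best_model = inliers, (k, m)

# Optional final step: least-squares refit on the largest consensus set.
A = np.c_[x[best_inliers], np.ones(best_inliers.sum())]
k_hat, m_hat = np.linalg.lstsq(A, y[best_inliers], rcond=None)[0]
```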
For instance, the Maximum Likelihood version called MLESAC (Torr and Zisserman, 2000) is a good option, which modifies the score by penalising outliers. An example of essential matrix estimation using RANSAC is shown in Figure 2.5, where the integrated yaw angle of the estimated rotation matrices is compared with the yaw angle obtained by integrating the gyroscope signal. The yaw angle estimate from the camera works well for the first 15 seconds and then deviates rapidly from the integrated yaw angle of the gyroscope. The gyroscope can be considered a reliable ground truth for this short experiment. This highlights some of the different characteristics of the two sensors. Gyroscopes can handle fast dynamics but may deteriorate over time due to noise integration, while cameras, which may have problems with fast dynamics, can be used to estimate orientation accurately for moderate dynamics over a long time.

2.7 Computer Vision

The intention with this section is only to provide some basic concepts of computer vision. The interested reader is referred to the two survey papers by Fraundorfer and Scaramuzza (2012); Scaramuzza and Fraundorfer (2011), and references therein, for a modern and detailed exposition. In order to use the camera measurement models some image processing is needed. Image characteristics having a strong response in a detector are used to define so-called interest points. For this to work, the scene has to have some structure. The most commonly used detectors are so-called blob detectors based on difference of Gaussians (DoG), such as the scale invariant feature transform (SIFT) (Lowe, 1999), or detectors based on distinctive corners, such as the Harris detector (Harris and Stephens, 1988). The process of tracking interest points over several image frames, also known as correspondence generation, relies on selecting well localised interest points and some local interest point description which can be located in the following frames.
Although we have not used patch based methods for tracking in the publications, a brief description is given here for completeness.

Figure 2.5: Integrated yaw angle from the 5-point algorithm using RANSAC on the Sampson error compared with the integrated gyroscope measurements on data from a flying platform.

For Harris-like features, a patch $U$ around the interest point $x$ can be used to search for its corresponding location in the next frame $J$, only assuming a simple displacement model in the image,
\[
J(x + d) = U(x). \qquad (2.66)
\]
This is a popular model which usually works well for small perspective changes, and it is used in most Kanade-Lucas-Tomasi (KLT) trackers, as explained in Shi and Tomasi (1994). More advanced transformation models for patch matching, allowing for rotation, scaling and deformation, as in
\[
J(x + Dx + d) = U(x), \qquad (2.67)
\]
where $D \in R^{2 \times 2}$ is a deformation matrix, are particularly useful in longer sequences. The SIFT detector also stores a feature descriptor for each detected interest point, see Figure 2.6. In the standard setting the descriptor vector has 128 values, corresponding to an 8-bin histogram of gradient orientations for each of the 16 local sub-regions around the feature point. The SIFT descriptor can be used for generating correspondences without using any transformation model, simply by comparing descriptors using any suitable metric, e.g., the Euclidean distance. This way, matches are found using the appearance of feature descriptors and not their locations. The set of matches which corresponds to the same feature is called a feature track (Thormählen et al., 2008). Such methods are particularly useful if there is a great deal of uncertainty about motion and/or structure, or at loop closing. In Paper D a purely appearance based approach is used for an initial correspondence search.
That is, without assuming anything about the motion between frames, correspondences are found only by matching SIFT feature descriptors, not using their point location information. SIFT descriptors and their predicted locations can also be used directly for tracking, which was done in Paper B. A key advantage of SIFT is that the feature descriptors are rather insensitive w.r.t., changes in scale, illumination, rotation and even viewpoint changes up to 60° (Fraundorfer and Scaramuzza, 2012). These properties make SIFT suitable for matching over wide baselines, which also implies that good tracking results and essential matrix estimates can be obtained using a lower sampling frequency compared to KLT-like trackers.

Figure 2.6: Ten SIFT features are shown with their centre location given by the circles. Descriptors are computed at different scales, with sixteen histograms of gradient orientations indicated by the arrows in each square. Thus, a feature may have multiple orientations depending on its local appearance.

3 Estimation

Estimation is the problem of processing measured data, $y_t$, to reveal properties, $x_t$ and $\theta$, of systems which are of interest. Basic ingredients are model structures representing the assumptions made on the interaction and time evolution of $x_t$, $\theta$ and $y_t$. As tools for solving estimation problems, state space models
\[
x_t = f(x_{t-1}, u_t, \theta) + w_t, \qquad (3.1a)
\]
\[
y_t = h(x_t, \theta) + e_t, \qquad (3.1b)
\]
are commonly used model structures. State-space formulations can also be described in terms of their probability density functions (PDFs),
\[
x_t \sim p(x_t \mid x_{t-1}, \theta), \qquad (3.2a)
\]
\[
y_t \sim p(y_t \mid x_t, \theta), \qquad (3.2b)
\]
where (3.2a) is known as the state transition density, (3.2b) is known as the measurement likelihood and the symbol $\sim$ corresponds to a distribution relation.

3.1 Sensor Fusion

Sensor fusion is the process of combining multi-sensory data in a clever way to obtain a filtered state estimate, $\hat{x}_t$, or a state sequence estimate $\{\hat{x}_i\}_{i=0}^{t} = \hat{x}_{0:t}$.
A state-space model is the key component of many sensor fusion algorithms, and the model parameters, $\theta$, are usually not of interest. Sensor fusion often uses maximum likelihood (ML) (Fisher, 1912) estimators, which have the form
\[
\hat{\theta}^{ML} = \arg\max_{\theta} \; p(Y \mid \theta). \qquad (3.3)
\]
Another important estimator is the maximum a posteriori (MAP) estimator,
\[
\hat{\theta}^{MAP} = \arg\max_{\theta} \; p(\theta \mid Y) = \arg\max_{\theta} \; \frac{p(Y \mid \theta) p(\theta)}{p(Y)} = \arg\max_{\theta} \; p(Y \mid \theta) p(\theta), \qquad (3.4)
\]
where the second equality is known as Bayes' theorem and the last equality uses the fact that the maximising argument is independent of the normalising constant $p(Y)$.

3.1.1 Smoothing and Filtering

The smoothed state $\hat{x}_{0:t}$ defines an estimate of the whole state trajectory and it can be obtained as the MAP estimate of the state sequence given the measurements,
\[
\hat{x}_{0:t} = \arg\max_{x_{0:t}} \; p(y_{0:t} \mid x_{0:t}) p(x_{0:t}). \qquad (3.5)
\]
With Gaussian noise and initial state $x_0 \sim \mathcal{N}(\bar{x}_0, P)$, taking the negative log of (3.5) times two yields a Gaussian MAP estimation problem,
\[
\hat{x}_{0:t} = \arg\min_{x_{0:t}} \; \|\bar{x}_0 - x_0\|^2_{P^{-1}} + \sum_{i=1}^{t} \|x_i - f(x_{i-1})\|^2_{Q^{-1}} + \sum_{i=1}^{t} \|y_i - h(x_i)\|^2_{R^{-1}} = \arg\min_{x_{0:t}} \; V(x_{0:t}), \qquad (3.6)
\]
where the terms not directly depending on $x_{0:t}$ have been left out; see for instance Rao (2000) for details. This is also a nonlinear least squares formulation, a topic which will be treated in Section 4.3.3. Note that if only a part of the whole batch is considered at each time step, a so-called moving horizon estimation problem is obtained. Alternatively, the process and measurement equations can be viewed as equality constraints, and the constrained formulation of (3.6) is then
\[
\hat{x}_{0:t} = \arg\min_{x_{0:t}} \; \|\bar{x}_0 - x_0\|^2_{P^{-1}} + \sum_{i=1}^{t} \|w_i\|^2_{Q^{-1}} + \sum_{i=1}^{t} \|e_i\|^2_{R^{-1}}, \qquad (3.7a)
\]
subject to
\[
x_i = f(x_{i-1}) + w_i, \qquad (3.7b)
\]
\[
y_i = h(x_i) + e_i. \qquad (3.7c)
\]
MAP estimation can be extended to include other parameters in $f$ and $h$, besides the state, resulting in a joint smoothing and parameter estimation problem.
The particular benefit of this optimisation viewpoint is that it is straightforward to add constraints, which is not easily done in a filtering context. In the case of linear dynamics and linear measurement equations, (3.6) becomes a convex optimisation problem that can be efficiently implemented as a Rauch-Tung-Striebel (RTS) smoother, Rauch et al. (1965). A straightforward, yet approximate, extension to nonlinear systems is given by the extended-RTS (E-RTS) smoother, where the forward filter is realised using an EKF. This method is used in Paper E. Note that in this nonlinear setting the E-RTS may be improved using iterations, step control, and other techniques, since in the end we are just solving a nonlinear optimisation problem.

Extended Kalman Filter

A popular estimator is the extended Kalman filter, which is described in many books, see e.g., Kailath et al. (2000). The EKF works in a two step procedure summarised in Algorithm 1. As with the E-RTS there are no convergence guarantees, since the involved functions are nonlinear.

Algorithm 1 Extended Kalman Filter
Available measurements are $Y = \{y_1, \ldots, y_N\}$. Require an initial state, $\hat{x}_{0|0}$, and an initial state covariance, $P_{0|0}$, and use the models (3.1).

1. Time Update
\[
\hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}, u_t), \qquad (3.8a)
\]
\[
P_{t|t-1} = F_t P_{t-1|t-1} F_t^T + Q_t. \qquad (3.8b)
\]
2. Measurement Update
\[
S_t = H_t P_{t|t-1} H_t^T + R_t, \qquad (3.9a)
\]
\[
K_t = P_{t|t-1} H_t^T S_t^{-1}, \qquad (3.9b)
\]
\[
\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left( y_t - h(\hat{x}_{t|t-1}) \right), \qquad (3.9c)
\]
\[
P_{t|t} = P_{t|t-1} - K_t H_t P_{t|t-1}, \qquad (3.9d)
\]
where
\[
F_t \triangleq \frac{\partial f(x_t, u_t)}{\partial x_t}\bigg|_{(x_t, u_t) = (\hat{x}_{t-1|t-1}, u_t)}, \qquad H_t \triangleq \frac{\partial h(x_t)}{\partial x_t}\bigg|_{x_t = \hat{x}_{t|t-1}}, \qquad (3.10)
\]
while $Q_t$ and $R_t$ are the covariance matrices of $w_t$ and $e_t$, respectively.

3.2 Optimisation

Many batch and filtering problems can be formulated in terms of optimisation programs, for which there are many software packages readily available.
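Before turning to the general program, the two-step procedure of Algorithm 1 can be sketched compactly. The sanity check below runs it on a linear constant-velocity model with made-up noise levels, in which case the EKF coincides with the linear Kalman filter:

```python
import numpy as np

def ekf_step(x, P, u, y, f, h, F, H, Q, R):
    """One EKF iteration following Algorithm 1: time update (3.8),
    then measurement update (3.9). f, h are the models in (3.1);
    F, H return their Jacobians at the linearisation points (3.10)."""
    x_pred = f(x, u)                          # (3.8a)
    Ft = F(x, u)
    P_pred = Ft @ P @ Ft.T + Q                # (3.8b)
    Ht = H(x_pred)
    S = Ht @ P_pred @ Ht.T + R                # (3.9a)
    K = P_pred @ Ht.T @ np.linalg.inv(S)      # (3.9b)
    x_new = x_pred + K @ (y - h(x_pred))      # (3.9c)
    P_new = P_pred - K @ Ht @ P_pred          # (3.9d)
    return x_new, P_new

# Linear constant-velocity sanity check; all numbers are made up.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
Q = 1e-4 * np.eye(2)
R = np.array([[1e-2]])
rng = np.random.default_rng(3)
x_true = np.array([0.0, 1.0])
x_est, P = np.zeros(2), np.eye(2)
for _ in range(200):
    x_true = A @ x_true
    y = C @ x_true + rng.normal(0, 0.1, 1)
    x_est, P = ekf_step(x_est, P, None, y,
                        lambda x, u: A @ x, lambda x: C @ x,
                        lambda x, u: A, lambda x: C, Q, R)
err = np.abs(x_est - x_true)
```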
A quite general optimisation program is

    minimise_{\theta} \quad V(\theta) \quad (3.11a)
    subject to \quad c_E(\theta) = 0, \quad (3.11b)
               \quad c_I(\theta) \leq 0, \quad (3.11c)

where V : \mathbb{R}^n \to \mathbb{R} is the objective function, c_E are equality constraints, c_I are inequality constraints and θ are the variables. The Lagrange function, often just called the Lagrangian, is obtained by taking the optimisation problem (3.11) and augmenting the objective function with a weighted sum of the constraints as

    L(\theta, \lambda, \nu) = V(\theta) + \lambda^T c_E(\theta) + \nu^T c_I(\theta), \quad (3.12)

with associated dual variable vectors, λ and ν. The first order necessary optimality conditions are

    \nabla V(\theta) + \lambda^T \nabla c_E(\theta) + \nu^T \nabla c_I(\theta) = 0, \quad (3.13a)
    c_E(\theta) = 0, \quad (3.13b)
    c_I(\theta) \leq 0, \quad (3.13c)
    \nu \geq 0, \quad (3.13d)
    \mathrm{diag}(\nu)\, c_I(\theta) = 0, \quad (3.13e)

which have to be satisfied at the optimum (θ*, λ*, ν*), and (3.13a) is the gradient of the Lagrangian (3.12). These equations are often referred to as the Karush-Kuhn-Tucker (KKT) conditions, see e.g., (Boyd and Vandenberghe, 2004; Nocedal and Wright, 2006) for detailed explanations and some historical notes. For convex problems the KKT conditions are also sufficient, whereas for non-convex problems a KKT point is merely a candidate solution.

Specific classes of problems can be identified depending on, e.g., the properties of V and the choice of variables, among other things. If, for example, the objective function is V(θ) = \frac{1}{2}\theta^T G \theta + \theta^T c, where G is a symmetric matrix, and the constraints are linear in θ, then a quadratic program (QP) is obtained. Many problems can be transformed into an equivalent QP, and an instructive example on linear Kalman filtering is given below.

Example 3.1: KF measurement update as a Quadratic Program
The Kalman filter iterates two equations for the state,

    \hat{x}_{t|t-1} = A \hat{x}_{t-1|t-1}, \quad (3.14a)
    \hat{x}_{t|t} = \hat{x}_{t|t-1} + P_{t|t-1} C^T \left(C P_{t|t-1} C^T + R\right)^{-1} \left(y_t - C \hat{x}_{t|t-1}\right), \quad (3.14b)

where (3.14a) is the time update and (3.14b) is the measurement update.
Equivalently, the measurement update (3.14b) can be specified as the following QP,

    \{\hat{x}_{t|t}, \hat{e}_t\} = \arg\min_{x, e_t} \quad e_t^T R^{-1} e_t + (x - \hat{x}_{t|t-1})^T P_{t|t-1}^{-1} (x - \hat{x}_{t|t-1}) \quad (3.15a)
    subject to \quad y_t = Cx + e_t, \quad (3.15b)

or alternatively, the unconstrained version is

    \hat{x}_{t|t} = \arg\min_{x} (y_t - Cx)^T R^{-1} (y_t - Cx) + (x - \hat{x}_{t|t-1})^T P_{t|t-1}^{-1} (x - \hat{x}_{t|t-1}). \quad (3.16)

The actual gain of these formulations, compared to the standard KF equations (3.14), is that constraints can be added easily and modification of the objective function becomes straightforward. One such example is the modelling of so-called heavy-tailed noise, which is a direct approach to handling non-Gaussian residuals. This can be treated by adding another variable z_t to (3.15), as in (Mattingley and Boyd, 2010), resulting in

    \{\hat{x}_{t|t}, \hat{e}_t, \hat{z}_t\} = \arg\min_{x, e_t, z_t} \quad e_t^T R^{-1} e_t + (x - \hat{x}_{t|t-1})^T P_{t|t-1}^{-1} (x - \hat{x}_{t|t-1}) + \lambda \|z_t\|_1 \quad (3.17a)
    subject to \quad y_t = Cx + e_t + z_t, \quad (3.17b)

where λ ≥ 0 is a design parameter controlling how much the \|z_t\|_1-norm (ℓ₁) should be favoured. This is no longer a QP; however, it can be transformed into another QP, which has the form

    \{\hat{x}_{t|t}, \hat{e}_t, \hat{z}_t\} = \arg\min_{x, e_t, z_t} \quad e_t^T R^{-1} e_t + (x - \hat{x}_{t|t-1})^T P_{t|t-1}^{-1} (x - \hat{x}_{t|t-1}) + \lambda \mathbf{1}^T u \quad (3.18a)
    subject to \quad y_t = Cx + e_t + z_t, \quad (3.18b)
               \quad -u \leq z_t \leq u. \quad (3.18c)

For small λ some elements in z_t will be exactly zero, whereas for large λ all elements will be zero and the original problem is obtained. This means that the filter is more robust with respect to non-Gaussian errors. The ℓ₁-norm can be interpreted in a statistical sense as z_t being Laplace distributed, p(z_t) = \mathcal{L}(0, 2\sigma^2/\lambda), see e.g., Hastie et al. (2009). ℓ₁-regularisation is also used to find sparse parameter estimates, and these methods are known as the least absolute shrinkage and selection operator (LASSO) (Tibshirani, 1996).
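The equivalence between the unconstrained QP (3.16) and the standard update (3.14b) is easy to verify numerically: (3.16) is a quadratic whose minimiser satisfies (C^T R^{-1} C + P_{t|t-1}^{-1}) x = C^T R^{-1} y_t + P_{t|t-1}^{-1} \hat{x}_{t|t-1}, the information-form update. A minimal sketch, with function names of my choosing:

```python
import numpy as np

def kf_measurement_update(x_pred, P_pred, C, R, y):
    """Standard KF measurement update (3.14b)."""
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
    return x_pred + K @ (y - C @ x_pred)

def qp_measurement_update(x_pred, P_pred, C, R, y):
    """Minimiser of the QP (3.16), i.e. the information-form normal equations:
    (C^T R^-1 C + P^-1) x = C^T R^-1 y + P^-1 x_pred."""
    P_inv = np.linalg.inv(P_pred)
    R_inv = np.linalg.inv(R)
    A = C.T @ R_inv @ C + P_inv
    b = C.T @ R_inv @ y + P_inv @ x_pred
    return np.linalg.solve(A, b)
```

Both functions return the same estimate (by the matrix inversion lemma); the QP form is the one that generalises to the constrained and heavy-tailed variants (3.15), (3.17) and (3.18).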
Constrained programs can be transformed into unconstrained counterparts using the Lagrangian and additional penalty terms to account for the dropped constraints. Many software packages solve constrained problems by recasting them as unconstrained ones, or vice versa, if it is more efficient to solve them that way. Popular unconstrained methods are steepest descent, (quasi-)Newton and trust-region methods, see e.g., Nocedal and Wright (2006); Dennis and Schnabel (1983).

3.2.1 Iterated Extended Kalman Filter

An interesting parallel between filtering and optimisation is that the measurement update in the EKF (3.9) is the solution to the following NLS problem,

    \hat{x}_{t|t} = \arg\min_{x} \frac{1}{2}\|y_t - h(x)\|^2_{R_t^{-1}} + \frac{1}{2}\|x - \hat{x}_{t|t-1}\|^2_{P_{t|t-1}^{-1}} = \arg\min_{x} V(x), \quad (3.19)

which is the nonlinear counterpart of (3.16). The EKF solution is given by a single full step of a Gauss-Newton procedure (Bertsekas, 1994). Note that in the filtering case it does not matter whether the state dynamics are nonlinear or not. The pure EKF may be a poor choice if, e.g., the predicted state is far from the true one. Although Gauss-Newton does not promise global (or even local) convergence, iterations may improve the estimate. This approach is used in the iterated EKF (IEKF), see e.g., Jazwinski (1970); Bell and Cathey (1993); Bar-Shalom and Li (1993), in which the measurement update for the state is iterated a few times,

    H_i = \frac{\partial h(x)}{\partial x}\Big|_{x = x_i}, \quad (3.20a)
    K_i = P_{t|t-1} H_i^T \left(H_i P_{t|t-1} H_i^T + R_t\right)^{-1}, \quad (3.20b)
    x_{i+1} = \hat{x}_{t|t-1} + K_i \left(y_t - h(x_i) - H_i(\hat{x}_{t|t-1} - x_i)\right), \quad (3.20c)

where the first iteration is exactly the same as the EKF measurement update. The measurement-updated state and the covariance approximation are

    \hat{x}_{t|t} = x_{i+1}, \quad (3.21a)
    P_{t|t} = P_{t|t-1} - K_i H_i P_{t|t-1}, \quad (3.21b)

where the covariance is only updated using the last Kalman gain and measurement Jacobian, K_i and H_i, respectively.
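The iterated update (3.20)-(3.21) can be sketched as follows (a minimal Python sketch, names mine). For a linear measurement model the relinearisation has no effect, so every iteration reproduces the ordinary EKF update:

```python
import numpy as np

def iekf_update(x_pred, P_pred, y, h, H_jac, R, n_iter=10):
    """Iterated EKF measurement update, equations (3.20)-(3.21)."""
    x = x_pred.copy()
    for _ in range(n_iter):
        H = H_jac(x)                                             # (3.20a)
        K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)   # (3.20b)
        x = x_pred + K @ (y - h(x) - H @ (x_pred - x))           # (3.20c)
    P = P_pred - K @ H @ P_pred                                  # (3.21b)
    return x, P
```

Note that the covariance is formed once, from the final gain and Jacobian, exactly as in (3.21b).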
The IEKF assumes that each step reduces the cost (3.19),

    V(x_{i+1}) \leq V(x_i), \quad (3.22)

without evaluating whether that was the case. Rewriting the update (3.20c) as

    x_{i+1} = \hat{x}_{t|t-1} + K_i \left(y_t - h(x_i) - H_i(\hat{x}_{t|t-1} - x_i)\right)
            = x_i + \hat{x}_{t|t-1} - x_i + K_i \left(y_t - h(x_i) - H_i(\hat{x}_{t|t-1} - x_i)\right), \quad (3.23)

a step control parameter α can be introduced to ensure cost reduction. The modified measurement update of the IEKF is then of the form

    x_{i+1} = x_i + \alpha_i \left(\hat{x}_{t|t-1} - x_i + K_i \left(y_t - h(x_i) - H_i(\hat{x}_{t|t-1} - x_i)\right)\right) \quad (3.24)
            = x_i + \alpha_i p_i, \quad (3.25)

where the step length 0 < \alpha_i \leq 1 and the search direction p_i are chosen such that (3.22) is satisfied in each step. We call this approach IEKF-L. More advanced line search strategies that employ conditions on the curvature for sufficient decrease could of course be used. The IEKF with and without line search is illustrated in Example 3.2.

Example 3.2: Bearings-Only Tracking and IEKF
An applied example of IEKF bearings-only tracking is studied, which can also be found in Gustafsson (2012) as Example 8.1. For simplicity, the target is stationary, i.e., w_t = 0 and Q = 0, at the true position x* = [1.5, 1.5]^T. The bearing measurement from the j-th sensor S^j at time t is

    y_t^j = h^j(x_t) + e_t^j = \mathrm{arctan2}(y_t - S_y^j, x_t - S_x^j) + e_t^j, \quad (3.26)

where arctan2(·,·) is the two-argument arctangent function and S_y^j and S_x^j denote the y and x coordinates of the sensors, respectively. The noise is e_t \sim \mathcal{N}(0, R_t). The Jacobians of the dynamics and measurements are

    F = I, \quad H^j = \frac{1}{(x - S_x^j)^2 + (y - S_y^j)^2} \begin{bmatrix} -(y - S_y^j) & x - S_x^j \end{bmatrix}. \quad (3.27)

The two sensors have positions S^1 = [0, 0]^T and S^2 = [1.5, 0]^T. The filter is initialised with \hat{x}_{0|0} = [1, 1]^T and P_{0|0} = I_2. In Figure 3.1 the IEKF is compared with the EKF for a single realisation with e_t \sim \mathcal{N}(0, \pi 10^{-3} I_2) in the left plot, and with ten noise-free measurements and the same assumed covariance in the right plot.
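The measurement function (3.26) and its Jacobian (3.27) translate directly into code, and the analytic Jacobian can be checked against finite differences. A minimal sketch (function names mine):

```python
import numpy as np

def bearing_measurement(x, S):
    """Bearing from sensor position S to target position x, equation (3.26)."""
    return np.arctan2(x[1] - S[1], x[0] - S[0])

def bearing_jacobian(x, S):
    """Jacobian (3.27) of the bearing w.r.t. the target position x = [x, y]."""
    dx, dy = x[0] - S[0], x[1] - S[1]
    r2 = dx ** 2 + dy ** 2          # squared range to the target
    return np.array([-dy / r2, dx / r2])
```

These two callbacks are exactly what the EKF/IEKF updates above need as h and H_jac for this example.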
If this artificial measurement covariance is decreased by a few orders of magnitude, the EKF covariance does not capture the true uncertainty, but the IEKF covariance does. For this simple example 10 iterations are used, since by then the IEKF has converged and the result clearly improves the estimate.

Figure 3.1: Left: The IEKF is iterated 10 times and comes closer to the true state, with a covariance that captures the uncertainty. Right: 10 noise-free measurements are given and the IEKF, again iterated 10 times, converges to the true position much faster than the EKF.

However, it may be sufficient to perform fewer iterations to speed up execution. The cost function for this example is nearly convex close to the true target. This means that if we start close enough, it should be safe to apply the IEKF without any modifications. It was, however, also verified that the actual cost decreased in each step. In the left plot in Figure 3.2 the target is in the same location as before, but the initial guess is \hat{x}_{0|0} = [0.2, 1.5]^T. Given one measurement from each sensor, the EKF performs one update and the IEKF performs 3 iterations. It is obvious that both the target location and the covariance estimate are much better when using the IEKF. Next, we start with the initial guess \hat{x}_{0|0} = [0.3, 0.1]^T and P_{0|0} = 2I_2, where the curvature of the cost function is a bit more difficult. The EKF and IEKF-L are given perfect measurements but with assumed covariances Q = R = \pi 10^{-3} I_2. The position estimate is shown in the right plot in Figure 3.2. Note that the IEKF-L covariance is consistent, as opposed to the EKF covariance.
The rapid convergence of the IEKF-L is not surprising, since the variable step is based on cost decrease at each iteration. More iterations do not give any significant improvement and the steps get short.

Figure 3.2: Left: The target estimate and the covariance are much better using the IEKF. The dashed lines correspond to the iterated estimates in the IEKF. Note that the first step in the IEKF is the same as the EKF. The level curves illustrate the NLS cost with a minimum at the true target position. Right: The EKF has poor performance and under-estimates the covariance, while the IEKF-L gives good results by step size reduction and just 3 iterations.

3.2.2 Nonlinear Least Squares

Nonlinear least squares problems are obtained when the objective function is a sum of squared errors, as in the MAP smoothing problem (3.6) or the constrained form (3.7a). Popular solvers are Gauss-Newton and Levenberg-Marquardt, which approximate the nonlinear functions by appropriate linear functions. The goal in NLS is to estimate parameters, θ, by minimising a parametrised residual vector, \varepsilon(\theta) \sim \mathcal{N}(0, \Sigma_\theta). For notational convenience the residuals are normalised according to their assumed covariance,

    \tilde{\varepsilon}(\theta) = \Sigma_\theta^{-T/2} \varepsilon(\theta), \quad (3.28)

where \Sigma_\theta^{-T/2} denotes the transposed matrix square-root. The Mahalanobis notation can then be dropped, since \|\varepsilon(\theta)\|^2_{\Sigma_\theta^{-1}} = (\Sigma_\theta^{-T/2}\varepsilon(\theta))^T (\Sigma_\theta^{-T/2}\varepsilon(\theta)) = \|\tilde{\varepsilon}(\theta)\|^2_2. The residuals are said to be minimised in a least-squares sense by the cost

    V(\theta) = \frac{1}{2} \sum_{t=1}^{N} \|\varepsilon_t(\theta)\|^2_2 = \frac{1}{2} \sum_{t=1}^{N} \varepsilon_t^T(\theta)\, \varepsilon_t(\theta). \quad (3.29)

Algorithm 2 Gauss-Newton
1. Require an initial estimate θ and the Jacobian J(θ).
2. Compute a search direction p by solving
    J(\theta) J(\theta)^T p = -J(\theta)\varepsilon(\theta). \quad (3.33)
3.
Compute a step length, α, such that the cost (3.29) is decreased. This can be done using line search, see e.g., Nocedal and Wright (2006, page 297) or Boyd and Vandenberghe (2004, page 464).
4. Update the parameters,
    \theta := \theta + \alpha p. \quad (3.34)
5. Terminate if a stopping criterion is met. Such criteria can be that the change in cost is small, or that the number of iterations has exceeded some threshold, among others.
6. Otherwise, return to step 2.

Define \varepsilon(\theta) = [\varepsilon_1(\theta)\ \varepsilon_2(\theta)\ \ldots\ \varepsilon_N(\theta)]^T and the Jacobian J(\theta) = \frac{d\varepsilon^T(\theta)}{d\theta}; then the gradient and the Hessian of (3.29) with respect to the parameters are given by

    \nabla V(\theta) = \frac{dV(\theta)}{d\theta} = \sum_{t=1}^{N} \frac{d\varepsilon_t^T(\theta)}{d\theta}\, \varepsilon_t(\theta) = J(\theta)\varepsilon(\theta), \quad (3.30)

and

    \frac{d^2 V(\theta)}{d\theta^2} = J(\theta)J(\theta)^T + \sum_{t=1}^{N} \varepsilon_t(\theta)\, \frac{d^2 \varepsilon_t^T(\theta)}{d\theta^2}, \quad (3.31)

respectively. The extension to multivariable residuals is easily obtained by stacking the vectorisation of the individual residuals, which again gives a scalar cost function. The Gauss-Newton method can be seen as a modified Newton method which applies only to NLS problems. It is computationally cheap, since the Hessian of the objective function is approximated as

    \frac{d^2 V(\theta)}{d\theta^2} \approx J(\theta)J(\theta)^T, \quad (3.32)

thus there is no need for second order derivatives. The approximation is good when the initial θ is close to the optimum, but it may be poor if some residuals are large. An option is then to include the second order terms in (3.31) or approximate them with some secant method. The Gauss-Newton method is summarised in Algorithm 2.

The Gauss-Newton method may encounter problems if the Jacobian is singular or ill-conditioned. A straightforward remedy is given by the Levenberg-Marquardt algorithm, which modifies the normal equations as

    (J J^T + \mu I)\, p = -J \varepsilon, \quad (3.35)

where µ is a positive number and the θ dependence has been omitted. The µ parameter acts as an interpolation between the Gauss-Newton method and gradient descent.
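Algorithm 2 can be sketched compactly in code; setting mu > 0 below also gives the damped normal equations (3.35), so the same routine covers both Gauss-Newton and a fixed-damping Levenberg-Marquardt step. This is a minimal sketch with names of my choosing, using the thesis convention J(θ) = dε^T(θ)/dθ:

```python
import numpy as np

def gauss_newton(theta, residual, jacobian, mu=0.0, n_iter=50, tol=1e-10):
    """Gauss-Newton with backtracking line search (Algorithm 2).

    residual(theta) returns eps with shape (N,); jacobian(theta) returns
    J = d eps^T / d theta with shape (n_params, N), so the normal
    equations read J J^T p = -J eps (3.33). mu > 0 adds the
    Levenberg-Marquardt damping of (3.35).
    """
    cost = lambda th: 0.5 * residual(th) @ residual(th)   # cost (3.29)
    for _ in range(n_iter):
        eps, J = residual(theta), jacobian(theta)
        # Search direction from the (possibly damped) normal equations.
        p = np.linalg.solve(J @ J.T + mu * np.eye(len(theta)), -J @ eps)
        alpha = 1.0
        while cost(theta + alpha * p) > cost(theta) and alpha > 1e-8:
            alpha *= 0.5                      # backtracking step control
        theta = theta + alpha * p             # update (3.34)
        if np.linalg.norm(alpha * p) < tol:   # stopping criterion
            break
    return theta
```

For a linear residual the normal equations are exact and the method converges in a single full step; the damping only matters when J is ill-conditioned or the initial point is far from the optimum.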
This can be seen by noticing that Gauss-Newton corresponds to µ = 0, while for large µ the step will be approximately in the gradient descent direction, which is typically the preferred option if the initial solution is far from the optimum. The parameter is adaptively updated by some method and should typically decrease as the minimum is approached. Another modification, suggested by Marquardt (1963), is to take larger steps in the directions in which the gradient is small by solving

    \left(J J^T + \mu\, \mathrm{diag}\{J J^T\}\right) p = -J \varepsilon, \quad (3.36)

and thus speed up convergence.

Note that in practice parameters are often subject to constraints, for instance: unit quaternions should have unit length; physical landmarks need to be in front of the camera (Hartley, 1998); motion is constrained by motion models. When parameters are updated by a simple increment, such constraints may be violated. It is therefore important to have a stable local parametrisation for the update, followed by some procedure ensuring that the parameters do not violate the constraints, for instance normalisation of an updated quaternion. Such an approach is used in Paper D, since the gyroscope rates are natural parameters for the local update and the unit quaternion for the global rotation parametrisation.

3.3 Problem Structure

For problems with many parameters, the key to efficiency is to utilise the specific structure of each problem in the corresponding Jacobian and Hessian approximation. Although the normal equations can be solved efficiently by numerical matrix factorisations, such as QR, LDL and SVD, this may not be a good option if the factorisation needs to be updated. For large problems even the computation of the normal equations may be infeasible due to, e.g., memory or time constraints. Explicit, rather than numerical, factorisations are then a good option. A well-known trick for efficient solution of equation systems is to use the Schur complement, and an example of this is given below.
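Before the worked example, the Schur-complement elimination can be sketched numerically on a small dense system (function name mine). In bundle adjustment the lower-right block would be block diagonal, so its inverse is cheap; here a dense inverse is used for clarity only:

```python
import numpy as np

def schur_solve(Hcc, Hcm, Hmc, Hmm, bc, bm):
    """Solve the partitioned system
        [Hcc Hcm][pc]   [bc]
        [Hmc Hmm][pm] = [bm]
    via the Schur complement of Hmm."""
    Hmm_inv = np.linalg.inv(Hmm)            # cheap when Hmm is block diagonal
    S = Hcc - Hcm @ Hmm_inv @ Hmc           # reduced (camera-sized) system
    pc = np.linalg.solve(S, bc - Hcm @ Hmm_inv @ bm)
    pm = Hmm_inv @ (bm - Hmc @ pc)          # back-substitution
    return pc, pm
```

With bc = -J_c ε_c and bm = -J_m ε_m this reproduces the two-stage solve of the example below, where the small reduced system is solved first and the landmark increments follow by back-substitution.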
Example 3.3: Schur Complement
In batch problems such as NLS-SLAM and bundle adjustment (BA) there is a natural sparsity in the Jacobian, and thus in the Hessian approximation. Partition the normal equations (3.33) as

    \begin{bmatrix} J_c J_c^T & J_c J_m^T \\ J_m J_c^T & J_m J_m^T \end{bmatrix} \begin{bmatrix} p_c \\ p_m \end{bmatrix} = -\begin{bmatrix} J_c \varepsilon_c \\ J_m \varepsilon_m \end{bmatrix} \iff \begin{bmatrix} H_{cc} & H_{cm} \\ H_{mc} & H_{mm} \end{bmatrix} \begin{bmatrix} p_c \\ p_m \end{bmatrix} = -\begin{bmatrix} J_c \varepsilon_c \\ J_m \varepsilon_m \end{bmatrix}, \quad (3.37)

where J_c and J_m are the Jacobians w.r.t. the camera poses and the landmarks, respectively. The primary sparsity structure in BA comes from the fact that each landmark observation only depends on one camera pose, which means that H_{cc} and H_{mm} are block diagonal. Solving (3.37) for p_c and p_m can be done efficiently using the Schur complement of, e.g., H_{mm} as

    \left(H_{cc} - H_{cm} H_{mm}^{-1} H_{mc}\right) p_c = -J_c \varepsilon_c + H_{cm} H_{mm}^{-1} J_m \varepsilon_m, \quad (3.38a)
    p_m = H_{mm}^{-1} \left(-J_m \varepsilon_m - H_{mc} p_c\right), \quad (3.38b)

where the much smaller reduced system (3.38a) is solved first and p_c is back-substituted into (3.38b). Since H_{mm} is block diagonal, its inverse is cheap to compute. Furthermore, it is often the case that not all landmarks are observed all the time, and this gives rise to a secondary sparsity structure in the H_{cm} matrix which can be exploited.

For very large problems the normal equations may be solved by quasi-Newton methods such as L-BFGS (Nocedal, 1980), which efficiently approximate the inverse Hessian. Other approaches are conjugate gradients, or, in a distributed fashion, the alternating direction method of multipliers (ADMM) (Boyd et al., 2011). These methods converge slowly, yet they may be the only feasible option for large systems where direct methods are not suitable. In general there are few, if any, guarantees that the global optimum will be found by any nonlinear program. However, careful selection of the initial starting point greatly improves the chance of reaching the optimum.

4 SLAM

In this chapter a brief overview of some SLAM estimation methods is given. More thorough descriptions are given in the three appended SLAM publications.
4.1 Introduction

SLAM problems are combinations of localisation and mapping problems solved simultaneously. There are mainly two strategies, based either on filtering or on batch optimisation. The early research almost exclusively focused on filtering methods, which recursively incorporate the measurements to estimate the posterior filtering density p(x_t, m \mid y_{1:t}), and the list of filtering-based acronyms is long. In summary, many of them use either particle filters (Montemerlo et al., 2002) or extended Kalman filters (Smith et al., 1990). Figure 4.1 illustrates a SLAM setup with a moving platform observing landmark features from different locations.

4.1.1 Probabilistic Models

The target in SLAM is to maximise either the posterior density of the complete trajectory and map,

    p(x_{0:t}, m \mid y_{0:t}), \quad (4.1)

or the filtering density,

    p(x_t, m \mid y_{0:t}), \quad (4.2)

which is obtained by marginalising old states. Here x_{0:t} is the whole state trajectory, x_t is the current state, m is a static map and y_{0:t} are the measurements relating to the state and the map. For notational convenience, correspondence variables and input signals are not made explicit.

Figure 4.1: A moving platform with body coordinate system (b) is observing an environment represented by point landmarks, m_1, \ldots, m_8. Also, a global, fixed navigation coordinate system, (n), is drawn.

The SLAM posterior (4.1) can for instance be factorised as

    p(x_{0:t}, m \mid y_{0:t}) \propto p(x_{0:t})\, p(y_{0:t}^{x} \mid x_{0:t})\, p(y_{0:t}^{x,m} \mid x_{0:t}, m), \quad (4.3)

where the first factor is the process model, the second factor is the likelihood of the measurements independent of the map, e.g., GPS or IMU measurements, and the third is the likelihood of the measurements that depend on both the map and the process. From (4.3) the filtering and smoothing forms are straightforwardly obtained.
As was shown in Section 3.1.1, the smoothing density leads to an NLS problem if the noise sources are Gaussian, which is the common assumption in most algorithms. The batch formulation is the target in GraphSLAM (Thrun and Montemerlo, 2006; Thrun et al., 2005). Similar to FastSLAM (Montemerlo et al., 2002, 2003), the posterior is factorised as

    p(x_{0:t}, m \mid y_{0:t}) = p(x_{0:t} \mid y_{0:t})\, p(m \mid x_{0:t}, y_{0:t}), \quad (4.4)

where p(x_{0:t} \mid y_{0:t}) is the posterior over trajectories. This density is obtained by marginalising the landmark parameters, which introduces links, relative pose constraints, between any two poses measuring the same landmark, and the result is a pose graph. This is in close relation to the reduced system in (3.38a) obtained using the Schur complement. GraphSLAM maintains a pose graph as a Gaussian with mean and covariance, but only computes the conditional mean of the map using the factorisation

    p(m \mid x_{0:t}, y_{0:t}) = \prod_{i=1}^{N_m} p(m_i \mid x_{0:t}, y_{0:t}), \quad (4.5)

which is the same approach as in FastSLAM. This means that each landmark can be treated independently given the true trajectory, thus avoiding keeping track of the full correlation structure of the map, which is the main drawback of EKF-SLAM.

4.2 EKF-SLAM

EKF-SLAM is probably the most common SLAM method, and it is often straightforward to implement, as described in e.g., Durrant-Whyte and Bailey (2006); Smith et al. (1990). For a thorough treatment, the book by Thrun et al. (2005) serves as a standard reference. In feature based SLAM, coordinates in the global frame are explicitly represented as landmarks, m, which are part of the state vector. The standard assumption is that the landmarks are stationary, but dynamic objects can naturally be included in the state vector (Bibby and Reid, 2007). Assume that measurements arrive at the same rate as the dynamic model. Then the process, landmark and measurement models are given by

    x_t = f(x_{t-1}) + w_t, \quad (4.6a)
    m_t = m_{t-1}, \quad (4.6b)
    y_t = h(x_t, m_t) + e_t. \quad (4.6c)
The EKF given in Algorithm 1 applies to (4.6) with just a few modifications. The prediction step in EKF-SLAM is given by

    \hat{x}_{t|t-1} = f(\hat{x}_{t-1|t-1}), \quad (4.7a)
    \hat{m}_{t|t-1} = \hat{m}_{t-1|t-1}, \quad (4.7b)
    P^{xx}_{t|t-1} = F_t P^{xx}_{t-1|t-1} F_t^T + Q, \quad (4.7c)
    P^{xm}_{t|t-1} = F_t P^{xm}_{t-1|t-1}, \quad (4.7d)
    P^{mx}_{t|t-1} = P^{mx}_{t-1|t-1} F_t^T, \quad (4.7e)

where

    F_t \triangleq \frac{\partial f(x_{t-1})}{\partial x_{t-1}}\Big|_{x_{t-1} = \hat{x}_{t-1|t-1}}. \quad (4.7f)

Note that only the vehicle state covariance and the cross terms are updated, while the map mean and covariance remain unchanged. The full covariance matrix is

    P = \begin{bmatrix} P^{xx} & P^{xm} \\ P^{mx} & P^{mm} \end{bmatrix}. \quad (4.8)

When new landmarks are initialised they are appended to the state vector, but the vehicle state dimension stays the same. If the map estimate is only used locally for vehicle state estimation, i.e., odometry, then old landmarks can be removed from the filter. The measurement update for EKF-SLAM is given by

    K_t = P_{t|t-1} H_t^T \left(H_t P_{t|t-1} H_t^T + R_t\right)^{-1}, \quad (4.9a)
    \begin{bmatrix} \hat{x}_{t|t} \\ \hat{m}_{t|t} \end{bmatrix} = \begin{bmatrix} \hat{x}_{t|t-1} \\ \hat{m}_{t|t-1} \end{bmatrix} + K_t \left(y_t - h(\hat{x}_{t|t-1}, \hat{m}_{t|t-1})\right), \quad (4.9b)
    P_{t|t} = P_{t|t-1} - K_t H_t P_{t|t-1}, \quad (4.9c)

where

    H_t \triangleq \begin{bmatrix} \frac{\partial h(x_t, m_t)}{\partial x_t} & \frac{\partial h(x_t, m_t)}{\partial m_t} \end{bmatrix}\Big|_{(x_t, m_t) = (\hat{x}_{t|t-1}, \hat{m}_{t|t-1})}. \quad (4.10)

The measurement Jacobian (4.10) is often rather sparse, since the sensor will typically only observe a part of the landmark state at each time instant, and an efficient implementation exploits this structure. Since the measurements are assumed independent, the measurement update can be processed sequentially, avoiding the need to invert a large matrix in the Kalman gain computation (4.9a). In Paper B, EKF-SLAM is used for initialisation of the trajectory and map, which makes it limited to small problems. In Figure 4.2 the horizontal speed estimates from EKF-SLAM and NLS-SLAM are shown.

Figure 4.2: The smoothed horizontal speed of the camera in red and the EKF estimate in blue.
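The sparsity-exploiting prediction step (4.7) can be made explicit in code: only the vehicle block and the cross-covariances are touched, while the map block is left alone. A minimal sketch (names mine):

```python
import numpy as np

def ekf_slam_predict(x, m, P, f, F_jac, Q):
    """EKF-SLAM time update (4.7): the map mean m and the map covariance
    block P_mm are untouched; only the vehicle block and the
    cross-covariances change."""
    n = len(x)                        # vehicle state dimension
    F = F_jac(x)
    x_pred = f(x)                                 # (4.7a)
    P_pred = P.copy()
    P_pred[:n, :n] = F @ P[:n, :n] @ F.T + Q      # (4.7c) vehicle block
    P_pred[:n, n:] = F @ P[:n, n:]                # (4.7d) cross terms
    P_pred[n:, :n] = P[n:, :n] @ F.T              # (4.7e) cross terms
    return x_pred, m, P_pred                      # (4.7b): m unchanged
```

For a map with thousands of landmarks this is the difference between an O(n^2) and an O(n_map^2) prediction, which is one reason EKF-SLAM predictions remain cheap even when measurement updates do not.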
The true speed is 0.1 m/s, except when the robot stops and changes direction, which happens at about 4 seconds and 6 seconds.

The EKF-SLAM framework also applies to sensor network calibration, as in Paper A. Obviously there is then no need for correspondence search, since the sensor identities (landmarks) are known and the map size is also fixed. As was pointed out in Section 2.4, measurements are here obtained in the sensor node magnetometers, assuming a known magnetic dipole model of the survey vessel. This is the reverse of the measurement relation in the ordinary SLAM concept.

4.3 Batch SLAM

Batch methods, also known as Full SLAM, have recently come to dominate SLAM research, both offline and as a sub-system in online applications. Some beneficial properties of batch methods are:

• Loop closing, and thus drift compensation, is easier.
• The effect of linearisation errors can be reduced through iterative refinement.
• Efficient optimisation routines can be utilised.
• Data association decisions can easily be undone.
• The inherent primary sparsity is utilised.
• The complete map correlation structure does not have to (but can) be computed.

In both filtering and batch SLAM applications the system unavoidably grows with the explored space, since more memory is needed to store the map and possibly historical motion estimates. However, in most realistic scenarios only a few parameters are affected by the measurements, and therefore only small parts of the system, or its factorisation, need to be updated. A concrete example is square-root smoothing and mapping (SAM) (Dellaert and Kaess, 2006), which solves the Full SLAM problem incrementally in real-time. Efficient solutions to batch formulations utilise the inherent sparsity in the Jacobian J or, equivalently, in the associated information (inverse covariance) matrix \mathcal{I} = J J^T.
In contrast, the filter covariance, and likewise the filter information matrix, is full, since past vehicle states are marginalised, as shown in Paskin (2003). The batch sparsity ideas are exploited in a SLAM context by (Kaess et al., 2008; Grisetti et al., 2011; Thrun and Montemerlo, 2006; Frese, 2005; Paskin, 2003) and many others. Furthermore, Dellaert and Kaess make no special distinction between SAM and BA, since both are often treated as NLS problems in batch form.

4.3.1 Graphs

Recently, optimisation based batch methods on graphs, and especially pose graphs (Lu and Milios, 1997), have attracted much interest and will therefore be given some attention here as well. The GraphSLAM algorithm, proposed by (Thrun and Montemerlo, 2006; Thrun et al., 2005), is such a method. It is assumed that the data association is known, and there is no general mechanism in GraphSLAM for treating false associations once the graph is constructed. The graph interpretation is another way of describing dependencies among parameters, as in the Jacobian and the Hessian. Examples of two SLAM graphs are shown in Figure 4.3.

Figure 4.3: Left: Structure of a Jacobian from the NLS-SLAM problem in Paper B. The upper left diagonal block corresponds to the process derivatives, the lower left block corresponds to the derivatives of the camera measurements w.r.t. the IMU/camera states, and the lower right block contains the derivatives of the camera measurements w.r.t. the landmark parameters. Right: Jacobian matrix from the NLS-SLAM problem in Paper D. The structure is more complicated since there are three blocks of measurements (rows) corresponding to the accelerometer measurements, camera measurements and gyroscope measurements, respectively. The parameters (columns) are velocity, acceleration, landmarks, acceleration bias, gyroscope bias, and angular velocity, respectively.
For instance, the information matrix associated with the Full SLAM posterior corresponds to the graph of a Markov random field (Thrun et al., 2004). The various graph representations are primarily used to gain insight into the underlying inference problem and to enable efficient solution strategies. For most scenarios the corresponding graph contains loops, since some places may be visited several times. Inference on the graph can therefore only be done approximately, e.g., with the sum-product algorithm or "loopy" belief propagation. Junction trees eliminate these cycles by clustering them into single nodes (Paskin, 2003), leaving a graph which is a tree, on which exact belief propagation is possible.

The Full SLAM and BA graphs correspond to matrices with known structure when the correspondences are fixed. In the tutorial on graph-based SLAM (Grisetti et al., 2011) a clear distinction is made between the sensor-specific data association used to construct the graph, which they call the front-end, and the strategies used to optimise the graph, which are referred to as the back-end. We will here refer to the front-end as the initialisation.

4.3.2 Initialisation

Depending on the application, the initialisation can be anything from cheap to very computationally expensive. Pose graphs are intuitive to use when the sensor model is invertible, which is the case for ground robots with laser range scanners in 2D or 3D. By pair-wise matching of the laser scans, e.g., using the iterative closest point (ICP) algorithm, see for instance Besl and McKay (1992), local relative poses are obtained. The pose graph nodes are then the trajectory of robot poses and the edges contain observations and odometry. There are also no natural 3D-3D point correspondences when using laser measurements only, since the measurements do not provide information around the point. Initialisation in visual applications typically consists of feature tracking and essential matrix estimation within a RANSAC loop.
Figure 4.4: Iterative correspondence search and graph optimisation. Raw data is used for graph construction (initialisation), which produces node positions and measurements; graph optimisation (NLS) returns updated node positions and, finally, the SLAM estimate.

There are few (if any) visual methods that use pose graphs only. For instance, Kim and Eustice (2009); Eade et al. (2010) use view-based matching of whole images combined with odometry, and the relative poses obtained are used to construct a pose graph. Both of them use view-based recognition based on matching entire frames to each other, similar to ICP, and thus do not perform any local tracking. Strasdat et al. (2011) combine a sub-mapping BA approach, much like PTAM (Klein and Murray, 2007), with a pose graph connecting the sub-maps in a separate optimisation thread. It is good to incorporate an iterative correspondence search in the optimisation, as illustrated in Figure 4.4, since associations may become too unlikely by some metric once the optimisation has started. One such error metric for visual tracking is the bi-directional error (Kalal et al., 2010), which is the difference between the image coordinate of the forward-track and the end image coordinate of the backward-track (reversed point tracking), and which should be small for well-localised features (Hedborg et al., 2011, 2012). This idea is similar to what is used in Paper D, which iterates between state estimation and correspondence search. Along the same lines, Dellaert et al. (2003) employ simultaneous SFM and correspondence estimation using EM to learn 3D models, while Pfingsthorn and Birk (2012) devise a multimodal pose graph model, optimising correspondences using Gaussian mixtures, which admits corrections for poor initialisation.

4.3.3 NLS-SLAM

In state-space formulations of SLAM the map is included in the state vector without process noise,

    x_t = f(x_{t-1}) + w_t, \quad (4.11a)
    m_t = m_{t-1}, \quad (4.11b)
    y_t = h(x_t, m_t) + e_t. \quad (4.11c)
This formulation is, however, equivalent to

    x_t = f(x_{t-1}) + w_t, \quad (4.12a)
    y_t = h(x_t, m) + e_t, \quad (4.12b)

meaning that the map can be excluded from the state vector and simply viewed as a parameter. The idea in NLS-SLAM is to use an initial estimate of the map and the whole state sequence x_{0:t}, and then minimise all the measurement errors and the state trajectory errors as the NLS problem

    \{\hat{x}_{0:t}, \hat{m}\} = \arg\min_{x_{0:t},\, m} \|\bar{x}_0 - x_0\|^2_{P^{-1}} + \sum_{i=1}^{t} \|x_i - f(x_{i-1})\|^2_{Q^{-1}} + \sum_{i=1}^{t} \|y_i - h(x_i, m)\|^2_{R^{-1}}. \quad (4.13)

In Paper D, the dynamic model is considered exact, i.e., it is just used to define a static map for the parameter space and is left out of the optimisation. Then the following NLS problem is obtained,

    \{\hat{x}_{0:t}, \hat{m}\} = \arg\min_{x_{0:t},\, m} \sum_{i=0}^{t} \|y_i^m - h^m(x_i, m)\|^2_{R_m^{-1}} + \|y_i^a - h^a(x_i)\|^2_{R_a^{-1}} + \|y_i^\omega - h^\omega(x_i)\|^2_{R_\omega^{-1}}, \quad (4.14)

where h^m denotes the direct parametrisation (2.47) for the camera, h^a denotes the accelerometer measurement function and h^\omega denotes the gyroscope measurement function. Apart from the IMU measurements, this is a standard BA formulation. Since both the camera and accelerometer measurement functions are nonlinear w.r.t. the state and the map, a good initial value is needed. It is the sole purpose of Paper D to show how such an initial point can be obtained through a sequence of almost linear steps, with only SIFT features and IMU data as input.

4.3.4 EM-SLAM

In Paper E the map is also considered a static parameter, θ := m, according to

    x_t = f(x_{t-1}, u_t, w_t), \quad (4.15a)
    y_t = h_t(x_t, \theta) + e_t, \quad (4.15b)

where the IMU is input to the motion model. In an ML setting, the joint likelihood of the measurements and the states, parametrised by the map, is

    p_\theta(y_{1:t}, x_{1:t}) = \prod_{i=1}^{t} p_\theta(y_i \mid x_i)\, p(x_i \mid x_{i-1}), \quad (4.16)

where the state trajectory is considered a latent variable. This density can be maximised using expectation-maximisation (Dempster et al., 1977) by solving two coupled, but hopefully easier, problems iteratively.
This is done by computing the expected value of the logarithm of (4.16),

    Q(θ, θ_k) = \mathrm{E}_{θ_k}\Big[ \log \prod_{i=1}^{t} p_θ(y_i | x_i)\, p(x_i | x_{i-1}) \,\Big|\, y_{1:t} \Big]    (4.17)
              = -\sum_{i=1}^{t} \mathrm{E}_{θ_k}\Big[ \tfrac{1}{2} \| y_i - h_i(x_i, θ) \|^2_{R_m^{-1}} \,\Big|\, y_{1:t} \Big] + \text{const.},    (4.18)

where the measurement errors are assumed Gaussian. The motion model and other terms independent of the map are lumped into the constant. The expected value cannot be obtained in closed form; instead we approximate (4.17) as

    Q(θ, θ_k) ≈ -\tfrac{1}{2} \sum_{i=1}^{t} \Big( \| y_i - h_i(\hat{x}_{i|t}, θ) \|^2_{R_m^{-1}} + \operatorname{Tr}\big( R_m^{-1} \nabla_x h_i(\hat{x}_{i|t}, θ)\, P^s_{i|t}\, (\nabla_x h_i(\hat{x}_{i|t}, θ))^T \big) \Big),    (4.19)

where \nabla_x denotes the Jacobian w.r.t. x, \hat{x}_{i|t} is the smoothed estimate of the latent variable and P^s_{i|t} is its covariance. The smoothed estimate is obtained with an E-RTS smoother. The trace term compensates for the use of the estimated latent variables instead of the true ones. The Q(θ, θ_k) function is then maximised w.r.t. the map θ using the quasi-Newton method BFGS, see e.g. Nocedal and Wright (2006), as further explained in Section 3.2 in Paper E. To start the EM-SLAM iterations an initial estimate of the map and the state is needed; it is here obtained using the results in Paper D. It should be noted that the splitting of batch SLAM into state estimation and mapping is also the key in GraphSLAM and FastSLAM, as discussed in Section 4.1.1.

4.4 IMU and Camera

Fusion of IMU and monocular vision information is conceptually straightforward but can be a challenging task in practice. A nice introduction to this sensor setup is given in Corke et al. (2007), and the thorough exposition in Hol (2011) is also recommended. The complementary characteristics of these sensors are attractive since the unbounded error growth in position and orientation from IMU integration can be corrected using the camera.
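The complementarity can be illustrated with a minimal variance-propagation sketch. The numbers and the 1D setup are assumptions, not from the thesis: IMU prediction steps inflate the position variance without bound, while an occasional camera-derived position fix (a scalar Kalman update) keeps it bounded:

```python
# Hypothetical 1D illustration of IMU/camera complementarity: pure IMU
# integration lets the position variance grow without bound, while
# occasional camera position fixes keep it bounded.
q = 0.01      # variance added per IMU integration step (assumed)
r_cam = 0.25  # variance of a camera-derived position fix (assumed)

def propagate_variance(steps, p0=0.0, camera_every=None):
    """Propagate position variance; optionally apply camera updates."""
    p = p0
    for k in range(1, steps + 1):
        p = p + q                        # IMU prediction inflates variance
        if camera_every and k % camera_every == 0:
            p = p * r_cam / (p + r_cam)  # scalar Kalman measurement update
    return p

p_imu_only = propagate_variance(1000)
p_fused = propagate_variance(1000, camera_every=10)
```

With these numbers the IMU-only variance grows linearly to 10, while the fused variance settles at roughly 0.1: the camera bounds the drift, and the IMU bridges the gaps between frames.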
Figure 4.5: Bearings-only SLAM example illustrating the depth ambiguity. (Both panels plot landmarks, trajectory, estimated landmarks and estimated trajectory in the x-y plane, in metres.) A similar example can also be found in Bailey (2003), but here only the initial velocity 0.25 m/s of the platform is known. Left: landmarks are initialised at approximately the true range and the resulting estimate is consistent. Right: the landmarks are initialised at two times the true range and the estimated trajectory scales accordingly, giving a biased estimate. This corresponds to a platform velocity estimate of 0.5 m/s.

A calibrated stereo camera gives the ability to directly measure 3D coordinates of landmarks in metric space. However, the baseline between the cameras in a stereo rig is physically limited, so the depth resolution is often limited to close-range scenarios since the range resolution itself decreases with range. Motion-stereo does not have this limitation, since the baseline is created arbitrarily by motion. Obvious downsides of motion-stereo are the need to recover the pose, e.g., using epipolar geometry, and the inability to estimate the scale, which is due to the depth ambiguity illustrated in Figure 4.5. A comparison of stereo and monocular vision based SLAM approaches is presented in Lemaire et al. (2007), and the introductory chapters in Chli (2009) give a nice overview of vision based SLAM in general. Apart from SLAM applications, the combination of an IMU and a monocular camera can be used for:

• IMU supported structure from motion (SFM);
• vision supported inertial navigation, i.e., ego-motion estimation;
• loosely or tightly coupled SFM;
• camera-IMU fusion for augmented reality; and
• image stabilisation.
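The stereo range-resolution limitation mentioned above follows directly from the disparity model and can be quantified. The focal length, baseline and disparity noise below are hypothetical numbers chosen only for illustration:

```python
# Why stereo depth resolution degrades with range: depth from disparity
# is z = f*b/d, so a disparity error sigma_d maps, to first order, to a
# depth error of roughly sigma_z = z**2 * sigma_d / (f * b).
f = 500.0      # focal length [pixels] (assumed)
b = 0.12       # stereo baseline [m] (assumed)
sigma_d = 0.5  # disparity measurement noise [pixels] (assumed)

def depth_sigma(z):
    """First-order depth standard deviation at range z [m]."""
    return z ** 2 * sigma_d / (f * b)
```

Doubling the range quadruples the depth uncertainty, so a fixed-baseline rig is effectively a close-range sensor; motion-stereo escapes this by growing the baseline, at the price of the scale ambiguity discussed above.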
In the context of SLAM, observability is a non-negligible issue; some examples are the case of constant velocity (Bryson and Sukkarieh, 2008), motion along the optical axis, or landmarks that are far away. Short-term unobservability is not problematic in filtering problems but may lead to complete failure in batch solutions due to rank deficiency. A simple means of avoiding some of these problems is to use measurement counters. It is, however, non-trivial to characterise the unobservable subspace in SLAM completely, as the substantial number of publications on this issue attests; see e.g. (Lupton and Sukkarieh, 2012; Hesch et al., 2013; Perera et al., 2009; Kim and Sukkarieh, 2004; Andrade-Cetto and Sanfeliu, 2005; Lee et al., 2006; Wang and Dissanayake, 2008) and the many references therein. Observability is also highlighted in the vision-aided IMU systems of Mourikis and Roumeliotis (2007); Mourikis et al. (2009); Li and Mourikis (2013, 2012a); Dong-Si and Mourikis (2012); Li and Mourikis (2012b), which operate without explicit landmark parametrisation and instead keep old poses in a sliding window, expressed as constraints through shared observations.

4.4.1 Linear Triangulation

A monocular camera in combination with an IMU can be used to estimate the pose and to recover the metric scale, a topic which has been studied by Martinelli (2012); Nützi et al. (2011); Kneip et al. (2011) and others. It was shown in Martinelli (2012) that given measurements of a landmark from five distinct vantage points with known rotation, it is possible to recover the landmark position, the camera position and velocity, and the accelerometer bias in closed form. These results are based on well-known linear formulations of the direct parametrisation (2.48), but without considering noise. This formulation was used in Paper D including noise terms and will now be explained.
Camera coordinates and the pinhole projection are given by

    m^c = R^{cw}(m^w - c^w_t),    (4.20a)
    P(m^c) = \frac{1}{z^c} [x^c, y^c]^T,    (4.20b)

which can also be written in linear form. With normalised measurements [u, v]^T we have

    z^c [u, v]^T = [x^c, y^c]^T,    (4.21)

which is linear in the unknown m^c, and also in m^w and c^w_t if the rotation is known. We can solve (4.21) w.r.t. the camera and landmark coordinates with enough measurements available. However, for each new 2D camera measurement there is also another unknown 3D camera position. A solution is to use approximate positions calculated from the accelerometer. This results in a linear system in the unknown m^w, and we need at least two camera measurements from distinct vantage points to find a solution. In this linear formulation, the camera measurement noise does not reflect depth uncertainty in a proper manner, since we have

    z^c [u, v]^T = [x^c, y^c]^T + z^c [e_u, e_v]^T  \iff  z^c [u - e_u, v - e_v]^T = [x^c, y^c]^T,    (4.22)

meaning that the noise scales with the depth parameter. This can be treated using iteratively reweighted least squares (IRWLS), see e.g. Björck (1996), by updating the camera measurement covariance weighting matrix at each iterate using \hat{z}^c from the previous iterate. The key is that we bypass solving the nonlinear reprojection error minimisation problem by solving a simple linear problem. This method is algebraic in the sense that it does not consider minimisation of the measurement errors, since the projective model is not directly used to form the error. However, the linear solution also minimises the re-projection error if the correct weighting is used (Zhang and Kanade, 1998; Hartley and Zisserman, 2004), which here corresponds to \hat{z}^c being sufficiently close to its true value. This section finishes with an example of IRWLS triangulation, inspired by the paper by Hartley and Sturm (1997).
Example 4.1: IRWLS Triangulation
Consider two 3×4 camera matrices, P and P', with known rotation and translation, both measuring a 3D point m. The two cameras and the point are then related by

    λ [u, v, 1]^T = P [m^T, 1]^T = P x,    (4.23a)
    λ' [u', v', 1]^T = P' [m^T, 1]^T = P' x.    (4.23b)

The λ's are unknown scalars accounting for the projective ambiguity. They can be eliminated using the last rows of the two equation systems. After some algebra the unknown 3D point can be expressed as a linear system A x = 0, where A = [p_1 - u p_3, p_2 - v p_3, p'_1 - u' p'_3, p'_2 - v' p'_3]^T is a 4×4 matrix and the vectors p_i, p'_i, i = 1, 2, 3, are the rows of the camera matrices. This equation system is solved by computing the SVD of A, from which the point is recovered from the singular vector [\hat{m}^T, 1]^T (scaled so that its last element equals one) corresponding to the smallest singular value σ. More measurements can be added as rows in A to further suppress noise. The minimum of ||Ax|| does not have a geometric meaning. At the minimum there will be an error ε = (u p_3^T - p_1^T) x. The re-projection error of the measured image coordinate u and x is given by ε' = u - p_1^T x / p_3^T x = ε / p_3^T x. Thus, if the first row of A were multiplied with w = 1 / p_3^T x, and similarly for the other rows, the minimum of the linear method would correspond to that of the nonlinear re-projection error. This is not possible to do directly since it requires x to be known. Instead, we can iterate the linear procedure starting with unit weights, then use the weights from the previous iteration, and hopefully converge to a solution with an error close to the re-projection error. The algorithm can, for instance, be terminated when the change in the weights is small. The setup is the following: the point is located at m* = [1, 1, 2]^T, the first camera P is at the origin, the second camera P' is translated to [0, 1, 30]^T, and both are without rotation. Each camera receives 10 measurements subject to noise e ~ N(0, 0.01 I_2).
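The procedure just described can be sketched in a few lines of numpy. The geometry and noise level below are hypothetical (a different camera placement than in the example, chosen so the two depths differ and the reweighting is non-trivial):

```python
import numpy as np

def project(P, m):
    """Project 3D point m with a 3x4 camera matrix P (normalised coords)."""
    ph = P @ np.append(m, 1.0)
    return ph[:2] / ph[2]

def triangulate_irwls(Ps, uvs, iters=3):
    """DLT triangulation with iterative reweighting, Example 4.1 style."""
    rows = []
    for P, (u, v) in zip(Ps, uvs):
        rows.append(P[0] - u * P[2])   # p_1 - u p_3
        rows.append(P[1] - v * P[2])   # p_2 - v p_3
    A = np.array(rows)
    w = np.ones(len(A))                # first pass: unit weights
    for _ in range(iters):
        _, _, Vt = np.linalg.svd(w[:, None] * A)
        x = Vt[-1]
        m = x[:3] / x[3]               # scale last element to one
        # reweight each camera's row pair by 1 / (p_3 . x), i.e. 1 / depth
        w = np.repeat([1.0 / (P[2] @ np.append(m, 1.0)) for P in Ps], 2)
    return m

rng = np.random.default_rng(0)
# Hypothetical two-camera setup: both without rotation, the second
# offset in x and z so that the two viewing depths differ.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.5], [0.0], [2.0]])])
m_true = np.array([1.0, 1.0, 4.0])
uvs = [project(P, m_true) + 0.001 * rng.standard_normal(2) for P in (P1, P2)]
m_hat = triangulate_irwls([P1, P2], uvs)
```

More cameras (or repeated measurements) are added simply as further row pairs in A, exactly as noted above.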
Results for M = 100 simulations and 3 iterations are shown in Table 4.1, where it is clear that the iterations give an improvement.

Table 4.1: Iteratively reweighted least squares triangulation.

    Iteration                                          0        1        2        3
    Average RMSE, (1/M) \sum_{i=1}^{M} ||m̂_i - m*||   0.0274   0.0203   0.0202   0.0202

The strength of this method is that it is simple to compute and no special initialisation is needed, contrary to nonlinear optimisation methods. A disadvantage is that the method may fail to converge (Hartley and Sturm, 1997) if, for example, the baseline is very short, which results in a poorly conditioned system.

5 Concluding remarks

This chapter ends the first part of the thesis, which consisted of background material. The work in the whole thesis is summarised here, with conclusions of the publications in Part II and suggested directions for future work. For detailed conclusions and future work, the reader is referred to each of the appended publications.

5.1 Conclusions

The common denominator for the problems studied in this thesis is that they deal with various aspects of navigation and mapping in unknown environments. Calibration of sensor networks is important for the network's detection and tracking capabilities. Underwater sensor positions can be difficult to obtain in fast deployment scenarios, and sensors can also move due to currents or a non-rigid seabed. An automatic and inexpensive EKF-SLAM method for underwater sensor network positioning without the need for GNSS was presented. Using only magnetic sensors and a vessel with known magnetic signature, the sensor positions and the vessel's route were determined. The expected performance of the method and the network was studied using sensitivity and CRLB analysis on simulated data. This analysis could also be used for sensor network design. ROVs cannot utilise GNSS for localisation because these signals are greatly attenuated in water.
Due to their limited payload capacity, and in order to keep a competitive price, the onboard navigation sensors are relatively cheap. We showed that fusion of a complex hydrodynamic model of the ROV with onboard sensor data can improve the navigation performance and the robustness to sensor failure. Experimental results from sea trials showed that in particular the vehicle speed can be accurately estimated. The combination of inertial sensors and an optical camera can be used for both navigation and mapping purposes. Experiments and simulations indicated that metrically correct estimates can be obtained. The structure of the batch SLAM problem was exploited and solved with NLS and EM, utilising efficient optimisation routines. The batch methods require a good initial estimate to avoid convergence to a local minimum. A multistage initialisation procedure for batch SLAM was proposed, where a series of almost uncoupled and simple problems are solved.

5.2 Future Work

Underwater sensor networks, and sensor networks in general, are important for monitoring and surveillance. The simulated setup in Paper A can certainly benefit from experimental validation, which, however, could be rather expensive. A natural extension is to try NLS-SLAM minimisation of the whole data batch to find out if the results can be improved. Increased navigation performance and robustness w.r.t. disturbances is always desirable. The ROV model in Paper C is used in conjunction with the onboard sensors for this task. The model for rotational dynamics can be improved, perhaps by using speed-dependent damping and stronger coupling in the inertia matrix. These, and other, ideas should first be explored using simulations and then preferably in controlled tank tests. SLAM in general is a mature and diversified field of research. In the near future, perhaps the gap between the robotics community and the computer vision community can be shortened.
For this to happen, more work on the similarities between the two areas is needed. For instance, it is not that common to see model-based filters in computer vision applications. System analysis concepts such as observability, controllability and robustness are often difficult to apply to SLAM systems directly. It would therefore be useful to have easy-to-use tools for SLAM system analysis to guide design and evaluation on a general basis. For the specific application of SLAM with IMU and vision, more work on performance bounds, along the line of Nyqvist and Gustafsson (2013), is desirable. The coupling between filtering and optimisation is an interesting area for more research. Perhaps the convergence rate of Rao-Blackwellized particle filters could be improved by treating some of the nonlinearities using IEKF-L. Iterated smoothers, such as moving horizon estimation, further give the possibility to handle constraints in a systematic manner.

Bibliography

J. Andrade-Cetto and A. Sanfeliu. The effects of partial observability when building fully correlated maps. IEEE Transactions on Robotics, 21(4):771–777, Aug. 2005. ISSN 1552-3098. doi: 10.1109/TRO.2004.842342.

M. Axholt, M. Skoglund, S. D. Peterson, M. D. Cooper, T. B. Schön, F. Gustafsson, A. Ynnerman, and S. R. Ellis. Optical see-through head mounted display direct linear transformation calibration robustness in the presence of user alignment noise. In Proceedings of the 54th Annual Meeting of the Human Factors and Ergonomics Society, volume 54, pages 2427–2431, San Francisco, CA, USA, 27-1 Sept./Oct. 2010. Human Factors and Ergonomics Society.

M. Axholt, M. A. Skoglund, S. D. O'Connell, M. D. Cooper, S. R. Ellis, and A. Ynnerman. Parameter estimation variance of the single point active alignment method in optical see-through head mounted display calibration. In Proceedings of the IEEE Virtual Reality Conference, pages 27–34, Singapore, Republic of Singapore, Mar. 2011.

T. Bailey.
Constrained initialisation for bearing-only SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), volume 2, pages 1966–1971, Taipei, Taiwan, 14-19 Sept. 2003. doi: 10.1109/ROBOT.2003.1241882.

Y. Bar-Shalom and X.-R. Li. Estimation and Tracking: Principles, Techniques and Software. Artech House, Boston, 1993. ISBN 0-89006-643-4.

B. Bell and F. Cathey. The iterated Kalman filter update as a Gauss-Newton method. IEEE Transactions on Automatic Control, 38(2):294–297, Feb. 1993. ISSN 0018-9286. doi: 10.1109/9.250476.

D. Bertsekas. Incremental least squares methods and the extended Kalman filter. In Proceedings of the 33rd IEEE Conference on Decision and Control, volume 2, pages 1211–1214, Lake Buena Vista, Florida, USA, 14-16 Dec. 1994. doi: 10.1109/CDC.1994.411166.

P. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):239–256, Feb. 1992. ISSN 0162-8828. doi: 10.1109/34.121791.

C. Bibby and I. Reid. Simultaneous localisation and mapping in dynamic environments (SLAMIDE) with reversible data association. In Proceedings of Robotics: Science and Systems, Atlanta, GA, USA, 27-30 June 2007.

Å. Björck. Numerical Methods for Least Squares Problems. SIAM, 1996. ISBN 0-89871-360-9.

M. Bosse and R. Zlot. Map matching and data association for large-scale two-dimensional laser scan-based SLAM. International Journal of Robotics Research, 27(6):667–691, 2008.

J.-Y. Bouguet. Camera Calibration Toolbox for Matlab. www.vision.caltech.edu/bouguetj/calib_doc/, 2010.

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. URL http://www.stanford.edu/~boyd/bv_cvxbook.pdf.

S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, Jan. 2011. ISSN 1935-8237.
doi: 10.1561/2200000016. URL http://dx.doi.org/10.1561/2200000016.

K. Britting. Inertial Navigation Systems Analysis. John Wiley & Sons Inc., New York, USA, 1971.

M. Bryson and S. Sukkarieh. Observability analysis and active control for airborne SLAM. IEEE Transactions on Aerospace and Electronic Systems, 44(1):261–280, 2008. ISSN 0018-9251. doi: 10.1109/TAES.2008.4517003.

M. Bryson and S. Sukkarieh. Architectures for cooperative airborne simultaneous localisation and mapping. Journal of Intelligent and Robotic Systems, 55(4-5):267–297, 2009. doi: 10.1007/s10846-008-9303-9. URL http://dx.doi.org/10.1007/s10846-008-9303-9.

M. Bryson, A. Reid, F. Ramos, and S. Sukkarieh. Airborne vision-based mapping and classification of large farmland environments. Journal of Field Robotics, 27(5):632–655, Sept. 2010. ISSN 1556-4959. doi: 10.1002/rob.v27:5. URL http://dx.doi.org/10.1002/rob.v27:5.

F. Caballero, L. Merino, J. Ferruz, and A. Ollero. Vision-based odometry and SLAM for medium and high altitude flying UAVs. Journal of Intelligent and Robotic Systems, 54(1-3):137–161, 2009. ISSN 0921-0296. doi: 10.1007/s10846-008-9257-y.

J. Callmer, K. Granström, J. Nieto, and F. Ramos. Tree of words for visual loop closure detection in urban SLAM. In J. Kim and R. Mahony, editors, Proceedings of the 2008 Australasian Conference on Robotics & Automation, Canberra, Australia, 3-5 Dec. 2008.

J. Callmer, M. Skoglund, and F. Gustafsson. Silent localization of underwater sensors using magnetometers. EURASIP Journal on Advances in Signal Processing, 2010, 2010. doi: 10.1155/2010/709318. URL http://dx.doi.org/10.1155/2010/709318. Article ID 709318.

J. Callmer, D. Törnqvist, F. Gustafsson, H. Svensson, and P. Carlbom. RADAR SLAM using visual features. EURASIP Journal on Advances in Signal Processing, 2011(71), 2011.

M. Chli. Applying Information Theory to Efficient SLAM. PhD thesis, Imperial College London, 2009.

J. Choi, S. Ahn, and W. K. Chung.
Robust sonar feature detection for the SLAM of mobile robot. In Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems, pages 3415–3420, Edmonton, Alberta, Canada, 2-6 Aug. 2005. doi: 10.1109/IROS.2005.1545284.

P. Corke, J. Lobo, and J. Dias. An introduction to inertial and visual sensing. The International Journal of Robotics Research, 26:519–535, 2007.

M. Cummins and P. Newman. Appearance-only SLAM at large scale with FAB-MAP 2.0. The International Journal of Robotics Research, 2010. doi: 10.1177/0278364910385483. URL http://ijr.sagepub.com/content/early/2010/11/11/0278364910385483.abstract.

A. J. Davison. Real-time simultaneous localisation and mapping with a single camera. In Proceedings of the 9th IEEE International Conference on Computer Vision, pages 1403–1410, Nice, France, 13-16 Oct. 2003.

A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):1052–1067, 2007.

F. Dellaert and M. Kaess. Square Root SAM: Simultaneous localization and mapping via square root information smoothing. International Journal of Robotics Research (IJRR), 25(12):1181–1203, 2006. URL http://www.cc.gatech.edu/~dellaert/pubs/Dellaert06ijrr.pdf. Special issue on RSS 2006.

F. Dellaert, S. Seitz, C. Thorpe, and S. Thrun. EM, MCMC, and chain flipping for structure from motion with unknown correspondence. Machine Learning, 50(1-2):45–71, 2003.

A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38, 1977. ISSN 0035-9246. doi: 10.2307/2984875. URL http://web.mit.edu/6.435/www/Dempster77.pdf.

J. Dennis and R. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice Hall, 1983.

T. Dong-Si and A. I. Mourikis. Initialization in vision-aided inertial navigation with unknown camera-IMU calibration.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1064–1071, Vilamoura, Portugal, Oct. 2012.

H. Durrant-Whyte and T. Bailey. Simultaneous localization and mapping: Part I. IEEE Robotics and Automation Magazine, 13(2):99–110, June 2006.

E. Eade. Monocular Simultaneous Localisation and Mapping. PhD thesis, Cambridge University, 2008.

E. Eade, P. Fong, and M. Munich. Monocular graph SLAM with complexity reduction. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, 18-22 Oct. 2010.

R. Eustice, H. Singh, J. Leonard, and M. Walter. Visually mapping the RMS Titanic: Conservative covariance estimates for SLAM information filters. International Journal of Robotics Research, 25(12):1223–1242, Dec. 2006.

M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, June 1981. ISSN 0001-0782. doi: 10.1145/358669.358692. URL http://doi.acm.org/10.1145/358669.358692.

R. A. Fisher. On an absolute criterion for fitting frequency curves. Messenger of Mathematics, 41:155–160, 1912.

J. Fossel, D. Hennes, D. Claes, S. Alers, and K. Tuyls. OctoSLAM: A 3D mapping approach to situational awareness of unmanned aerial vehicles. In Proceedings of the International Conference on Unmanned Aircraft Systems (ICUAS), pages 179–188, May 2013. doi: 10.1109/ICUAS.2013.6564688.

T. I. Fossen. Handbook of Marine Craft Hydrodynamics and Motion Control. John Wiley & Sons Ltd., 2011. ISBN 978-1-1199-9149-6.

F. Fraundorfer and D. Scaramuzza. Visual odometry: Part II: Matching, robustness, optimization, and applications. IEEE Robotics and Automation Magazine, 19(2):78–90, 2012. ISSN 1070-9932. doi: 10.1109/MRA.2012.2182810.

F. Gerossier, P. Checchin, C. Blanc, R. Chapuis, and L. Trassoudaine.
Trajectory-oriented EKF-SLAM using the Fourier-Mellin transform applied to microwave radar images. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 4925–4930, St. Louis, MO, USA, 11-15 Oct. 2009. doi: 10.1109/IROS.2009.5354548.

G. Grisetti, R. Kümmerle, C. Stachniss, and W. Burgard. A tutorial on graph-based SLAM. IEEE Intelligent Transportation Systems Magazine, 2(4):31–43, 2011. ISSN 1939-1390. doi: 10.1109/MITS.2010.939925.

F. Gustafsson. Statistical Sensor Fusion. Utbildningshuset/Studentlitteratur, 2012. ISBN 9789144077321.

W. R. Hamilton. On quaternions; or on a new system of imaginaries in algebra. Philosophical Magazine, xxv:10–13, July 1844.

J. Han and J. Kim. Navigation of an unmanned surface vessel under bridges. In Proceedings of the International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), pages 206–210, Jeju, Korea, 30-2 Oct./Nov. 2013. doi: 10.1109/URAI.2013.6677343.

C. Harris and M. Stephens. A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, pages 147–151, Manchester, UK, 1988.

R. I. Hartley. Chirality. International Journal of Computer Vision, 26(1):41–61, 1998.

R. I. Hartley and P. Sturm. Triangulation. Computer Vision and Image Understanding, 68(2):146–157, 1997. ISSN 1077-3142. doi: 10.1006/cviu.1997.0547. URL http://www.sciencedirect.com/science/article/pii/S1077314297905476.

R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, second edition, 2004. ISBN 0-521-54051-8.

T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer-Verlag, 2nd edition, 2009.

J. Hedborg, E. Ringaby, P.-E. Forssén, and M. Felsberg. Structure and motion estimation from rolling shutter video. In IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pages 17–23, Barcelona, Spain, Nov. 2011. doi: 10.1109/ICCVW.2011.6130217.

J. Hedborg, P.-E. Forssén, M. Felsberg, and E. Ringaby.
Rolling shutter bundle adjustment. In IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA, June 2012. ISBN 978-1-4673-1227-1. URL http://dx.doi.org/10.1109/CVPR.2012.6247831.

J. A. Hesch, D. G. Kottas, S. L. Bowman, and S. I. Roumeliotis. Camera-IMU-based localization: Observability analysis and consistency improvement. The International Journal of Robotics Research, 2013. doi: 10.1177/0278364913509675. URL http://ijr.sagepub.com/content/early/2013/11/13/0278364913509675.abstract.

J. Hol. Sensor Fusion and Calibration of Inertial Sensors, Vision, Ultra-Wideband and GPS. Linköping studies in science and technology. Dissertations. No. 1368, Linköping University, The Institute of Technology, June 2011.

H. Hopf. Systeme symmetrischer Bilinearformen und euklidische Modelle der projektiven Räume. Vierteljahrsschrift der Naturforschenden Gesellschaft in Zürich, 85 (Beiblatt, Festschrift Rudolf Fueter):165–177, 1940.

A. H. Jazwinski. Stochastic Processes and Filtering Theory, volume 64 of Mathematics in Science and Engineering. Academic Press, Inc., 1970.

K. Jönsson. Position Estimation of Remotely Operated Underwater Vehicle. Master's thesis, Linköping University, 2010.

I.-K. Jung and S. Lacroix. Simultaneous localization and mapping with stereovision. In Proceedings of the Eleventh International Symposium on Robotics Research, volume 15 of Springer Tracts in Advanced Robotics, pages 315–324, Siena, Italy, 19-22 Oct. 2003. Springer. ISBN 978-3-540-23214-8.

S.-H. Jung and C. Taylor. Camera trajectory estimation using inertial sensor measurements and structure from motion results. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages II-732–II-737, 2001. doi: 10.1109/CVPR.2001.991037.

T. Kailath, A. H. Sayed, and B. Hassibi. Linear Estimation. Prentice-Hall, Upper Saddle River, New Jersey, 2000.

Z. Kalal, K. Mikolajczyk, and J. Matas.
Forward-backward error: Automatic detection of tracking failures. In Proceedings of the International Conference on Pattern Recognition (ICPR), pages 2756–2759, Istanbul, Turkey, 23-26 Aug. 2010. doi: 10.1109/ICPR.2010.675.

A. Karlsson and J. Bjärkefur. Simultaneous Localisation and Mapping of Indoor Environments Using a Stereo Camera and a Laser Camera. Master's thesis, Department of Electrical Engineering, Linköping University, Linköping, Sweden, 2010.

R. Karlsson, T. Schön, D. Törnqvist, G. Conte, and F. Gustafsson. Utilizing model structure for efficient simultaneous localization and mapping for a UAV application. In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, USA, Mar. 2008.

A. Kim and R. Eustice. Pose-graph visual SLAM with geometric model selection for autonomous underwater ship hull inspection. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1559–1565, St. Louis, MO, USA, 11-15 Oct. 2009. doi: 10.1109/IROS.2009.5354132.

J. Kim and S. Sukkarieh. Improving the real-time efficiency of inertial SLAM and understanding its observability. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 1, pages 21–26, Sendai, Japan, 28-2 Sept./Oct. 2004. doi: 10.1109/IROS.2004.1389323.

A. Kim and R. Eustice. Real-time visual SLAM for autonomous underwater hull inspection using visual saliency. IEEE Transactions on Robotics, 29(3):719–733, June 2013. ISSN 1552-3098. doi: 10.1109/TRO.2012.2235699.

G. Klein and D. Murray. Parallel tracking and mapping for small AR workspaces. In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), pages 225–234, Nara, Japan, 2007.

L. Kneip, A. Martinelli, S. Weiss, D. Scaramuzza, and R. Siegwart. Closed-form solution for absolute scale velocity determination combining inertial measurements and a single feature correspondence.
In Proceedings of the International Conference on Robotics and Automation, Shanghai, China, May 2011. URL http://hal.inria.fr/hal-00641772.

K. Konolige and M. Agrawal. FrameSLAM: From bundle adjustment to real-time visual mapping. IEEE Transactions on Robotics, 24(5):1066–1077, 2008.

J. Kuipers. Quaternions and Rotation Sequences. Princeton University Press, 2002.

K. W. Lee, W. Wijesoma, and I. Javier. On the observability and observability analysis of SLAM. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3569–3574, Beijing, China, 9-15 Oct. 2006. doi: 10.1109/IROS.2006.281646.

T. Lemaire, C. Berger, I.-K. Jung, and S. Lacroix. Vision-based SLAM: Stereo and monocular approaches. International Journal of Computer Vision, 74(3):343–364, Sept. 2007. ISSN 0920-5691. doi: 10.1007/s11263-007-0042-3. URL http://dx.doi.org/10.1007/s11263-007-0042-3.

H. Li and R. Hartley. Five-point motion estimation made easy. In Proceedings of the International Conference on Pattern Recognition, volume 1, pages 630–633, Hong Kong, 20-24 Aug. 2006. doi: 10.1109/ICPR.2006.579.

M. Li and A. I. Mourikis. Improving the accuracy of EKF-based visual-inertial odometry. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 828–835, Minneapolis, MN, May 2012a.

M. Li and A. I. Mourikis. Optimization-based estimator design for vision-aided inertial navigation. In Proceedings of Robotics: Science and Systems, Sydney, Australia, July 2012b.

M. Li and A. I. Mourikis. High-precision, consistent EKF-based visual-inertial odometry. International Journal of Robotics Research, 32(6):690–711, May 2013. ISSN 0278-3649. doi: 10.1177/0278364913481251. URL http://dx.doi.org/10.1177/0278364913481251.

H. Longuet-Higgins. A computer algorithm for reconstructing a scene from two projections. Nature, 293:133–135, Sept. 1981.

D. Lowe. Object recognition from local scale-invariant features.
In Proceedings of the Seventh International Conference on Computer Vision (ICCV'99), pages 1150–1157, Corfu, Greece, 1999.

F. Lu and E. Milios. Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4:333–349, 1997.

C. Lundquist, M. A. Skoglund, K. Granström, and T. Glad. Insights from implementing a system for peer review. IEEE Transactions on Education, 56(3):261–267, 2013.

T. Lupton and S. Sukkarieh. Removing scale biases and ambiguity from 6DoF monocular SLAM using inertial. In Proceedings of the International Conference on Robotics and Automation (ICRA), pages 3698–3703, Pasadena, California, USA, 2008. IEEE. doi: 10.1109/ROBOT.2008.4543778. URL http://dx.doi.org/10.1109/ROBOT.2008.4543778.

T. Lupton and S. Sukkarieh. Efficient integration of inertial observations into visual SLAM without initialization. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1547–1552, St. Louis, MO, USA, 11-15 Oct. 2009. IEEE. doi: 10.1109/IROS.2009.5354267. URL http://dx.doi.org/10.1109/IROS.2009.5354267.

T. Lupton and S. Sukkarieh. Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions. IEEE Transactions on Robotics, 28(1):61–76, Feb. 2012. ISSN 1552-3098. doi: 10.1109/TRO.2011.2170332.

M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental smoothing and mapping. IEEE Transactions on Robotics, 24(6):1365–1378, Dec. 2008.

Y. Ma, S. Soatto, J. Kosecka, and S. S. Sastry. An Invitation to 3-D Vision: From Images to Geometric Models. Springer Verlag, 2003. ISBN 0387008934.

I. Mahon, S. B. Williams, O. Pizarro, and M. Johnson-Roberson. Efficient view-based SLAM using visual loop closures. IEEE Transactions on Robotics, 24(5):1002–1014, 2008.

J. Marck, A. Mohamoud, E. v.d. Houwen, and R. van Heijster. Indoor radar SLAM: a radar application for vision and GPS denied environments.
In Proceedings of the European Radar Conference (EuRAD).

D. W. Marquardt. An algorithm for least-squares estimation of nonlinear parameters. SIAM Journal on Applied Mathematics, 11(2):431–441, 1963. doi: 10.1137/0111030. URL http://dx.doi.org/10.1137/0111030.

A. Martinelli. Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination. IEEE Transactions on Robotics, 28(1):44–60, Feb. 2012. ISSN 1552-3098. doi: 10.1109/TRO.2011.2160468.

J. Mattingley and S. Boyd. Real-time convex optimization in signal processing. IEEE Signal Processing Magazine, 27(3):50–61, 2010. ISSN 1053-5888. doi: 10.1109/MSP.2010.936020.

M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM: A factored solution to the simultaneous localization and mapping problem. In Proceedings of the AAAI National Conference on Artificial Intelligence, pages 593–598, Edmonton, Alberta, Canada, 28-1 July/Aug. 2002.

M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI), Acapulco, Mexico, 9-15 Aug. 2003.

J. Montiel and A. Davison. A visual compass based on SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 1917–1922, Orlando, Florida, USA, May 2006. doi: 10.1109/ROBOT.2006.1641986.

A. Mourikis and S. Roumeliotis. A multi-state constraint Kalman filter for vision-aided inertial navigation. In Proceedings of the IEEE International Conference on Robotics and Automation, Roma, Italy, 10-14 Apr. 2007.

A. I. Mourikis, N. Trawny, S. I. Roumeliotis, A. E. Johnson, A. Ansar, and L. Matthies. Vision-aided inertial navigation for spacecraft entry, descent, and landing. IEEE Transactions on Robotics, 25(2):264–280, 2009. ISSN 1552-3098.
URL https://lt.ltag.bibl.liu.se/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=38716223&site=ehost-live.

J. Mullane, S. Keller, A. Rao, M. Adams, A. Yeo, F. Hover, and N. Patrikalakis. X-band radar based SLAM in Singapore's off-shore environment. In Proceedings of the International Conference on Control, Automation, Robotics and Vision (ICARCV), Singapore, 7-10 Dec. 2010.

P. M. Newman, J. J. Leonard, and R. J. Rikoski. Towards constant-time SLAM on an autonomous underwater vehicle using synthetic aperture sonar. In Proceedings of the Eleventh International Symposium on Robotics Research, pages 409–420, Siena, Italy, 19-22 Oct. 2003. Springer Verlag.

E. Nilsson. An optimization based approach to visual odometry using infrared images. Master's thesis, Department of Electrical Engineering, Linköping University, Linköping, Sweden, 2010.

E. Nilsson, C. Lundquist, T. B. Schön, D. Forslund, and J. Roll. Vehicle motion estimation using an infrared camera. In Proceedings of the IFAC World Congress, Milan, Italy, Aug./Sept. 2011.

D. Nistér. Reconstruction from uncalibrated sequences with a hierarchy of trifocal tensors. In Proceedings of the 6th European Conference on Computer Vision, Part I, ECCV '00, pages 649–663, London, UK, 2000. Springer-Verlag. ISBN 3-540-67685-6. URL http://dl.acm.org/citation.cfm?id=645313.649415.

D. Nistér. An efficient solution to the five-point relative pose problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):756–770, 2004. ISSN 0162-8828. doi: 10.1109/TPAMI.2004.17.

J. Nocedal. Updating quasi-Newton matrices with limited storage. Mathematics of Computation, 35(151):773–782, 1980. URL http://www.jstor.org/stable/2006193.

J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York, 2nd edition, 2006. ISBN 978-0-387-40065-5.

G. Nützi, S. Weiss, D. Scaramuzza, and R. Siegwart. Fusion of IMU and vision for absolute scale estimation in monocular SLAM.
Journal of Intelligent and Robotic Systems, 61:287–299, Jan. 2011. ISSN 0921-0296. doi: 10.1007/s10846-010-9490-z. URL http://dx.doi.org/10.1007/s10846-010-9490-z.

H. Nyqvist and F. Gustafsson. A high-performance tracking system based on camera and IMU. In Proceedings of the 16th International Conference on Information Fusion, Istanbul, Turkey, 9-12 July 2013.

M. Paskin. Thin junction tree filters for simultaneous localization and mapping. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), pages 1157–1164, San Francisco, CA, 2003.

L. Perera, A. Melkumyan, and E. Nettleton. On the linear and nonlinear observability analysis of the SLAM problem. In Proceedings of the IEEE International Conference on Mechatronics, pages 1–6, Málaga, Spain, 14-17 Apr. 2009. doi: 10.1109/ICMECH.2009.4957168.

M. Pfingsthorn and A. Birk. Simultaneous localization and mapping (SLAM) with multimodal probability distributions. The International Journal of Robotics Research, 2012. doi: 10.1177/0278364912461540. URL http://ijr.sagepub.com/content/early/2012/10/05/0278364912461540.abstract.

P. Pinies, T. Lupton, S. Sukkarieh, and J. Tardos. Inertial aiding of inverse depth SLAM using a monocular camera. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2797–2802, Roma, Italy, 10-14 Apr. 2007.

C. Rao. Moving Horizon Strategies for the Constrained Monitoring and Control of Nonlinear Discrete-Time Systems. PhD thesis, University of Wisconsin-Madison, 2000.

H. E. Rauch, F. Tung, and C. T. Striebel. Maximum likelihood estimates of linear dynamic systems. AIAA Journal, 3(8):1445–1450, 1965.

D. Ribas, P. Ridao, J. Neira, and J. Tardos. SLAM using an imaging sonar for partially structured underwater environments. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9-15 Oct. 2006.

D. Scaramuzza and F. Fraundorfer. Visual odometry: Part I: The first 30 years and fundamentals.
IEEE Robotics and Automation Magazine, 18(4):80–92, 2011. ISSN 1070-9932. doi: 10.1109/MRA.2011.943233.

D. Scaramuzza, A. Martinelli, and R. Siegwart. A toolbox for easily calibrating omnidirectional cameras. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9-15 Oct. 2006.

J. Shi and C. Tomasi. Good features to track. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 593–600, Seattle, WA, USA, 21-23 June 1994. doi: 10.1109/CVPR.1994.323794.

M. D. Shuster. Survey of attitude representations. Journal of the Astronautical Sciences, 41(4):439–517, 1993.

M. D. Shuster. Constraint in attitude estimation part II: Unconstrained estimation. The Journal of the Astronautical Sciences, 51(1):75–101, Jan.–Mar. 2003.

Z. Sjanic and F. Gustafsson. Simultaneous navigation and SAR auto-focusing. In Proceedings of the International Conference on Information Fusion (FUSION), pages 1–7, Edinburgh, UK, 26-29 July 2010.

Z. Sjanic, M. A. Skoglund, F. Gustafsson, and T. B. Schön. A nonlinear least squares approach to the SLAM problem. In Proceedings of the IFAC World Congress, volume 18, Milan, Italy, 28-2 Aug./Sept. 2011.

Z. Sjanic, M. A. Skoglund, and F. Gustafsson. EM-SLAM with inertial/visual applications. Submitted to IEEE Transactions on Aerospace and Electronic Systems, June 2014.

M. A. Skoglund, K. Jönsson, and F. Gustafsson. Modeling and sensor fusion of a remotely operated underwater vehicle. In Proceedings of the 15th International Conference on Information Fusion (FUSION), pages 947–954, Singapore, 9-12 July 2012. IEEE.

M. A. Skoglund, Z. Sjanic, and F. Gustafsson. Initialisation and estimation methods for batch optimisation of inertial/visual SLAM. Submitted to IEEE Transactions on Aerospace and Electronic Systems, June 2014.

R. Smith, M. Self, and P. Cheeseman. Estimating uncertain spatial relationships in robotics.
In Autonomous Robot Vehicles, pages 167–193. Springer-Verlag New York, Inc., New York, NY, USA, 1990. ISBN 0-387-97240-4.

H. Strasdat, A. Davison, J. M. M. Montiel, and K. Konolige. Double window optimisation for constant time visual SLAM. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, 6-13 Nov. 2011.

J. Stuelpnagel. On the parametrization of the three-dimensional rotation group. SIAM Review, 6(4):422–430, 1964. ISSN 0036-1445. URL http://www.jstor.org/stable/2027966.

T. Thormählen, N. Hasler, M. Wand, and H.-P. Seidel. Merging of feature tracks for camera motion estimation from video. In Proceedings of the 5th European Conference on Visual Media Production (CVMP 2008), pages 1–8, Hertfordshire, UK, Jan. 2008. IET.

S. Thrun and M. Montemerlo. The GraphSLAM algorithm with applications to large-scale mapping of urban structures. International Journal of Robotics Research, 25(5):403–430, 2006.

S. Thrun, Y. Liu, D. Koller, A. Ng, Z. Ghahramani, and H. Durrant-Whyte. Simultaneous localization and mapping with sparse extended information filters. International Journal of Robotics Research, 23(7/8), 2004.

S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics (Intelligent Robotics and Autonomous Agents). The MIT Press, 2005. ISBN 0262201623.

R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society (Series B), 58:267–288, 1996.

D. Titterton and J. Weston. Strapdown Inertial Navigation Technology. IEE Radar, Sonar, Navigation and Avionics Series. Peter Peregrinus Ltd., Stevenage, UK, 1997. ISBN 0863413587.

D. Törnqvist. Estimation and Detection with Applications to Navigation. Linköping Studies in Science and Technology. Dissertations. No. 1216, Linköping University, Nov. 2008.

P. Torr and A. Zisserman. MLESAC: A new robust estimator with application to estimating image geometry. Computer Vision and Image Understanding, 78(1):138–156, 2000.
ISSN 1077-3142. doi: 10.1006/cviu.1999.0832. URL http://www.sciencedirect.com/science/article/pii/S1077314299908329.

U. Frese. Treemap: An O(log n) algorithm for simultaneous localization and mapping. In C. Freksa, editor, Spatial Cognition IV, pages 455–476. Springer Verlag, 2005. URL http://www.informatik.uni-bremen.de/~ufrese/slamvideos1_e.html.

N. Wahlström. Localization using Magnetometers and Light Sensors. Linköping Studies in Science and Technology. Thesis No. 1581, Linköping University, Sweden, 2013.

Z. Wang and G. Dissanayake. Observability analysis of SLAM using Fisher information matrix. In Proceedings of the International Conference on Control, Automation, Robotics and Vision (ICARCV), pages 1242–1247, Hanoi, Vietnam, 17-20 Dec. 2008. doi: 10.1109/ICARCV.2008.4795699.

Z. Wang and G. Dissanayake. Exploiting vehicle motion information in monocular SLAM. In Proceedings of the International Conference on Control, Automation, Robotics and Vision (ICARCV), pages 1030–1035, Guangzhou, China, 5-7 Dec. 2012. doi: 10.1109/ICARCV.2012.6485299.

W. Wijesoma, L. Perera, and M. Adams. Toward multidimensional assignment data association in robot localization and mapping. IEEE Transactions on Robotics, 22(2):350–365, Apr. 2006. ISSN 1552-3098. doi: 10.1109/TRO.2006.870634.

K. Wrobel. SLAM-based approach to dynamic ship positioning. TransNav, the International Journal on Marine Navigation and Safety of Sea Transportation, 8(1):21–25, 2014.

Z. Zhang. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330–1334, Nov. 2000. ISSN 0162-8828. doi: 10.1109/34.888718.

Z. Zhang and T. Kanade. Determining the epipolar geometry and its uncertainty: A review. International Journal of Computer Vision, 27:161–195, 1998.

Part II

Publications

The articles associated with this thesis have been removed for copyright reasons.
For more details about these, see:
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-110373

P. Pucar: Modeling and segmentation using multiple models. Thesis No. 405, 1995. ISBN 91-7871-627-6.
H. Fortell: Algebraic approaches to normal forms and zero dynamics. Thesis No. 407, 1995. ISBN 91-7871-629-2.
A. Helmersson: Methods for robust gain scheduling. Thesis No. 406, 1995. ISBN 91-7871-628-4.
P. Lindskog: Methods, algorithms and tools for system identification based on prior knowledge. Thesis No. 436, 1996. ISBN 91-7871-424-8.
J. Gunnarsson: Symbolic methods and tools for discrete event dynamic systems. Thesis No. 477, 1997. ISBN 91-7871-917-8.
M. Jirstrand: Constructive methods for inequality constraints in control. Thesis No. 527, 1998. ISBN 91-7219-187-2.
U. Forssell: Closed-loop identification: Methods, theory, and applications. Thesis No. 566, 1999. ISBN 91-7219-432-4.
A. Stenman: Model on demand: Algorithms, analysis and applications. Thesis No. 571, 1999. ISBN 91-7219-450-2.
N. Bergman: Recursive Bayesian estimation: Navigation and tracking applications. Thesis No. 579, 1999. ISBN 91-7219-473-1.
K. Edström: Switched bond graphs: Simulation and analysis. Thesis No. 586, 1999. ISBN 91-7219-493-6.
M. Larsson: Behavioral and structural model based approaches to discrete diagnosis. Thesis No. 608, 1999. ISBN 91-7219-615-5.
F. Gunnarsson: Power control in cellular radio systems: Analysis, design and estimation. Thesis No. 623, 2000. ISBN 91-7219-689-0.
V. Einarsson: Model checking methods for mode switching systems. Thesis No. 652, 2000. ISBN 91-7219-836-2.
M. Norrlöf: Iterative learning control: Analysis, design, and experiments. Thesis No. 653, 2000. ISBN 91-7219-837-0.
F. Tjärnström: Variance expressions and model reduction in system identification. Thesis No. 730, 2002. ISBN 91-7373-253-2.
J. Löfberg: Minimax approaches to robust model predictive control. Thesis No. 812, 2003. ISBN 91-7373-622-8.
J. Roll: Local and piecewise affine approaches to system identification. Thesis No. 802, 2003. ISBN 91-7373-608-2.
J. Elbornsson: Analysis, estimation and compensation of mismatch effects in A/D converters. Thesis No. 811, 2003. ISBN 91-7373-621-X.
O. Härkegård: Backstepping and control allocation with applications to flight control. Thesis No. 820, 2003. ISBN 91-7373-647-3.
R. Wallin: Optimization algorithms for system analysis and identification. Thesis No. 919, 2004. ISBN 91-85297-19-4.
D. Lindgren: Projection methods for classification and identification. Thesis No. 915, 2005. ISBN 91-85297-06-2.
R. Karlsson: Particle Filtering for Positioning and Tracking Applications. Thesis No. 924, 2005. ISBN 91-85297-34-8.
J. Jansson: Collision Avoidance Theory with Applications to Automotive Collision Mitigation. Thesis No. 950, 2005. ISBN 91-85299-45-6.
E. Geijer Lundin: Uplink Load in CDMA Cellular Radio Systems. Thesis No. 977, 2005. ISBN 91-85457-49-3.
M. Enqvist: Linear Models of Nonlinear Systems. Thesis No. 985, 2005. ISBN 91-85457-64-7.
T. B. Schön: Estimation of Nonlinear Dynamic Systems — Theory and Applications. Thesis No. 998, 2006. ISBN 91-85497-03-7.
I. Lind: Regressor and Structure Selection — Uses of ANOVA in System Identification. Thesis No. 1012, 2006. ISBN 91-85523-98-4.
J. Gillberg: Frequency Domain Identification of Continuous-Time Systems — Reconstruction and Robustness. Thesis No. 1031, 2006. ISBN 91-85523-34-8.
M. Gerdin: Identification and Estimation for Models Described by Differential-Algebraic Equations. Thesis No. 1046, 2006. ISBN 91-85643-87-4.
C. Grönwall: Ground Object Recognition using Laser Radar Data – Geometric Fitting, Performance Analysis, and Applications. Thesis No. 1055, 2006. ISBN 91-85643-53-X.
A. Eidehall: Tracking and threat assessment for automotive collision avoidance. Thesis No. 1066, 2007. ISBN 91-85643-10-6.
F. Eng: Non-Uniform Sampling in Statistical Signal Processing. Thesis No. 1082, 2007. ISBN 978-91-85715-49-7.
E. Wernholt: Multivariable Frequency-Domain Identification of Industrial Robots. Thesis No. 1138, 2007. ISBN 978-91-85895-72-4.
D. Axehill: Integer Quadratic Programming for Control and Communication. Thesis No. 1158, 2008. ISBN 978-91-85523-03-0.
G. Hendeby: Performance and Implementation Aspects of Nonlinear Filtering. Thesis No. 1161, 2008. ISBN 978-91-7393-979-9.
J. Sjöberg: Optimal Control and Model Reduction of Nonlinear DAE Models. Thesis No. 1166, 2008. ISBN 978-91-7393-964-5.
D. Törnqvist: Estimation and Detection with Applications to Navigation. Thesis No. 1216, 2008. ISBN 978-91-7393-785-6.
P-J. Nordlund: Efficient Estimation and Detection Methods for Airborne Applications. Thesis No. 1231, 2008. ISBN 978-91-7393-720-7.
H. Tidefelt: Differential-algebraic equations and matrix-valued singular perturbation. Thesis No. 1292, 2009. ISBN 978-91-7393-479-4.
H. Ohlsson: Regularization for Sparseness and Smoothness — Applications in System Identification and Signal Processing. Thesis No. 1351, 2010. ISBN 978-91-7393-287-5.
S. Moberg: Modeling and Control of Flexible Manipulators. Thesis No. 1349, 2010. ISBN 978-91-7393-289-9.
J. Wallén: Estimation-based iterative learning control. Thesis No. 1358, 2011. ISBN 978-91-7393-255-4.
J. Hol: Sensor Fusion and Calibration of Inertial Sensors, Vision, Ultra-Wideband and GPS. Thesis No. 1368, 2011. ISBN 978-91-7393-197-7.
D. Ankelhed: On the Design of Low Order H-infinity Controllers. Thesis No. 1371, 2011. ISBN 978-91-7393-157-1.
C. Lundquist: Sensor Fusion for Automotive Applications. Thesis No. 1409, 2011. ISBN 978-91-7393-023-9.
P. Skoglar: Tracking and Planning for Surveillance Applications. Thesis No. 1432, 2012. ISBN 978-91-7519-941-2.
K. Granström: Extended target tracking using PHD filters. Thesis No. 1476, 2012. ISBN 978-91-7519-796-8.
C. Lyzell: Structural Reformulations in System Identification. Thesis No. 1475, 2012. ISBN 978-91-7519-800-2.
J. Callmer: Autonomous Localization in Unknown Environments. Thesis No. 1520, 2013. ISBN 978-91-7519-620-6.
D. Petersson: A Nonlinear Optimization Approach to H2-Optimal Modeling and Control. Thesis No. 1528, 2013. ISBN 978-91-7519-567-4.
Z. Sjanic: Navigation and Mapping for Aerial Vehicles Based on Inertial and Imaging Sensors. Thesis No. 1533, 2013. ISBN 978-91-7519-553-7.
F. Lindsten: Particle Filters and Markov Chains for Learning of Dynamical Systems. Thesis No. 1530, 2013. ISBN 978-91-7519-559-9.
P. Axelsson: Sensor Fusion and Control Applied to Industrial Manipulators. Thesis No. 1585, 2014. ISBN 978-91-7519-368-7.
A. Carvalho Bittencourt: Modeling and Diagnosis of Friction and Wear in Industrial Robots. Thesis No. 1617, 2014. ISBN 978-91-7519-251-2.
