Result: Laboratory experiments of model-based reinforcement learning for adaptive optics control

Title:
Laboratory experiments of model-based reinforcement learning for adaptive optics control
Contributors:
Lappeenranta–Lahti University of Technology Finland (LUT), Aalto University, European Southern Observatory (ESO), TKK Helsinki University of Technology (TKK), DOTA, ONERA Salon-de-Provence, ONERA, Laboratoire d'Astrophysique de Marseille (LAM), Aix Marseille Université (AMU)-Institut national des sciences de l'Univers (INSU - CNRS)-Centre National d'Études Spatiales Toulouse (CNES)-Centre National de la Recherche Scientifique (CNRS), Institute for Particle Physics and Astrophysics ETH Zürich (IPA), Department of Physics ETH Zürich (D-PHYS), Eidgenössische Technische Hochschule - Swiss Federal Institute of Technology Zürich (ETH Zürich). The work of J.N. and T.H. was supported by the Academy of Finland (Grant Nos. 326961, 345720, and 353094). Part of this work was supported by an ETH Zurich Research Grant. S.P.Q. gratefully acknowledges the financial support from ETH Zurich. Part of this work has been carried out within the framework of the National Centre of Competence in Research PlanetS supported by the Swiss National Science Foundation (SNSF) (Grant Nos. 51NF40 182901 and 51NF40 205606). S.P.Q. and A.M.G. acknowledge the support from the SNSF. COSMIC is developed through a strategic partnership between LESIA at Observatoire de Paris, AITC at the Australian National University, Microgate, and CAS at Swinburne University of Technology.
Source:
ISSN: 2329-4124 ; Journal of Astronomical Telescopes, Instruments, and Systems ; https://hal.science/hal-04650504 ; Journal of Astronomical Telescopes, Instruments, and Systems, 2024, 10 (01), pp.019001. ⟨10.1117/1.JATIS.10.1.019001⟩.
Publisher Information:
CCSD
Society of Photo-optical Instrumentation Engineers
Publication Year:
2024
Collection:
Aix-Marseille Université: HAL
Document Type:
Article in journal/newspaper
Language:
English
Relation:
info:eu-repo/semantics/altIdentifier/arxiv/2401.00242; ARXIV: 2401.00242; BIBCODE: 2024JATIS.10a9001N
DOI:
10.1117/1.JATIS.10.1.019001
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edsbas.40FC2DBD
Database:
BASE

Further Information

International audience ; Direct imaging of Earth-like exoplanets is one of the most prominent science drivers of the next generation of ground-based telescopes. Earth-like exoplanets are typically located at small angular separations from their host stars, making them difficult to detect. Consequently, the control algorithm of the adaptive optics (AO) system must be carefully designed to distinguish the exoplanet from the residual light produced by the host star. A promising avenue for improving AO control builds on data-driven methods such as reinforcement learning (RL). RL is an active branch of machine learning in which control of a system is learned through interaction with the environment; it can thus be seen as an automated approach to AO control whose usage is entirely a turnkey operation. In particular, model-based RL has been shown to cope with temporal and misregistration errors, and it has been demonstrated to adapt to nonlinear wavefront sensing while remaining efficient in training and execution. In this work, we implement and adapt an RL method called policy optimization for AO (PO4AO) on the GPU-based high-order adaptive optics testbench (GHOST) at ESO headquarters, where we demonstrate strong performance of the method in a laboratory environment. Our implementation allows training to run in parallel with inference, which is crucial for on-sky operation. In particular, we study the predictive and self-calibrating aspects of the method. The new implementation on GHOST running PyTorch introduces only around 700 μs of latency in addition to the hardware, pipeline, and Python interface latency. We open-source well-documented code for the implementation and specify the requirements for the RTC pipeline. We also discuss the important hyperparameters of the method and how they affect performance. Finally, we discuss the sources of the latency and possible paths toward a lower-latency implementation.
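The abstract highlights that training runs in parallel with inference, so the control loop never stalls while the policy is updated. The paper's own PO4AO implementation uses PyTorch networks on a GPU; the sketch below is only a toy illustration of that concurrency pattern using the Python standard library. All names here (`ToyPolicy`, the scalar-gain "policy", the one-dimensional "environment") are hypothetical stand-ins, not the PO4AO API.

```python
import threading
import time
from collections import deque

class ToyPolicy:
    """Stand-in for the neural-network policy: a single scalar gain,
    guarded by a lock so the trainer can update it mid-loop."""
    def __init__(self, gain=0.1):
        self.gain = gain
        self.lock = threading.Lock()

    def act(self, wavefront_error):
        # Integrator-like command: push the error toward zero.
        with self.lock:
            return -self.gain * wavefront_error

    def update(self, new_gain):
        with self.lock:
            self.gain = new_gain

def control_loop(policy, replay, n_steps=200):
    """The 'inference' side: issues a command every step at full rate,
    logging (error, command) pairs for the trainer to consume."""
    error = 1.0
    for _ in range(n_steps):
        command = policy.act(error)
        error = error + command  # toy linear environment response
        replay.append((error, command))
    return error

def trainer(policy, replay, stop):
    """The 'training' side: runs concurrently, nudging the gain toward
    the optimum (1.0) as a stand-in for policy optimization."""
    while not stop.is_set():
        policy.update(min(1.0, policy.gain + 0.05))
        time.sleep(0.001)

replay = deque(maxlen=1000)           # shared replay buffer
policy = ToyPolicy()
stop = threading.Event()
t = threading.Thread(target=trainer, args=(policy, replay, stop))
t.start()
residual = control_loop(policy, replay)
stop.set()
t.join()
print(abs(residual) < 1e-3)  # → True: residual error is driven to ~0
```

The key design point mirrored here is that the control loop never blocks on training: the only shared state is the (locked) policy parameters and the replay buffer, so inference latency stays bounded regardless of how long an update takes.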