Title:
Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning.
Authors:
Stanković, Miloš S. (milos.stankovic@singidunum.ac.rs), Beko, Marko (beko.marko@ulusofona.pt), Ilić, Nemanja (nemili@etf.rs), Stanković, Srdjan S. (stankovic@etf.rs)
Source:
European Journal of Control, Vol. 74, November 2023.

Abstract:
In this paper, a new distributed multi-agent Actor-Critic algorithm for reinforcement learning is proposed for solving multi-agent multi-task optimization problems. The Critic takes the form of a Distributed Emphatic Temporal Difference, DETD(λ), algorithm, while the Actor is a complementary consensus-based policy-gradient algorithm, derived from a global objective function that plays the role of a scalarizing function in multi-objective optimization. It is demonstrated that the Feller-Markov properties hold for the newly derived Actor algorithm. A proof of weak convergence of the algorithm to the limit set of an associated ODE is given under mild conditions, using a specific decomposition between the Critic and the Actor algorithms together with two-time-scale stochastic approximation arguments. An experimental verification of the algorithm's properties is provided, showing that the algorithm can serve as an efficient tool in practice. [ABSTRACT FROM AUTHOR]
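To make the general shape of such a scheme concrete, below is a minimal, hypothetical Python sketch of a distributed Actor-Critic of this kind: each agent runs a local emphatic TD(λ) critic with linear features, periodically averages its critic parameters with its neighbours through a consensus weight matrix W, and updates a softmax policy on a slower timescale using the critic's TD error. The critic recursions follow the standard ETD(λ) of Sutton, Mahmood, and White, and the actor uses an Off-PAC-style update ρ·δ·∇log π; both are stand-ins, not the paper's exact DETD(λ) critic or scalarized multi-task actor gradient, and all names (AgentETD, consensus, actor_step) are illustrative.

```python
import numpy as np

class AgentETD:
    """One agent's local emphatic TD(lambda) critic with linear features.

    Hypothetical sketch: the follow-on trace F, emphasis M, and eligibility
    trace e follow the standard ETD(lambda) recursions, not necessarily the
    exact DETD(lambda) form of the paper.
    """

    def __init__(self, n_features, gamma=0.95, lam=0.8, alpha=0.05):
        self.theta = np.zeros(n_features)  # value-function weights
        self.e = np.zeros(n_features)      # eligibility trace
        self.F = 0.0                       # follow-on trace
        self.gamma, self.lam, self.alpha = gamma, lam, alpha

    def critic_step(self, phi, phi_next, reward, rho, interest=1.0):
        """ETD(lambda) update with importance-sampling ratio rho (off-policy)."""
        self.F = rho * self.gamma * self.F + interest
        M = self.lam * interest + (1.0 - self.lam) * self.F
        self.e = rho * (self.gamma * self.lam * self.e + M * phi)
        delta = reward + self.gamma * self.theta @ phi_next - self.theta @ phi
        self.theta = self.theta + self.alpha * delta * self.e
        return delta


def consensus(agents, W):
    """Linear consensus step: mix critic parameters across the network.

    W is assumed row-stochastic and compatible with the communication graph.
    """
    thetas = W @ np.stack([a.theta for a in agents])
    for agent, theta in zip(agents, thetas):
        agent.theta = theta


def actor_step(w, phi, action, delta, rho, beta=0.005):
    """Slow-timescale softmax policy-gradient step (beta << alpha).

    Uses rho * delta * grad log pi as in Off-PAC, a common stand-in for the
    paper's consensus-based multi-task actor gradient.
    """
    prefs = w @ phi                        # action preferences
    pi = np.exp(prefs - prefs.max())
    pi /= pi.sum()
    grad_log = -np.outer(pi, phi)          # d log pi(a|s) / d w
    grad_log[action] += phi
    return w + beta * rho * delta * grad_log


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_agents, n_features, n_actions = 3, 4, 2
    agents = [AgentETD(n_features) for _ in range(n_agents)]
    W = np.full((n_agents, n_agents), 1.0 / n_agents)  # complete graph
    policies = [np.zeros((n_actions, n_features)) for _ in range(n_agents)]
    for t in range(200):  # synthetic transitions stand in for each agent's task
        for i, agent in enumerate(agents):
            phi, phi_next = rng.standard_normal((2, n_features))
            action = rng.integers(n_actions)
            rho = 1.0  # behaviour = target policy in this toy demo
            delta = agent.critic_step(phi, phi_next, rng.standard_normal(), rho)
            policies[i] = actor_step(policies[i], phi, action, delta, rho)
        consensus(agents, W)
    print("consensus critic weights:", agents[0].theta)
```

The step-size separation beta << alpha mirrors the two-time-scale argument mentioned in the abstract: the critic parameters equilibrate on the fast timescale for an effectively frozen policy, while the actor drifts slowly along the estimated gradient.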