
Optimal Deterministic Controller Synthesis from Steady-State Distributions

Publisher
Springer Journals
Copyright
Copyright © The Author(s), under exclusive licence to Springer Nature B.V. 2022. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ISSN
0168-7433
eISSN
1573-0670
DOI
10.1007/s10817-022-09657-9

Abstract

The formal synthesis of control policies is a classic problem that entails computing optimal strategies for an agent interacting with an environment such that formal guarantees of behavior are met. These guarantees are specified as part of the problem description and can be supplied by the end user in the form of various logics (e.g., Linear Temporal Logic and Computation Tree Logic) or imposed via constraints on agent-environment interactions. The latter has received significant attention in recent years in the context of constraints on the asymptotic frequency with which an agent visits states of interest, which is captured by the steady-state distribution of the agent. The formal synthesis of stochastic policies satisfying constraints on this distribution has been studied. However, the derivation of deterministic policies under the same constraints has received little attention. In this paper, we focus on this deterministic steady-state control problem, i.e., the problem of obtaining a deterministic policy for optimal expected-reward behavior in the presence of linear constraints representing the desired steady-state behavior. Two integer linear programs are proposed and validated experimentally to solve this problem in unichain and multichain Markov decision processes. Finally, we prove that this problem is NP-hard even in the restricted setting where the MDP has deterministic transitions and only two actions.
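To make the central object concrete: a deterministic policy fixes one action per state, which turns the MDP into an ordinary Markov chain, and the steady-state distribution of that chain is what the constraints in the paper restrict. The following is a minimal sketch, not the paper's method; the toy two-state MDP, its transition probabilities, and the policy are invented for illustration, and the stationary distribution is computed by plain power iteration.

```python
# Toy MDP: transitions[state][action] = list of (next_state, probability).
# All numbers here are hypothetical, chosen only to illustrate the idea.
transitions = {
    0: {"a": [(0, 0.5), (1, 0.5)], "b": [(1, 1.0)]},
    1: {"a": [(0, 0.9), (1, 0.1)], "b": [(1, 1.0)]},
}

# A deterministic policy maps each state to exactly one action.
policy = {0: "a", 1: "a"}


def steady_state(transitions, policy, n_states, iters=10_000):
    """Stationary distribution of the Markov chain induced by the policy,
    via power iteration (assumes the induced chain is ergodic)."""
    dist = [1.0 / n_states] * n_states
    for _ in range(iters):
        nxt = [0.0] * n_states
        for s in range(n_states):
            for s2, p in transitions[s][policy[s]]:
                nxt[s2] += dist[s] * p
        dist = nxt
    return dist


pi = steady_state(transitions, policy, n_states=2)
# For this chain, solving pi = pi * P gives pi = (9/14, 5/14) ~ (0.643, 0.357).
```

A steady-state constraint of the kind the paper studies would then be a linear condition such as "state 1 is visited at most 30% of the time" (here `pi[1] <= 0.3`, which this policy violates); the synthesis problem is to pick the deterministic policy that maximizes expected reward among those whose induced stationary distribution satisfies all such constraints, and the paper encodes that choice as an integer linear program.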

Journal

Journal of Automated Reasoning, Springer Journals

Published: Mar 1, 2023

Keywords: Markov decision processes; Optimal control; NP-complete; Linear temporal logic; Autonomous system
