Access the full text.
Sign up today, get DeepDyve free for 14 days.
The formal synthesis of control policies is a classic problem that entails the computation of optimal strategies for an agent interacting in some environment such that some formal guarantees of behavior are met. These guarantees are specified as part of the problem description and can be supplied by the end user in the form of various logics (e.g., Linear Temporal Logic and Computation Tree Logic) or imposed via constraints on agent-environment interactions. The latter has received significant attention in recent years within the context of constraints on the asymptotic frequency with which an agent visits states of interest. This is captured by the steady-state distribution of the agent. The formal synthesis of stochastic policies satisfying constraints on this distribution has been studied. However, the derivation of deterministic policies for the same has received little attention. In this paper, we focus on this deterministic steady-state control problem, i.e., the problem of obtaining a deterministic policy for optimal expected-reward behavior in the presence of linear constraints representing the desired steady-state behavior. Two integer linear programs are proposed and validated experimentally to solve this problem in unichain and multichain Markov decision processes. Finally, we prove that this problem is NP-hard even in the restricted setting where deterministic transitions are enforced on the MDP and there are only two actions.
Journal of Automated Reasoning – Springer Journals
Published: Mar 1, 2023
Keywords: Markov decision processes; Optimal control; NP-complete; Linear temporal logic; Autonomous system
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.