Access the full text.
Sign up today, get DeepDyve free for 14 days.
Mukund Raghothaman, A. Udupa (2014)
Language to Specify Syntax-Guided Synthesis ProblemsArXiv, abs/1405.5590
Armando Solar-Lezama, Liviu Tancau, R. Bodík, S. Seshia, V. Saraswat (2006)
Combinatorial sketching for finite programs
Andrew Reynolds, Tim King, Viktor Kunčak (2017)
Solving quantified linear arithmetic by counterexample-guided instantiationFormal Methods in System Design, 51
P. Cousot, R. Cousot (1977)
Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpointsProceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
R. Floyd (1993)
Assigning Meanings to Programs
Susmit Jha, Sumit Gulwani, S. Seshia, A. Tiwari (2010)
Oracle-guided component-based program synthesis2010 ACM/IEEE 32nd International Conference on Software Engineering, 1
G. Collins (1975)
Hauptvortrag: Quantifier elimination for real closed fields by cylindrical algebraic decomposition
Sumit Gulwani, Susmit Jha, A. Tiwari, R. Venkatesan (2011)
Synthesis of loop-free programs
Ashutosh Gupta, T. Henzinger, R. Majumdar, A. Rybalchenko, Ru-Gang Xu (2008)
Proving non-termination
O. Strichman (2002)
On Solving Presburger and Linear Arithmetic with SAT
C. David, Pascal Kesseli, D. Kroening, M. Lewis (2016)
Danger Invariants
A. Abate, C. David, Pascal Kesseli, D. Kroening, E. Polgreen (2018)
Counterexample Guided Inductive Synthesis Modulo Theories
Z. Manna, R. Waldinger (1979)
A Deductive Approach to Program SynthesisACM Trans. Program. Lang. Syst., 2
Haniel Barbosa, Andrew Reynolds, Daniel Larraz, C. Tinelli (2019)
Extending enumerative function synthesis via SMT-driven classification2019 Formal Methods in Computer Aided Design (FMCAD)
Andrew Reynolds, Morgan Deters, Viktor Kunčak, C. Tinelli, Clark Barrett (2015)
Counterexample-Guided Quantifier Instantiation for Synthesis in SMT
Aishwarya Gupta (2020)
Decision ProceduresDistributed Artificial Intelligence
Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, C. Tinelli (2019)
cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis
Yu Feng, R. Martins, Jacob Geffen, Işıl Dillig, Swarat Chaudhuri (2016)
Component-based synthesis of table consolidation and transformation tasks from examplesProceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation
Andrew Reynolds, Arjun Viswanathan, Haniel Barbosa, C. Tinelli, Clark Barrett (2018)
Datatypes with Shared Selectors
Clark Barrett, Igor Shikanian, C. Tinelli (2007)
An Abstract Decision Procedure for a Theory of Inductive Data TypesJ. Satisf. Boolean Model. Comput., 3
Joao Marques-Silva, I. Lynce, S. Malik (2021)
Conflict-Driven Clause Learning SAT Solvers
N. Eén, Niklas Sörensson (2003)
An Extensible SAT-solver
R. Nieuwenhuis, Albert Oliveras, C. Tinelli (2006)
Solving SAT and SAT Modulo Theories: From an abstract Davis--Putnam--Logemann--Loveland procedure to DPLL(T)J. ACM, 53
Aart Bik, H. Wijshoff (1994)
Implementation of fourier - motzkin elimina - tion
Yu Feng, R. Martins, Yuepeng Wang, Işıl Dillig, T. Reps (2017)
Component-based synthesis for complex APIsProceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages
H. Ganzinger, G. Hagen, R. Nieuwenhuis, Albert Oliveras, C. Tinelli (2004)
DPLL( T): Fast Decision Procedures
C. David, D. Kroening, M. Lewis (2015)
Using Program Synthesis for Program AnalysisArXiv, abs/1508.07829
Aws Albarghouthi, Sumit Gulwani, Zachary Kincaid (2013)
Recursive Program Synthesis
Sumit Gulwani, Oleksandr Polozov, Rishabh Singh (2014)
Program Synthesis
R. Alur, Arjun Radhakrishna, A. Udupa (2017)
Scaling Enumerative Program Synthesis via Divide and Conquer
E. Clarke, D. Kroening, K. Yorav (2003)
Behavioral consistency of C and Verilog programs using bounded model checkingProceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451)
Aina Niemetz, Mathias Preiner, Andrew Reynolds, Clark Barrett, C. Tinelli (2018)
On Solving Quantified Bit-Vectors using Invertibility ConditionsArXiv, abs/1804.05025
L. Moura, N. Bjørner (2008)
Z3: An Efficient SMT Solver
Eric Rosen (1999)
An existential fragment of second order logicArchive for Mathematical Logic, 38
S Gulwani, O Polozov, R Singh (2017)
Program synthesisFound. Trends Program. Lang., 4
Yu Feng, O. Bastani, R. Martins, Işıl Dillig, Saswat Anand (2016)
Automated Synthesis of Semantic Malware Signatures using Maximum SatisfiabilityArXiv, abs/1608.06254
Armando Solar-Lezama, R. Rabbah, R. Bodík, K. Ebcioglu (2005)
Programming by sketching for bit-streaming programs
Sumit Gulwani, V. Korthikanti, A. Tiwari (2011)
Synthesizing geometry constructions
R. Alur, D. Fisman, Rishabh Singh, Armando Solar-Lezama (2017)
SyGuS-Comp 2017: Results and Analysis
R. Alur, Pavol Cerný, Arjun Radhakrishna (2015)
Synthesis Through UnificationArXiv, abs/1505.05868
R. Alur, R. Bodík, Garvit Juniwal, Milo Martin, Mukund Raghothaman, S. Seshia, Rishabh Singh, Armando Solar-Lezama, E. Torlak, A. Udupa (2013)
Syntax-guided synthesis2013 Formal Methods in Computer-Aided Design
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations
Daniel Perelman, Sumit Gulwani, D. Grossman, Peter Provost (2014)
Test-driven synthesisProceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation
Armando Solar-Lezama (2012)
Program sketchingInternational Journal on Software Tools for Technology Transfer, 15
Program synthesis is the mechanised construction of software. One of the main difﬁculties is the efﬁcient exploration of the very large solution space, and tools often require a user- provided syntactic restriction of the search space. While useful in general, such syntactic restrictions provide little help for the generation of programs that contain non-trivial con- stants, unless the user is able to provide the constants in advance. This is a fundamentally difﬁcult task for state-of-the-art synthesisers. We propose a new approach to the synthesis of programs with non-trivial constants that combines the strengths of a counterexample-guided inductive synthesiser with those of a theory solver, exploring the solution space more efﬁ- ciently without relying on user guidance. We call this approach CEGIS(T ), where T is a ﬁrst-order theory. We present two exemplars, one based on Fourier-Motzkin (FM) variable elimination and one based on ﬁrst-order satisﬁability. We demonstrate the practical value of CEGIS(T ) by automatically synthesising programs for a set of intricate benchmarks. Addi- tionally, we present a case study where we integrate CEGIS(T ) within the mature synthesiser CVC4 and show that CEGIS(T ) improves CVC4’s results. Keywords Program synthesis · Automated reasoning · Satisﬁability modulo theories · Counterexample guided inductive synthesis 1 Introduction Program synthesis [25] is the problem of ﬁnding a program that meets a correctness spec- iﬁcation given as a logical formula. This is an active area of research in which substantial progress has been made in recent years. In full generality, program synthesis is an exceptionally difﬁcult problem, and thus, the research community has explored pragmatic restrictions. One particularly successful direc- tion is Syntax-Guided Program Synthesis (SyGuS) [3]. The key idea of SyGuS is that the user supplements the logical speciﬁcation with a syntactic template for the solution, deﬁned Cristina David cristina.david@bristol.ac.uk Extended author information available on the last page of the article 0123456789().: V,-vol 123 19 Page 2 of 25 A. Abate et al. as a context-free grammar. Leveraging the user’s intuition, SyGuS reduces the size of the solution space substantially, resulting in signiﬁcant speed-ups. Unfortunately, it is difﬁcult to provide the syntactic template in many practical applica- tions. A very obvious exemplar of the limits of the syntax-guided approach are programs that require non-trivial constants. In such a scenario, the syntax-guided approach requires that the user provides the exact value of the constants in the solution. For illustration, let’s consider a user who wants to synthesise a program that rounds up a given 32-bit unsigned number x to the next highest power of two. If we refer to the function computed by the program as f (x ), then the speciﬁcation can be written as x <2 ⇒ f (x )&(− f (x ))= f (x ) ∧ f (x )≥x ∧ 2x ≥ f (x ) The ﬁrst conjunct forces f (x ) to be a power of two, the others require it to be the next highest. A possible solution for this is given by the following C program: 1 x=x-1; 2 x|=x >> 1; 3 x|=x >> 2; 4 x|=x >> 4; 5 x|=x >> 8; 6 x|=x >> 16; 7 x=x+1; It is improbable that the user knows that the constants in the solution are exactly 1, 2, 4, 8, 16, and thus, she will be unable to effectively restrict the solution space. As a result, synthesisers are very likely to enumerate possible combinations of constants, which is highly inefﬁcient. In this paper we propose a new approach to the synthesis of programs with non-trivial constants that combines the strengths of a counterexample-guided inductive synthesiser with those of a solver for a ﬁrst-order theory in order to perform a more efﬁcient exploration of the solution space, without relying on user guidance. Our inspiration for this proposal is CDCL(T )[22, 31, 39] (also known as DPLL(T )), which has boosted the performance of solvers for many fragments of quantiﬁer-free ﬁrst-order logic. The CDCL(T ) framework combines reasoning about the Boolean structure of a formula with reasoning about theory facts to decide satisﬁability of a given ﬁrst-order formula. Similarly, we attempt to separate reasoning about a program’s structure and its constants. For this purpose, we propose a new algorithm called CounterExample-Guided Inductive Synthesis(T ), where T is a given ﬁrst-order theory for which we have a specialised solver. Analogous to its counterpart CDCL(T ), the CEGIS(T ) architecture features communication between a synthesiser and a theory solver, which results in a much more efﬁcient exploration of the search space (i.e., the space of all possible programs). While standard CEGIS architectures [27, 41] already make use of SMT solvers, the typical role of the SMT solver in existing algorithms is restricted to validating candidate solutions and providing concrete counterexamples that direct subsequent search. By contrast, CEGIS(T ) allows the theory solver to communicate generalised constraints back to the synthesiser, thus enabling more signiﬁcant pruning of the search space. There are instances of more sophisticated collaboration between a program synthesiser and theory solvers. The most obvious such instance is the program synthesiser for single invocation conjectures [35] inside the CVC4 SMT solver [34]. This approach features a very tight coupling between the two components (i.e., the synthesiser and the theory solvers) that 123 Synthesising Programs with Non-trivial Constants Page 3 of 25 19 takes advantage of the particular strengths of the SMT solver by reformulating the synthesis problem as the problem of refuting a universally quantiﬁed formula (SMT solvers are better at refuting universally quantiﬁed formulae than at ﬁnding models for them). By contrast, in our approach we maintain a clear separation between the synthesiser and the theory solver while performing comprehensive and well-deﬁned communication between the two components. This enables the ﬂexible combination of CEGIS with a variety of theory solvers, which excel at reasoning about different kinds of constraints. Contributions – We propose CEGIS(T ), a program synthesis architecture that facilitates the communi- cation between an inductive synthesiser and a solver for a ﬁrst-order theory, with the objective of separating reasoning about a program’s structure and its constants. – We present two exemplars of this architecture, one based on Fourier-Motzkin (FM) variable elimination [10] and one using an off-the-shelf SMT solver. – As a case study, we present an integration of CEGIS(T ) within the synthesiser CVC4, winner of four out of ﬁve tracks of the Syntax-Guided Synthesis (SyGuS) competition 2019. We show that CEGIS(T ) improves CVC4’s performance. – Wehaveimplemented CEGIS(T ) and compared it with state-of-the-art program synthe- sisers on benchmarks that require intricate constants in the solution. 2 Preliminaries 2.1 The Program Synthesis Problem Program synthesis is the task of automatically generating programs that satisfy a given logical speciﬁcation. For non-recursive programs a program synthesiser can be viewed as a solver for formulae with existential second-order quantiﬁers. The input speciﬁcation provided to a program synthesiser is of the form ∃P. ∀ x.σ (P, x ) (1) where P ranges over functions (where a function is represented by the program computing it), x is a tuple of variables ranging over the function’s inputs, and σ is a quantiﬁer-free formula. 2.2 CounterExample Guided Inductive Synthesis CounterExample-Guided Inductive Synthesis (CEGIS) is a popular approach to program synthesis. It is an iterative process that maintains at all times a candidate program P for the speciﬁcation ∃P. ∀ x.σ (P, x ). Each iteration performs inductive generalisation based on counterexamples provided by a veriﬁcation oracle, that is, concrete input values c that falsify σ(P , c ). Essentially, the inductive generalisation uses information about a limited number of inputs to make claims about all the possible inputs in the form of candidate solutions. Existential second-order logic allows existential quantiﬁcation over second-order (i.e., function) variables in addition to quantiﬁcation over (ﬁrst-order) variables [38]. 123 SAT 19 Page 4 of 25 A. Abate et al. Fig. 1 CEGIS block diagram UNSAT synthesise no solution P c UNSAT verify solution P The CEGIS process is illustrated in Figure 1 and consists of two phases: the synthesis phase and the veriﬁcation phase. Given the speciﬁcation σ of the desired program, the induc- ∗ ∗ tive synthesis procedure generates a candidate program P that satisﬁes σ(P , x ) for a subset I of all possible inputs. The candidate program P is passed to the veriﬁcation phase, which checks whether it satisﬁes σ(P , x ) for all possible inputs. This is done by checking whether ∗ ∗ ¬σ(P , x ) is unsatisﬁable. If so, ∀x .σ (P , x ) is valid, meaning that we have successfully synthesised a solution, and the algorithm terminates. Otherwise, the veriﬁer produces a coun- terexample c from the satisfying assignment for ¬σ(P , x ), which is then added to the set I of inputs passed to the synthesiser, and the loop repeats. The methods used in the synthesis and veriﬁcation blocks vary in different CEGIS imple- mentations. We give details of the algorithms used in CVC4 in Sect. 5.1 as an exemplar. 2.3 CDCL(T ) To improve the performance of the traditional CEGIS process, we have devised an extension of it, CEGIS(T ), inspired by the CDCL(T ) framework. The latter is an extension of the CDCL algorithm used by most propositional SAT solvers [39], by a theory T .Wegivea brief overview of CDCL(T ) and compare CDCL(T ) with CEGIS(T )next. Given a formula F from a theory T , a propositional formula F is created from F in which the theory atoms are replaced by Boolean variables (the “propositional skeleton”). The standard CDCL algorithm, comprising Decide, Boolean Constraint Propagation (BCP), Analyze- Conflict and BackTrack components as illustrated in Figure 2, generates an assignment to the Boolean variables in F . The theory solver then checks whether this assignment is still consistent when the Boolean variables are replaced by their original atoms. If so, a satisfying assignment for F has been found. Otherwise, a constraint over the Boolean variables in F is passed back to Decide, and the process repeats. In the very ﬁrst SMT solvers, the SAT solver ﬁrst obtained a full assignment to the Boolean variables that comprise the abstraction of the input. Subsequently, the theory solver would determine whether or not this assignment was satisﬁable according to the background theory. If not, the next Boolean assignment was then checked. Such SMT solvers were prone to enumerating all possible candidate solutions at the Boolean level in the worst case. The key improvement in CDCL(T ) was the ability to pass back a more general constraint over the variables in the formula as a conﬂict clause [22], which could block not only the failed solution but a whole set of them. Furthermore, modern variants of CDCL(T ) call the theory solver on partial assignments to the variables in F , which helps detect conﬂicts and propagations eagerly. Our proposed, new synthesis algorithm offers equivalents of both of these ideas. As we describe in Sect. 4, our implementation of CEGIS(T ) may establish that the synthesis speciﬁcation has no solution of a particular syntactic shape, that is, matching a particular template, regardless of the choices made to instantiate the template. SAT Synthesising Programs with Non-trivial Constants Page 5 of 25 19 Decide SAT all assigned BackTrack conﬂict Analyze BCP UNSAT Conflict CDCL theory Theory Deduction Add Clauses propagation Solver Fig. 2 CDCL(T ) with theory propagation 2.4 Fourier–Motzkin Elimination Fourier–Motzkin elimination is a mathematical algorithm for eliminating variables from a system of linear inequalities. In particular, given a system of linear inequalities of the form a x + a x + ... + a x ≤ b , i =1 ... m 1 1 2 2 n n i we eliminate x as described next. For each inequality a x + a x + ... + a x ≤ b ,we n 1 1 2 2 n n i get x ≤ (b − a x − ... − a x )/a n i 1 1 n−1 n−1 n or x ≥ (b − a x − ... − a x )/a n i 1 1 n−1 n−1 n depending on whether a > 0or a < 0, respectively. n n This gives us a collection of upper bounds x ≤ U (x ,..., x ), ..., x ≤ U (x ,..., x ) n 1 1 n−1 n l 1 n−1 and lower bounds x ≥ L (x ,..., x ), ..., x ≥ L (x ,..., x ) n 1 1 n−1 n p 1 n−1 The initial system of inequalities is equivalent to max (L (x ,..., x ), ..., L (x ,..., x )) 1 1 n−1 p 1 n−1 ≤ min(U (x ,..., x ), ..., U (x ,..., x )) 1 1 n−1 l 1 n−1 which is equivalent to p · l inequalities of the form U (x ,..., x ) ≤ L (x ,..., x ) k = 1 ··· l, j = 1 ··· p k 1 n−1 j 1 n−1 We transformed the original system of linear inequalities into another system where x is eliminated. nothing to propagate 19 Page 6 of 25 A. Abate et al. 3 Motivating Examples 3.1 CEGIS on a Simple Example In each iteration of a standard CEGIS loop, the communication from the veriﬁcation phase back to the synthesis phase is restricted to concrete counterexamples. This is particularly detrimental when synthesising programs that require non-trivial constants. In such a setting, it is typical that a counterexample provided by the veriﬁcation phase only eliminates a sin- gle candidate solution and, consequently, the synthesiser ends up enumerating all possible constants. For illustration, let’s consider the trivial problem of synthesising a function f (x ) where f (x)< 0if x < 334455 and f (x ) = 0, otherwise. One possible solution is f (x ) = ite (x < 334455) −10,where ite stands for if then else. In order to make the synthesis task even simpler, we are going to assume that we know a part of this solution, namely we know that it must be of the form f (x ) = ite (x < ?) −10,where “?” is a placeholder for the missing constant that we must synthesise. A plausible scenario for a run of CEGIS is presented next: the synthesis phase guesses f (x ) = ite (x < 0) −10, for which the veriﬁcation phase returns x = 0 as a counterexample. In the next iteration of the CEGIS loop, the synthesis phase guesses f (x ) = ite (x < 1) −1 0 (which works for x = 0) and the veriﬁer produces x = 1 as a counterexample. Following the same pattern, the synthesis phase will enumerate all the candidates f (x ) = ite (x < 2) −10 ... f (x ) = ite (x < 334454) −10 before ﬁnding the solution. This is caused by the fact that each of the concrete counterex- amples 0,..., 334454 eliminates one candidate only from the solution space. To avoid this behavior we need to propagate more information from the veriﬁer to the synthesis phase in each iteration of the CEGIS loop. 3.2 Proving Properties of Programs Synthesis engines can be used as reasoning engines in program analysers, and constants are important for this application. In such a case, the synthesised program computes the program proof of interest (e.g., program invariant [13] for safety proving, counter-model [11] for bug ﬁnding, ranking function [21] for termination proving, recurrence set [26]for non-termination). In the examples given in the rest of this section, we refer directly to the formula corresponding to each program proof computed by a synthesised program. Proving safety Let’s start by considering the very simple program below, which increments a variable x from 0 to 100000 and asserts that its value is less than 100005 on exit from the loop. 1 int x=0; 2 while (x <=100000) x++; 3 assert(x <100005); Proving the safety of such a program, i.e., that the assertion at line 3 is not violated in any execution of the program requires the generation of a loop invariant, a task well-suited for synthesis (the Syntax Guided Synthesis Competition [6] has had a track dedicated to 123 Synthesising Programs with Non-trivial Constants Page 7 of 25 19 synthesising safety invariants since 2015). For this example, a safety invariant is x < 100002, which holds on entrance to the loop, is inductive with respect to the loop’s body, and implies the assertion on exit from the loop. While it is very easy for a human to ﬁnd this invariant, the need for a non-trivial constant makes it exceedingly difﬁcult for state-of-the-art synthesisers: both CVC4 (version 1.5) [35] and EUSolver (version 2017-06-15) [4] fail to ﬁnd a solution in an hour. Proving termination Next, let’s look at the following terminating program: 1 int x=0; 2 while (x <1000) { 3 x++; 4 } Its termination argument can be encoded as the following formula, where R is a ranking function, i.e., an injective function that has a well-founded set D with order ≺ as co-domain and is injective and monotonically decreasing with respect to the program’s transition relation: ∀x . x <1000 → 0 ≺ R(x ) ∧ R(x + 1) ≺ R(x ) A possible ranking function is R(x ) = 1000 − x with (D, ≺) = (N,<), which also requires a non-trivial constant. Proving non-termination One way of proving non-termination is by ﬁnding a recurrence set, i.e., a nonempty set of states S such that for each state s ∈ S there exists a transition to some s ∈ S [26]. As an example, let us investigate the termination behaviour of the program below: 1 unsigned int i,j; 2 while(i<UINT_MAX || j<UINT_MAX) { 3 i++; 4 j++; 5 } For bit-vectors, the initial state (i , j ) = (!UINT_MAX!, !UINT_MAX! − 1) leads to 0 0 an inﬁnite loop since i and j will overﬂow and be reset to 0 in subsequent loop iterations. The corresponding recurrence set S has to satisfy the following formula of bitvector arithmetic, which encodes that S must be reachable from an initial state and, for each state in S, at least one successor must be in S. Note that, as the program is deterministic, all successors of a state in S must be in S. ∃i , j .∀i , j .S(i , j ) ∧ 0 0 0 0 S(i , j ) → (i <UI NT _MAX ∨ j <UI NT _MAX ) ∧ S(i , j ) → S(i + 1, j + 1) A possible recurrence set is S(i , j ) = i <UI NT _MAX ∨ j <UI NT _MAX, which again requires large constants. Bug ﬁnding Next, let’s look at the buggy example below, where we increment x and y in each loop iteration, maintaining the same initial difference of 10 between them. Consequently, the assertion at line 6 fails. 123 19 Page 8 of 25 A. Abate et al. 1 int x=0, y=10; 2 while(x <100000) { 3 x++; 4 y++; 5 } 6 assert(y==x+11); We can prove that this program has a bug by ﬁnding a danger invariant [14], which can be seen as a compact representation of an error trace. A danger invariant D(x , y) must hold in some initial state, be inductive with respect to the transition relation and, on exit from the loop, imply the negation of the assertion: ∃x , y .x =0 ∧ y =10 ∧ D(x , y ) ∧ 0 0 0 0 0 0 ∀x , y.D(x , y) ∧ x <100000 → D(x + 1, y + 1) ∧ ∀x , y.D(x , y) ∧ x ≥100000 → y=x +11 For our example, a possible danger invariant is D(x , y) = (y=x +10). However this is not quite enough to conclude that the assertion does fail, since we have not yet established that the loop terminates from any D-state — thus we are in the situation where the danger invariant denotes either an assertion violation or the presence of a recurrence set. If we want to prove only an assertion violation, an additional part of the danger invariant is a ranking function R(x , y) proving that the loop does terminate making the assertion at line 6 reachable. ∃x , y .x =0 ∧ y =10 ∧ D(x , y ) ∧ 0 0 0 0 0 0 ∀x , y.D(x , y) ∧ x <100000 → D(x + 1, y + 1) ∧ R(x , y)> 0 ∧ R(x , y)> R(x + 1, y + 1) ∧ ∀x , y.D(x , y) ∧ x ≥100000 → y=x +11 In this case, a ranking function is R(x , y) = 100000 − x. As we can see, both D(x , y) and R(x , y) require non-trivial constants. 4 CEGIS(T ) 4.1 Overview In this section, we describe the architecture of CEGIS(T ), which is obtained by augmenting the standard CEGIS loop with a theory solver. Since we are particularly interested in the synthesis of programs with constants, we present CEGIS(T ) from this particular perspective. In such a setting, CEGIS is responsible for synthesising program skeletons, whereas the theory solver generates constraints over the literals that denote constants. These constraints are then propagated back to the synthesiser. To explain the main ideas behind CEGIS(T ) in more detail, it is useful to differentiate between a candidate solution, a candidate solution skeleton, a generalised candidate solution and a ﬁnalised solution. Deﬁnition 1 (Candidate solution) Using the notation from Sect. 2.2, a program P is a candidate solution of ∀ x .σ (P, x ) if the set {σ(P , e ) | e ∈ E } is satisﬁable where inputs E is a set of possible values for x . inputs 123 SAT Synthesising Programs with Non-trivial Constants Page 9 of 25 19 UNSAT synthesise no solution concrete counterexample UNSAT verify solution P CEGIS program has constants Constant Theory Deduction Removal Solver Fig. 3 CEGIS(T ) The set {σ(P , e ) | e ∈ E } in Deﬁnition 1 contains the ground instances of σ(P, x ) inputs obtained by instantiating the vector x with each vector in E . Such instances do not have inputs any free variables other than the second-order variable P. If this set is satisﬁable for some ∗ ∗ value (i.e., program) P for P then P meets the speciﬁcation for each input from E inputs (i.e., ∀ x ∈ E .σ (P, x )). inputs Deﬁnition 2 (Candidate solution skeleton) Given a candidate solution P ,the skeleton of ∗ ∗ ∗ P , denoted by P [?], is obtained by replacing each constant in P with a distinguished symbol ?, representing a hole. Deﬁnition 3 (Generalised candidate solution) Given a candidate solution skeleton P [?], ∗ ∗ we obtain a generalised candidate P [ v] by ﬁlling each hole in P [?] with a distinct logical variable, i.e., variable v will correspond to the i-th hole (in some arbitrary but ﬁxed ordering of the hole occurrences in P ). Then v =[v ,...,v ],where n denotes the number of holes 1 n in P[?]. Deﬁnition 4 (Finalised solution) A candidate solution P is a ﬁnalised solution if the formula ∀ x .σ (P , x ) is valid. Example 1 (Candidate solution, candidate solution skeleton, generalised candidate solution, ﬁnalised solution) Given the example in Sect. 3.1,if E ={0},then f (x ) =−2isa inputs candidate solution. The corresponding candidate skeleton is f [?](x ) = ? and the generalised candidate is f [v ](x ) = v . A ﬁnalised solution for this example is f (x ) = ite (x < 1 1 334455) −10. Figure 3 illustrates the communication between the synthesiser and the theory solver in CEGIS(T ). The interaction can be described as follows: – The CEGIS architecture (enclosed in a dotted rectangle with the label “CEGIS”) generates the candidate solution P , which is provided to the theory solver. – The theory solver (enclosed in a dashed rectangle with the label “Theory Solver”) obtains ∗ ∗ ∗ the skeleton P [?] of P and generalises it to P [ v] in the box marked constant removal. Subsequently, Deduction attempts to ﬁnd a constraint over v describing propagate constraints SAT SAT 19 Page 10 of 25 A. Abate et al. those values for which P [ v] is a ﬁnalised solution. This constraint is propagated back to CEGIS. Whenever there is no valuation of v for which P [ v] becomes a ﬁnalised solution, the constraint needs to block the current skeleton P [?]. The CEGIS(T ) algorithm is given as Algorithm 1 and proceeds as follows: – Before entering the while loop, E is initialized with the empty set. This means that, inputs in the ﬁrst iteration of CEGIS(T ), there are no inputs to be considered and any program will trivially obey the speciﬁcation. – CEGIS synthesis phase: checks the satisﬁability of {σ(P, e ) | e ∈ E },where inputs E is a subset of all possible values for x , and obtains a candidate solution P .Ifthis inputs set is unsatisﬁable, then the synthesis problem has no solution. – CEGIS veriﬁcation phase: checks whether there exists a concrete counterexample for ∗ ∗ the current candidate solution P by checking the satisﬁability of the formula ¬σ(P , x ). If the result is UNSAT, then P is a ﬁnalised solution to the synthesis problem. If the result is SAT, a concrete counterexample c can be extracted from the satisfying assignment. – Theory solver: if P contains constants, then they are eliminated, resulting in the skele- ∗ ∗ ton P [?], which is afterwards generalised to P [ v]. The goal of the theory solver is to ﬁnd T -implied literals and communicate them back to the CEGIS part in the form ∗ ∗ of a constraint, C (P, P , v) . In Algorithm 1, this is done by Deduction(σ, P [ v]). The result of Deduction(σ, P [ v]) is of the following form: whenever there exists a valuation of v for which the current skeleton P [?] is a ﬁnalised solution, res=true and C (P, P , v) = v =c ,where c are constants; otherwise, res=false and i i i i =1·n ∗ ∗ ∗ ∗ C (P, P , v) needs to block the current skeleton P [?], i.e., C (P, P , v) =P[?]=P [?]. In our CEGIS implementation, this amounts to placing constraints over the boolean selector variables in the synthesis formula, which choose the sequence of operators and operands in the candidate program P . – CEGIS learning phase: adds new information to the problem speciﬁcation. If we did not use the theory solver (i.e., the candidate P found by the synthesiser did not contain constants or the problem speciﬁcation was out of the theory solver’s scope), then the learning would be limited to adding the concrete counterexample e obtained from the veriﬁcation phase to the set E . However, if the theory solver is used and returns inputs res=true, then the second element in the tuple contains valuations for v such that P [ v] is a ﬁnalised solution. If res=false, then the second element blocks the current skeleton and needs to be added to σ . 4.2 CEGIS(T ) with a Theory Solver Based on FM Elimination In this section we describe a theory solver based on FM variable elimination. In our case, we call the FM theory solver whenever the speciﬁcation σ belongs to linear arithmetic. Otherwise, the FM theory solver is not called. As mentioned above, we need to produce a constraint over variables v describing the situation when P [ v] is a ﬁnalised solution. For this purpose, we consider the formula ∃ x . ¬σ(P [ v], x ), where v is a satisﬁability witness if the speciﬁcation σ admits a counterex- ∗ ∗ ample x for P .Let E (v) be the formula obtained by eliminating x from ∃ x . ¬σ(P [ v], x ). If ¬E (v) is satisﬁable, any satisﬁability witness gives us the necessary valuation for v : C (P, P , v) = v = c . i i i =1·n 123 Synthesising Programs with Non-trivial Constants Page 11 of 25 19 ALGORITHM 1: CEGIS(T ) 1 function CEGIS(T )(speciﬁcation σ ) 2 E =∅ inputs 3 while true do 4 /* CEGIS synthesis phase */ 5 if {σ(P, e ) | e ∈ E )} is UNSAT then inputs 6 return Failure; 7 end 8 else 9 P = satisﬁability witness for {σ(P, e ) | e ∈ E )}; inputs 10 /* CEGIS veriﬁcation phase */ 11 if ¬(σ (P , x )) is UNSAT then 12 return Finalised solution P ; 13 end 14 else 15 e = satisﬁability witness for ¬σ(P , x ); 16 /* Theory solver */ 17 if P contains constants then ∗ ∗ 18 Obtain P [?] from P ; ∗ ∗ 19 Generalise P [?] to P [ v]; ∗ ∗ 20 (res, C (P, P , v) ) = Deduction(σ, P [ v]); 21 end 22 end 23 end 24 /* CEGIS learning phase */ 25 if res then 26 // C (P, P , v) is of the form v = c i i i =1...n 27 return Finalised solution P [ c]; 28 end 29 else 30 σ(P, x ) = σ(P, x ) ∧ C (P, P , v) ; 31 E = E ∪{e}; inputs inputs 32 end 33 end If ¬E (v) is UNSAT, then the current skeleton P [?] needs to be blocked. This reasoning is supported by Lemma 1 and Corollary 1. Lemma 1 Let E (v) be the formula obtained by eliminating xf rom ∃ x . ¬σ(P [ v], x ). Then, # # any witness v to the satisﬁability of ¬E (v) gives us a ﬁnalised solution P [v ] to the synthesis problem. Proof From the fact that E (v) is obtained by eliminating x from ∃ x . ¬σ(P [ v], x ),weget that E (v) is equivalent with ∃ x . ¬σ(P [ v], x ) (we use ≡ to denote equivalence): E (v) ≡∃x. ¬σ(P [ v], x ). Then: ¬E (v) ≡∀x.σ (P [ v], x ). # ∗ ∗ # Consequently, any v satisfying ¬E (v) also satisﬁes ∀ x.σ (P [ v], x ).From ∀ x.σ (P [v ], x ) ∗ # and Deﬁnition 4 we get that P [v ] is a ﬁnalised solution. 123 19 Page 12 of 25 A. Abate et al. Corollary 1 Let E (v) be the formula that is obtained by eliminating xf rom ∃ x . ¬σ(P [ v], x ). If ¬E (v) is unsatisﬁable, then the corresponding synthesis problem does not admit a solution for the skeleton P [?]. ∗ ∗ Proof Given that ¬E (v) ≡∀x.σ (P [ v], x ),if ¬E (v) is unsatisﬁable, so is ∀ x.σ (P [ v], x ), meaning that there is no valuation for v such that the speciﬁcation σ is obeyed for all inputs x . For the current skeleton P [?], the constraint E (v) generalises the concrete counterexam- ple e (found during the CEGIS veriﬁcation phase) in the sense that the instantiation v of v ∗ # for which e failed the speciﬁcation, i.e., ¬σ(P [ v ], e ), is a satisﬁability witness for E (v) . ∗ # This is true as E (v) ≡∃x. ¬σ(P [ v], x ), which means that the satisﬁability witness (v , e ) for ¬σ(P [ v], x ) projected on v is a satisﬁability witness for E (v) . 4.2.1 Disjunction The speciﬁcation σ and the candidate solution may contain disjunctions. However, most theory solvers (and in particular the FM variable elimination [10]) work on conjunctive fragments only. A naïve approach could use case-splitting, i.e., transforming the formula into Disjunctive Normal Form (DNF) and then solving each clause separately. This can result in a number of clauses exponential in the size of the original formula. Instead, we handle disjunction using the Boolean Fourier-Motzkin procedure [28, 43]. As a result, the constraints we generate may be non-clausal. 4.2.2 Applying CEGIS(T ) with FM to the Motivational Example We recall the example in Sect. 3 and apply CEGIS(T ). The problem is ∃ f .∀x . x < 334455 → f (x)< 0 ∧ x ≥334455 → f (x ) = 0 which gives us the following speciﬁcation: σ( f , x ) = (x ≥ 334455 ∨ f (x)< 0) ∧ (x <334455 ∨ f (x ) = 0). The ﬁrst synthesis phase generates the candidate f (x )=0 for which the veriﬁcation phase returns the concrete counterexample x =0. As this candidate contains the constant 0, we generalise it to f [v ](x ) = v , for which we get 1 1 σ( f [v ], x ) = (x ≥ 334455 ∨ v < 0) ∧ (x <334455 ∨ v = 0). 1 1 1 Next, we use FM to eliminate x from f : ∃x .¬(σ ( f [v ], x )) =∃x .(x < 334455 ∧ v ≥ 0) ∨ (x ≥334455 ∧ v = 0). 1 1 1 Note that, given that formula ¬σ( f [v ], x ) is in DNF, for convenience we directly apply FM to each disjunct and obtain E (v ) = v ≥0 ∨ v =0, which characterises all the values of v 1 1 1 1 for which there exists a counterexample. When negating E (v ) we get v <0 ∧ v =0, which 1 1 1 is UNSAT. As there is no valuation of v for which the current f is a ﬁnalised solution, the result returned by the theory solver is (false, f [?]= f [?]), which is used to augment the speciﬁcation. Subsequently, a new CEGIS(T ) iteration starts. The learning phase has changed the speciﬁcation σ to σ( f , x ) = (x ≥334455 ∨ f (x )<0) ∧ (x <334455 ∨ f (x )=0) ∧ f [?]=? . 123 Synthesising Programs with Non-trivial Constants Page 13 of 25 19 This forces the synthesis phase to pick a new candidate solution with a different skeleton. The new candidate solution we get is f (x ) = ite (x <100) − 3 1, which works for the previous counterexample x =0. However, the veriﬁcation phase returns the counterexample x =100. Again, this candidate contains constants which we replace by symbolic variables, obtaining f [v ,v ,v ](x ) = ite (x <v )v v . 1 2 3 1 2 3 Next, we use FM to eliminate x from ∃x .¬(σ ( f [v ,v ,v ], x )) = 1 2 3 ∃x .¬(x ≥334455 ∨ (x <v → v <0 ∧ x ≥v → v <0)∧ 1 2 1 3 x <334455 ∨ (x <v → v =0 ∧ x ≥v → v =0)) = 1 2 1 3 ∃x .¬((x ≥334455 ∨ x ≥v ∨ v <0) ∧ (x ≥334455 ∨ x <v ∨ v <0)∧ 1 2 1 3 (x <334455 ∨ x ≥v ∨ v =0) ∧ (x <334455 ∨ x <v ∨ v =0)) = 1 2 1 3 ∃x .(x <334455 ∧ x <v ∧ v ≥0) ∨ (x <334455 ∧ x ≥v ∧ v ≥0)∨ 1 2 1 3 (x ≥334455 ∧ x <v ∧ v =0) ∨ (x ≥334455 ∧ x ≥v ∧ v =0). 1 2 1 3 As we work with integers, we can rewrite x <334455 to x ≤334454 and x <v to x ≤v −1. 1 1 Then, we obtain the following constraint E (v ,v ,v ) (as aforementioned, we applied FM 1 2 3 to each disjunct in ¬σ( f [v ,v ,v ], x )) 1 2 3 E (v ,v ,v ) = v ≥0 ∨ (v ≤334454 ∧ v ≥0) ∨ (v ≥334456 ∧ v =0) ∨ v =0 1 2 3 2 1 3 1 2 3 whose negation is ¬E (v ,v ,v ) = v <0 ∧ (v >334454 ∨ v <0) ∧ (v <334456 ∨ v =0) ∧ v =0 1 2 3 2 1 3 1 2 3 A satisﬁability witness is v =334455, v =− 1and v =0. Thus, the result returned by the 1 2 3 theory solver is (true,v =334455 ∧ v =− 1 ∧ v =0), which is used by CEGIS to obtain 1 2 3 the ﬁnalised solution f (x ) = ite (x <334455) −10 . 4.3 CEGIS(T ) with an SMT-Based Theory Solver For our second variant of a theory solver, we make use of an off-the-shelf SMT solver that can solve ﬁrst-order formulae with quantiﬁers. This approach is more general than the one described in Sect. 4.2, as there are solvers for a broad range of theories. Recall that our goal is to obtain a constraint C (P, P , v) that either characterises the ∗ ∗ valuations of v for which P [ v] is a ﬁnalised solution or blocks P [?] whenever no such valuation exists. Consequently, we use the SMT solver to check the satisﬁability of the formula Φ =∃v ∀ x.σ (P [ v], x ). If Φ is satisﬁable, then any satisﬁability witness c gives us a valuation for v such that P is a ﬁnalised solution: C (P, P , v) = v = c . Conversely, if Φ is unsatisﬁable, then i i i =1·n ∗ ∗ ∗ ∗ C (P, P , v) must block the current skeleton P [?]: C (P, P , v) = P[?] = P [?]. The formula passed to the SMT solver is still a non-trivial formula containing an alternating quantiﬁer, except now both quantiﬁer preﬁxes are ﬁrst-order, that is quantify over domain values not over functions. It would not be uncommon for an SMT solver to take substantially 123 19 Page 14 of 25 A. Abate et al. Table 1 Learned constraints for each SMT result combination (✗: unsat, : sat, ∅: timeout, ∗: any result) Φ ∧ v< K ✗✗ ∅∅ ∗ Φ ∧ v> K ✗ ∅ ✗ ∅∗ ∗ ∗ ∗ ∗ C (P, P , v) P[?]=P [?] P[?]=P [?]∧ v<KP[?]=P [?]∧ v>KK v=c v=c longer to solve this formula than the synthesis step of CEGIS takes. To avoid that, we impose a heuristically chosen timeout of 2 s on the veriﬁcation step. If the solver exceeds the timeout, CEGIS(T ) defaults to the behaviour of a standard CEGIS loop for the current iteration, and returns the concrete counterexample found by the CEGIS veriﬁcation phase (i.e., the satisﬁability witness for ¬σ(P , x )). To reduce the number of timeouts, we produce several formulae, each of which constrains Φ in a different way. These formulae are then passed to the SMT solver sequentially with a timeout. For each variable v in v , we produce two formulae: the ﬁrst, Φ ∧ v<K , constrains v to be smaller than the value K it took in the original candidate program found by the CEGIS synthesis phase (i.e., its corresponding value in P found by the CEGIS synthesis phase); the second, Φ ∧v>K , constrains it to be greater than K.Table 1 captures the possible outcomes in terms of the resulting C (P, P , v) . In column 2, both Φ ∧ v<K and Φ ∧ v>K are unsatisﬁable, meaning that the current program skeleton P [?] is blocked. In column 3 and 4, only one of the formulae is proved to be unsatisﬁable, meaning that the skeleton is only blocked for the corresponding subdomain of v. If both SMT calls time out, as captured by column 5, then, for the current iteration, CEGIS(T ) defaults to the behavior of a standard CEGIS loop and returns the concrete counterexample found by the CEGIS veriﬁcation phase. The last two columns capture the scenario where one of the SMT calls returns with a valuation for v. 4.3.1 Applying SMT-Based CEGIS(T ) to the Motivational Example Again, we recall the example in Sect. 3. We will solve it by using SMT-based CEGIS(T )for the theory of linear arithmetic. For this purpose, we assume that the synthesis phase ﬁnds the same sequence of candidate solutions as in Sect. 3. Namely, the ﬁrst candidate is f (x )=0, which gets generalised to f [v ](x )=v . Then, we invoke SMT twice for Φ and Φ , 1 1 v < v > 1 1 where Φ =∀x .(x ≥334455 ∨ v <0) ∧ (x <334455 ∨ v =0) ∧ v <0 v < 1 1 1 Φ =∀x .(x ≥334455 ∨ v <0) ∧ (x <334455 ∨ v =0) ∧ v >0 . v > 1 1 1 The SMT solver returns UNSAT for both, which means that C ( f , f ,v ) = f [?]=?.The ∗ ∗ second candidate is f (x ) = ite (x < 100) − 3 1, which generalises to f [v ,v ,v ](x ) = 1 2 3 ite (x <v )v v . The corresponding base constraint for the SMT solver is ∀x.σ ((ite (x < 1 2 3 v )v v ), x ), for which one SMT invocation obtains the satisﬁability witness v = 334455, 1 2 3 1 v =−1and v = 0. Then C ( f , f ,v ,v ,v ) = v =334455 ∧ v =− 1 ∧ v =0, which 2 3 1 2 3 1 2 3 gives us the same ﬁnalised solution we obtained when using FM in Sect. 3. 5 Case Study: CEGIS(T )WithinCVC4 In this section, we discuss the use of CEGIS(T ) to improve the search of the solution space of an existing and mature synthesiser, CVC4, winner of four out of ﬁve tracks of the Syntax- 123 Synthesising Programs with Non-trivial Constants Page 15 of 25 19 Guided Synthesis (SyGuS) competition 2019 [44]. We start by giving a general description of the internals of CVC4, followed by discussing the actual embedding of CEGIS(T ). For an in-depth description see [34]. 5.1 Enumerative Synthesis in CVC4 CVC4 makes use of several strategies for solving synthesis problems [35]. In this paper, we focus on its enumeration-based techniques [34]. We have integrated two variants of CEGIS(T ) as extensions of these techniques. Both are based on incorporating a theory solver for ﬁnding values of holes in candidate solutions. Before describing these extensions, we introduce necessary background details of how term enumeration is encoded inside an SMT solver. Term Enumeration via Deep Embedding CVC4 uses a specialised technique for enumeration-based synthesis that is based on encoding a given SyGuS grammar as an alge- braic data type [37]. Each value of this datatype can be understood as the Abstract Syntax Tree (AST) of a program generated by the grammar, as illustrated by the following example. Example 2 Given a program represented as a function P : (x : Int) × (y : Int) → Int and the context-free grammar R below, specifying which integer (I ) and Boolean (B) terms can appear in candidate solutions for P: I ::=0 | 1 | x | y | I + I | I − I | ite(B, I , I ) B::=B ≥ B | I I |¬B | B ∧ B CVC4 generates the following mutually recursive datatypes: I = 0 | 1 | x | y | plus(I, I) | minus(I, I) | ite(B, I, I) B = geq(I, I) | eq(I, I) | not(B) | and(B, B) Each datatype constructor corresponds to a production rule of R, e.g., plus corresponds to the rule I ::=I + I . A datatype term such as plus(x, y) represents the arithmetic term x + y. In the context of SyGuS, the given grammar captures the user-provided syntax restrictions. In the absence of such restrictions, CVC4 generates a default grammar that, roughly speaking, generates all possible programs of the speciﬁed type — for instance of type Int × Int → Int for the programs in the example above. Since the aim of CEGIS(T ) is to perform efﬁcient search space exploration without relying on syntax restrictions (see Sect. 1), CVC4 with CEGIS(T ) uses a default grammar in a similar way. The correspondence between the ASTs and their denotational semantics is achieved by a deep embedding, constructed automatically by CVC4, of the new algebraic data type into CVC4’s (combined) background theory. Concretely, CVC4’s background theory is extended with evaluation operators e , for each generated datatype D, whose semantics is to interpret the datatype values of sort D in the theory of the type represented by D. For instance, e , which takes as input a value of type I and two integer values for the variables x and y, respectively, is deﬁned axiomatically so that e (plus(x, y), 3, 4) ≡ 7. This way, determining whether a datatype value denotes an actual solution to the synthesis problem (1) amounts to checking the satisﬁability of ∀z. ∃ x . ¬σ(e (z, x ), x ) (2) ∗ ∗ in which z has type D. If CVC4 can ﬁnd datatype value P , encoding a program P ,such ∗ ∗ as the instance ∃ x . ¬σ(e (P , x ), x ) of (2) is unsatisﬁable, then P is a solution for (1). 123 19 Page 16 of 25 A. Abate et al. Depending on the structure of the synthesis problem, CVC4 uses different strategies for generating candidates solutions. It either applies a constraint-based (smart) enumeration, which allows for numerous optimisations [34, Section 2]; a highly optimised brute-force (fast) enumeration [34,Section3]; or a hybrid approach combining smart and fast enumeration [34, Section 4]. Note that the integration of CEGIS(T ) in CVC4 is agnostic to the enumeration strategy. 5.2 CEGIS(T ) in CVC4 via Skeleton Generation In this section, we discuss our ﬁrst approach for integrating CEGIS(T ) with CVC4. This direction follows Algorithm 1 very closely. Given a candidate solution P , if the veriﬁcation ∗ ∗ fails and P contains constants, then a skeleton P [?] is generated that replaces the constants with symbolic holes. Similar to Sect. 4.3, an off-the-shelf SMT solver, in this case CVC4 itself, which can perform efﬁcient quantiﬁer reasoning in the theories of linear arithmetic and bitvectors [30, 36], checks the satisﬁability of Φ =∃v ∀ x.σ (P [ v], x ). (4) If Φ is satisﬁable, learning the constraint C (P, P , v) = v = c (5) i i i =1·n provides a solution to the original conjecture. However, differently from Algorithm 1, if Φ is unsatisﬁable, meaning that the skeleton P [?] is infeasible for every constant value tuple c , we have no way of blocking CVC4 from generating new candidates differing from P only by the constant values. This is because our enumeration, in this ﬁrst approach, is not changed so that it can reason about skeletons. That is the motivation for our second approach, described in Sect. 5.3. Instead of enumerating concrete programs, it directly enumerates skeletons, which are turned into concrete solutions by the theory solvers. However, this ﬁrst approach has the advantage that it can be used with other approaches to CEGIS such as, for instance, divide and conquer [7, 8]. In that approach, the candidate solutions are built from enumerated partial solutions according to how they behave on the current set of counterexamples, which requires the enumeration to provide concrete programs rather than skeletons. ∗ ∗ Observe that when generating P [?] from P not all constants can be abstracted away, since this can, for instance, transform linear problems into non-linear ones. Example 3 Suppose that a candidate P = 1 + 2 × x is generated for a linear function. Abstracting the constants, we obtain P [?]= v + v × x, which is a non-linear query. CVC4 1 2 does not have efﬁcient support for the combination of non-linear arithmetic and quantiﬁers. In such cases, the generalisation is only partial. In this case, only the ﬁrst constant is abstracted, with the ﬁnal skeleton being P [?]= v + 2 × x. Note that this issue only impacts conjectures with the speciﬁc pattern of having con- stants multiplying variables. This does not prevent the handling of conjectures with multiple constants. 5.3 CEGIS(T ) in CVC4 via Any Constant Constructors In the second approach for CEGIS(T ), we explicitly model the holes in candidate solutions using a specialised datatype constructor ? : Int → I within the datatype that encodes the 123 Synthesising Programs with Non-trivial Constants Page 17 of 25 19 syntactic restrictions of the input. We refer to this as the any constant constructor. Internally, our solver treats an application of this constructor as the representation of any constant of integer type. Using this constructor, the process for generating candidate skeletons is made explicit at the level of the datatype, with datatype values now representing arithmetic terms with holes. Thus, in this approach, we remove all other constructors corresponding to concrete constants from the datatypes encoding the grammar and include only this constructor instead. Example 4 Consider the grammar R from Example 2. Using the any constant constructor, the datatype encoding of R becomes: I = ?(Int) | x | y | plus(I, I) | minus(I, I) | ite(B, I, I) B = geq(I, I) | eq(I, I) | not(B) | and(B, B) Notice that the argument of the any constant constructor ? is the builtin integer type Int. This is in contrast to the grammar from Sect. 2, which had no subﬁelds of integer type. Hence, the arithmetic theory solver of CVC4 will reason about values of this datatype in the (combined) theory of datatypes and integer arithmetic. In this approach, CVC4 generates candidate solutions P that can be abstracted to a skeleton P [?] by replacing each application of an any constant constructor with a hole. For example, plus(x, ?(i )) becomes x + v for fresh variable v and some (for now irrelevant) 1 1 integer value i. For each such skeleton, CVC4 proceeds to determine the satisﬁability of Φ =∃v ∀ x.σ (P [ v], x ).If Φ is unsatisﬁable, the constraint ∗ ∗ C (P, P , v) =¬ P [?] is learned, which will prevent the generation of all programs differing from P only on the constants, effectively blocking the skeleton P [?]. An advantage of this approach with respect to the approach in the previous section is that it allows the term enumerator to reason about the equivalence of expressions with holes in order to eliminate redundant candidates, with the effect of accelerating the search. In detail, the term enumeration techniques in CVC4 employ aggressive strategies for blocking candidate solutions that are equivalent to previously generated ones. This is a critical aspect of the efﬁciency of an enumerative-based synthesis solver as it allows it to recognise and immediately discard redundant candidate solutions. We call this process blocking via theory rewriting [34] since it uses the SMT solver’s own term simpliﬁer as an incomplete, but fast, checker for term equivalence. The process needs special consideration when extending the solver to CEGIS(T ). For example, consider the candidate solution x +0, whose datatype encoding is plus(x, 0). In CVC4’s default implementation of CEGIS, the term enumerator will skip this candidate (as well as all candidates that have it as a subterm) because it is redundant with the smaller candidate x. This is a problem for the approach in Sect. 5.2, which may then miss the candidate plus(x, 0) and and hence fail to generalise it to a ﬁnalised solution when a solution of the form plus(x, c) exists. By contrast, the approach in this section can be instrumented in this setting to reason about constants symbolically. In particular, instead of reasoning about the equivalence of candidate terms of the form P = plus(x, ?(c)) based on ﬁxing a value for c, our extensions to blocking via theory rewriting with CEGIS(T ) employ stronger criteria for term redundancy based on analyzing the entire set of skeleton instances. As an example, the skeleton term plus(?(c1), ?(c2)) is redundant according to our criteria since the addition of two constants is always equivalent to some constant. Thus, this skeleton is considered redundant with respect to the simpler skeleton ?(c). 123 19 Page 18 of 25 A. Abate et al. 6 Experimental Evaluation We compared CEGIS(T ) against CEGIS in two program synthesisers and evaluated the improvement in performance. The ﬁrst synthesiser is our prototype implementation fastsynth and the second one is CVC4, reﬂecting the case study in Sect. 5.Both 2 3 fastsynth and the implementation inside CVC4 are available to download. We conducted the experimental evaluation on an AWS c5.18xlarge We used the Linux time command to measure CPU time used for each benchmark. The runtime was limited to 1800 s per benchmark. We used MiniSat [17] as the SAT solver, and Z3 v4.5.1 [16]asthe SMT-solver in CEGIS(T ) with SMT-based theory solver. The SAT solver could, in principle, be replaced with Z3 to solve benchmarks over a broader range of theories. 6.1 CEGIS(T )in fastsynth The basic CEGIS implementation in fastsynth uses SAT/SMT solving for both synthesis and veriﬁcation. For synthesis, we construct a formula which encodes all possible programs up to a set program length and is satisﬁed if one such program satisﬁes the logical speciﬁcation. The formula introduces extra boolean “selector” variables, which choose the sequence of operators and operands in the candidate program P . Incremental Satisﬁability Solving The CEGIS implementation may sometimes perform hun- dreds of loop iterations before ﬁnding the correct solution. Recall that the synthesis block of CEGIS is based on SAT solving. Each iteration of CEGIS, the synthesis phase makes a call to a SAT solver. Consequently, we may end up making hundreds of calls to this SAT solver, which are all very similar (the same base speciﬁcation with some extra constraints added in each iteration). This makes CEGIS a prime candidate for incremental SAT solving. We implemented incremental solving in the synthesis block of CEGIS. Benchmarks We selected a set of bitvector benchmarks from the Syntax-Guided Synthesis (SyGuS) competition [5] and a set of benchmarks synthesising safety invariants and danger invariants for C programs [14]. The selection criterion was that the solution to be synthesized requires constants. All benchmarks are written in SyGuS-IF [33], a variant of the SMT-LIB 2 language [9]. Since the syntactic restrictions (called the grammar or the template) provided in the SyGuS benchmarks generally contain all the necessary non-trivial constants, we completely removed the grammars from these benchmarks. Removing just the non-trivial constants and keeping the rest of the grammar (with the only constants being 0 and 1) would have made the problem much more difﬁcult, as the constants would have had to be incrementally constructed by applying the operators available to 0 and 1. We group the 83 benchmarks into three categories: 47 fall into invariant generation,which covers danger invariants, safety invariants and the class of invariant generation benchmarks from the SyGuS competition; 6 derive from hackers/crypto, which includes benchmarks from hackers-delight and cryptographic circuits; and 7 benchmarks are categorised as com- parisons, i.e., benchmarks that require synthesising longer programs with comparisons, e.g., ﬁnding the maximum value of 10 variables. The remaining 23 benchmarks are listed under www.github.com/kroening/fastsynth.git. www.github.com/CVC4/CVC4.git. https://docs.aws.amazon.com/awsec2/latest/userguide/compute-optimized-instances.html. 123 Synthesising Programs with Non-trivial Constants Page 19 of 25 19 other, and are all benchmarks taken from the SyGuS competition that do not ﬁt neatly into any of the previous categories. Results In Table 2 we report results comparing four different conﬁgurations of CEGIS and CEGIS(T )in fastsynth, along with a “virtual best solver” result, which is the fastest result of all the conﬁgurations. These results are from the prototype implementation of CEGIS(T ) in [1]. The conﬁgurations presented in the table are as follows: –CEGIS(T )-FM: CEGIS(T ) with Fourier-Motzkin as the theory solver; –CEGIS(T )-SMT: CEGIS(T ) with Z3 as the theory solver; – CEGIS: basic CEGIS as described in Sect. 2.2; – CEGIS-Inc: basic CEGIS with incremental SAT solving; –CEGIS(T )-vbs: virtual best solver. The fastest result of the above four conﬁgurations. The results for our implementation of CEGIS(T )are giveninTable 2. In combination, CEGIS(T )-vbs and CEGIS(T )-SMT solve 6 more benchmarks than our straight implemen- tation of CEGIS (42 vs. 36). Unsurprisingly, CEGIS(T )-SMT solves more of the invariant generation benchmarks that require synthesising arbitrary constants than CEGIS. Also, CEGIS(T )-FM is generally faster than CEGIS for benchmarks with non-trivial constants as it avoids enumerating the constants. However, it fails to solve many of the other bench- marks that plain CEGIS can solve because attempting to apply FM slows it down. A noted weakness in our implementation of both CEGIS and CEGIS(T ) is that they are slow to synthesise long expressions. This is due to the iterative-deepening-style search performed by the implementation where, starting with n = 1, the space of possible programs of size n is searched exhaustively before considering those of size n + 1, and so on. 6.2 CEGIS(T )inCVC4 Benchmarks We tested CVC4 on the full set of benchmarks across the SyGuS competition minus the programming-by-example ones. We excluded PBE because there already exists a speciﬁc divide and conquer technique for solving these benchmarks [8] that is orthogonal to CEGIS(T ). We ran two sets of experiments. For the ﬁrst one, we removed the grammars from all the benchmarks. Notably, this enables CVC4 to use the full grammar from the background theory. In the second experiment, we included only the benchmarks that contain a grammar (e.g. the invariant benchmarks are excluded from this experiment as they don’t contain a grammar). For these benchmarks we removed all constant literals except 0 and 1 and then extended the grammars to permit any constant literal. For example, a syntactic template which originally contains the rules Int → 0and Int → 7 will now contain the rules Int → 0and Int → Any Integer Literal. We include 0 and 1 in the grammar so that the synthesis algorithms that do not implement CEGIS(T ), and instead enumerate through the grammar given are, at least theoretically, capable of solving the benchmarks that need constants. Results In Table 3 we present results comparing ﬁve conﬁgurations of CVC4: the default behaviour of CVC4 1.7 using CEGIS, CVC4 using CEGIS but adding constant literals from the benchmark to the grammar, the implementation CVC4-CEGIS(T ), as well as two “virtual best CVC4 solvers”, denoting the fastest results of some of the previous CVC4 conﬁgurations. The conﬁgurations presented in Table 3 are as follows: – CVC4-CEGIS: default behaviour of CVC4 version 1.7 using CEGIS, as described in [34]; https://github.com/SyGuS-Org/benchmarks/tree/master/lib [retrieved March 2021]. 123 19 Page 20 of 25 A. Abate et al. Table 2 Prototype CEGIS(T) experimental results – for every set of benchmarks, we give the number of benchmarks solved by each conﬁguration within the timeout and the average time taken per solved benchmark Benchmark CEGIS(T )-SMT CEGIS(T )-FM CEGIS CEGIS-inc CEGIS(T )-vbs # s # s # s # s # s comparisons 1 < 0.11 0.63 1 < 0.11 < 0.11 < 0.1 hackers/crypto 3 32.03 31.03 31.0 3 237.73 30.4 inv 23 220.914 60.3 17 203.017 86.8 23 171.1 other 15 77.713 85.115 89.215 18.715 18.6 total solved 42 151.031 65.9 36 135.636 68.5 42 102.5 – CVC4-ac: CVC4 using CEGIS, but adding constant literals from the benchmark to the grammar; – CVC4-CEGIS(T ): CEGIS(T ) as implemented inside CVC4 and described in Sect. 5.3; – vbs1: the fastest result of CVC4-CEGIS and CVC4-ac. – vbs2: the fastest result of CVC4-CEGIS, CVC4-ac and CVC4-CEGIS(T ). We exclude from the evaluation the single invocation solver of CVC4, which is not impacted in any way by the CEGIS(T ) approach. We do not compare against the technique described in Sect. 5.2 because it has been deprecated in CVC4 in favour of the more general one described in Sect. 5.3. The implementation of CEGIS(T ) inside CVC4 (as shown in Table 3) is able to extend the set of benchmarks that CVC4 with CEGIS solves by 24. This is shown in the virtual best solver results, where vbs2, including CEGIS(T ), solves 1420 benchmarks compared to vbs1, which solves 1396. Note that the virtual best solvers are presented not as practical parallel solvers, but to highlight the improvements of CVC4-CEGIS(T ) w.r.t. CVC4-CEGIS in solving these extra 24 problems. These improvements occur on benchmarks that require large constants, as expected. Moreover, it would be virtually impossible for CVC4-CEGIS to solve the 24 benchmarks that CVC4-CEGIS(T ) solves exclusively, since representing those large constants in the regular enumeration (which uses only the constants 0 and 1) requires prohibitively large programs. CVC4-CEGIS(T ) is worse than CVC4-CEGIS on some benchmarks because when no large constants are necessary for building a solution, CEGIS(T ) may only add overhead. CVC4-CEGIS(T ) is particularly effective on the invertibility conditions with grammars benchmark set, solving 7 more problems than both CVC4-CEGIS and CVC4-ac (101 bench- marks vs. 94). Moreover it signiﬁcantly outperforms CVC4-ac on harder benchmarks (those requiring more than 100 s to solve), as shown in Figure 4. These benchmarks contain synthe- sis goals for inverses of bit-vector operators (for more details, see Niemetz et al. [30]), whose solutions often require particular constants. Even though the grammars in these benchmarks contain carefully chosen constants to help the SyGuS solver [30, Sect 3.1], CVC4-CEGIS(T ) ability to ﬁnd arbitrary, necessary, constants still can lead to considerable improvements. On average, CVC4-CEGIS is better than CVC4-CEGIS(T ) because on benchmarks where constants are not needed there is some slight overhead to using CEGIS(T ). How- ever, CEGIS(T ) is able to solve benchmarks that CEGIS alone cannot and the portfolio of solvers is a winning combination. Also, CVC4-CEGIS performs better than CVC4-ac because providing explicit constants in the grammar requires CVC4 to enumerate through a 123 Synthesising Programs with Non-trivial Constants Page 21 of 25 19 Table 3 Experimental results for CVC4-CEGIS(T ) – for every set of benchmarks, we give the number of benchmarks solved by each conﬁguration within the timeout and the average time taken per solved benchmark Benchmark Num CVC4-CEGIS CVC4-ac CVC4-CEGIS(T ) vbs1(ﬁrst two) vbs2(all three) # s # s # s # s # s With grammar Conditional 128 114 0.65 114 0.65114 25.7 114 0.65 114 0.65 inverses Invertibility 128 94 85.594 82.7 101 111.494 82.4 102 84.8 conditions General 232 128 40.2 128 40.2 108 81.5 128 40.0 132 31.7 Total (with 488 336 39.4 336 38.7 323 71.2 336 38.5 348 37.1 grammar) Without grammar General track 308 147 98.9 144 57.6 129 152.7 149 65.9 149 60.2 (2018) Conditional 161 142 25 142 2.7 144 3.7 142 1.4 144 3.2 inverses Invertibility 160 103 84.096 84.2 107 83.0 105 71.7 112 66.0 condition Invariants 369 157 106.3 157 106 157 106 157 105.6 157 105.3 (woosuk) Invariants 456 203 84.1 199 126.9 175 95.8 208 106.7 208 106.7 (lustre) Invariant track 127 109 4.2 111 8.8 111 29.6 111 4.4 111 3.9 (2018) Invariants 276 184 69.2 187 60.4 185 78.7 191 66.4 194 65.7 (code2inv) Total (no 1808 1042 70.7 1033 68.61005 80 1060 65.6 1072 64.1 grammar) Total 2296 1378 63.1 1369 61.31328 77.9 1396 59.1 1420 57.5 Fig. 4 Comparison on the 128 invertibility condition benchmarks with grammars. Execution times in miliseconds 123 19 Page 22 of 25 A. Abate et al. bigger space. This illustrates the importance of techniques like CEGIS(T ): it allows ﬁnding constants when needed, and avoid enumerating them when they are not needed. 7 Related Work The traditional view of program synthesis is that of synthesis from complete speciﬁcations [29]. Such speciﬁcations are often unavailable, difﬁcult to write, or expensive to check against using automated veriﬁcation techniques. This has led to the proposal of inductive synthesis and, more recently, of oracle-based inductive synthesis, in which the complete speciﬁcation is not available and oracles are queried to choose programs [27]. A well-known application of CEGIS is program sketching [40, 42], where the programmer uses a partial program, called a sketch, to describe the desired implementation strategy, and leaves the low-level details of the implementation to an automated synthesis procedure. Inspired by sketching, Syntax-Guided Program Synthesis (SyGuS) [3] requires the user to supplement the logical speciﬁcation provided to the program synthesiser with a syntactic template that constrains the space of solutions. In contrast to SyGuS, our aim is to improve the efﬁciency of the exploration to the point that user guidance is no longer required. Another very active area of program synthesis are component-based approaches [2, 18– 20, 23, 24, 32]. Such approaches are concerned with assembling programs from a database of existing components and make use of various techniques, from counterexample-guided synthesis [23] to type-directed search with lightweight SMT-based deduction and partial eval- uation [19] and Petri-nets [20]. The techniques developed in the current paper are applicable to any component-based synthesis approach that relies on counterexample-guided inductive synthesis. Heuristics for constant synthesis are presented in [15], where the solution language is parameterised, inducing a lattice of progressively more expressive languages. One of the parameters is word width, which allows synthesising programs with constants that satisfy the speciﬁcation for smaller word widths. Subsequently, heuristics extend the program (including the constants) to the required word width. As opposed to this work, CEGIS(T ) denotes a systematic approach that does not rely on ad-hoc heuristics. Regarding the use of SMT solvers in program synthesis, they are frequently employed as oracles. By contrast, Reynolds et al. [35] present an efﬁcient encoding able to solve program synthesis constraints directly within an SMT solver. Their approach relies on rephrasing the synthesis constraint as the problem of refuting a universally quantiﬁed formula, which can be solved using ﬁrst-order quantiﬁer instantiation. Conversely, in our approach we maintain a clear separation between the synthesiser and the theory solver, which communicate in a well-deﬁned manner. In Sect. 6, we provide a comprehensive experimental comparison with the synthesiser described in [35]. 8 Conclusion We proposed CEGIS(T ), a new approach to synthesis of programs with non-trivial constants that combines the strengths of a counterexample-guided inductive synthesiser with those of a theory solver with the aim of improving the synthesis of programs with non-trivial constants. We discussed two options for the theory solver, one based on variable elimination and one relying on an off-the-shelf SMT solver for ﬁrst-order formulas with quantiﬁers, as well as a 123 Synthesising Programs with Non-trivial Constants Page 23 of 25 19 case study on integrating CEGIS(T ) inside CVC4. Our experiments results show a signiﬁcant performance improvement on benchmarks from the SyGuS competition on benchmarks that require synthesising arbitrary constants. in this paper, we evaluated CEGIS(T ) in the context of constant elimination when the seciﬁcation belongs to linear arithmetic. Other techniques for eliminating existentially quan- tiﬁed variables can be used. For instance, one might use cylindrical algebraic decomposition [12] for speciﬁcations with non-linear arithmetic. Funding This work was partly supported by ERC project 280053 (CPROVER) and the H2020 FET OPEN 712689 SC . Cristina David is supported by the Royal Society University Research Fellowship UF160079. Data Availability The benchmarks tested during the current work are available at https://github.com/kroening/ cegis-smt-journal-paper.git. Code Availability All the code developed and used for the current work is available at www.github.com/ kroening/fastsynth.git (the code for fastsynth)and www.github.com/CVC4/CVC4.git (the code for CVC4). Declarations Conﬂict of interest Cesare Tinelli, an author of this paper, is also on the journal’s editorial board. All the other authors declare they have no conﬂict of interest. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. References 1. Abate, A., David, C., Kesseli, P., Kroening, D., Polgreen, E.: Counterexample guided inductive synthesis modulo theories. In: CAV (1), Lecture Notes in Computer Science, vol. 10981, pp. 270–288. Springer (2018) 2. Albarghouthi, A., Gulwani, S., Kincaid, Z.: Recursive program synthesis. In: CAV, LNCS, vol. 8044, pp. 934–950. Springer (2013) 3. Alur, R., Bodík, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A., Singh, R., Solar- Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: FMCAD, pp. 1–8. IEEE (2013) 4. Alur, R., Cerný, P., Radhakrishna, A.: Synthesis through uniﬁcation. In: CAV, LNCS, vol. 9207, pp. 163–179. Springer (2015) 5. Alur, R., Fisman, D., Singh, R., Solar-Lezama, A.: SyGuS-Comp 2017: Results and analysis. CoRR abs/1711.11438 (2017) 6. Alur, R., Fisman, D., Singh, R., Udupa, A.: Syntax guided synthesis competition. http://sygus.seas.upenn. edu/SyGuS-COMP2017.html (2017) 7. Alur, R., Radhakrishna, A., Udupa, A.: Scaling enumerative program synthesis via divide and conquer. In: TACAS, LNCS, vol. 10205, pp. 319–336 (2017) 8. Barbosa, H., Reynolds, A., Larraz, D., Tinelli, C.: Extending enumerative function synthesis via SMT- driven classiﬁcation. In: Barrett, C.W., Yang, J. (eds.) Formal Methods In Computer-Aided Design (FMCAD), pp. 212–220. IEEE (2019). https://doi.org/10.23919/FMCAD.2019.8894267 9. Barrett, C., Shikanian, I., Tinelli, C.: An abstract decision procedure for a theory of inductive data types. JSAT 3(1–2), 21–46 (2007) 10. Bik, A.J.C., Wijshoff, H.A.G.: Implementation of Fourier-Motzkin elimination. Rijksuniversiteit Leiden, Tech. rep. (1994) 123 19 Page 24 of 25 A. Abate et al. 11. Clarke, E., Kroening, D., Yorav, K.: Behavioral consistency of C and Verilog programs using bounded model checking. In: Design Automation Conference, DAC ’03, pp. 368–371. ACM, New York, NY, USA (2003). https://doi.org/10.1145/775832.775928 12. Collins, G.E.: Hauptvortrag: Quantiﬁer elimination for real closed ﬁelds by cylindrical algebraic decom- position. In: Automata Theory and Formal Languages, LNCS, vol. 33, pp. 134–183. Springer (1975) 13. Cousot, P., Cousot, R.: Abstract interpretation: A uniﬁed lattice model for static analysis of programs by construction or approximation of ﬁxpoints. In: Principles of Programming Languages, POPL, pp. 238–252 (1977). https://doi.org/10.1145/512950.512973 14. David, C., Kesseli, P., Kroening, D., Lewis, M.: Danger invariants. In: Formal Methods (FM), LNCS, vol. 9995, pp. 182–198. Springer (2016) 15. David, C., Kroening, D., Lewis, M.: Using program synthesis for program analysis. In: LPAR, LNCS, vol. 9450, pp. 483–498. Springer (2015) 16. de Moura, L.M., Bjørner, N.: Z3: an efﬁcient SMT solver. In: TACAS, LNCS, vol. 4963, pp. 337–340. Springer (2008) 17. Eén, N., Sörensson, N.: An extensible SAT-solver. In: SAT, LNCS, vol. 2919, pp. 502–518. Springer (2003) 18. Feng, Y., Bastani, O., Martins, R., Dillig, I., Anand, S.: Automated synthesis of semantic malware signa- tures using maximum satisﬁability. In: NDSS. The Internet Society (2017) 19. Feng, Y., Martins, R., Geffen, J.V., Dillig, I., Chaudhuri, S.: Component-based synthesis of table consol- idation and transformation tasks from examples. In: PLDI, pp. 422–436. ACM (2017) 20. Feng, Y., Martins, R., Wang, Y., Dillig, I., Reps, T.W.: Component-based synthesis for complex APIs. In: POPL, pp. 599–612. ACM (2017) 21. Floyd, R.W.: Assigning Meanings to Programs, pp. 65–81. Springer, Dordrecht (1993). https://doi.org/ 10.1007/978-94-011-1793-7_4 22. Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): fast decision procedures. In: CAV, LNCS, vol. 3114, pp. 175–188. Springer (2004) 23. Gulwani, S., Jha, S., Tiwari, A., Venkatesan, R.: Synthesis of loop-free programs. In: PLDI, pp. 62–73. ACM (2011) 24. Gulwani, S., Korthikanti, V.A., Tiwari, A.: Synthesizing geometry constructions. In: PLDI, pp. 50–61. ACM (2011) 25. Gulwani, S., Polozov, O., Singh, R.: Program synthesis. Found. Trends Program. Lang. 4(1–2), 1–119 (2017). https://doi.org/10.1561/2500000010 26. Gupta, A., Henzinger, T.A., Majumdar, R., Rybalchenko, A., Xu, R.: Proving non-termination. In: Pro- ceedings of the 35th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, pp. 147–158 (2008). https://doi.org/10.1145/1328438.1328459 27. Jha, S., Gulwani, S., Seshia, S.A., Tiwari, A.: Oracle-guided component-based program synthesis. In: ICSE (1), pp. 215–224. ACM (2010) 28. Kroening, D., Strichman, O.: Decision Procedures: An Algorithmic Point of View, 1st edn. Springer (2008) 29. Manna, Z., Waldinger, R.: A deductive approach to program synthesis. In: IJCAI, pp. 542–551. William Kaufmann (1979) 30. Niemetz, A., Preiner, M., Reynolds, A., Barrett, C., Tinelli, C.: Solving quantiﬁed bit-vectors using invertibility conditions. In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Veriﬁcation (CAV), Part II, Lecture Notes in Computer Science, vol. 10982, pp. 236–255. Springer (2018) 31. Nieuwenhuis, R., Oliveras, A., Tinelli, C.: Solving SAT and SAT modulo theories: From an abstract Davis-Putnam-Logemann-Loveland procedure to DPLL(T). J. ACM 53(6), 937–977 (2006) 32. Perelman, D., Gulwani, S., Grossman, D., Provost, P.: Test-driven synthesis. In: PLDI, pp. 408–418. ACM (2014) 33. Raghothaman, M., Udupa, A.: Language to specify syntax-guided synthesis problems. CoRR abs/1405.5590 (2014) 34. Reynolds, A., Barbosa, H., Nötzli, A., Barrett, C., Tinelli, C.: cvc4sy: Smart and fast term enumeration for syntax-guided synthesis. In: Dillig, I., Tasiran, S. (eds.) Computer Aided Veriﬁcation (CAV), Part II. Lecture Notes in Computer Science, vol. 11562, pp. 74–83. Springer, Cham (2019) 35. Reynolds, A., Deters, M., Kuncak, V., Tinelli, C., Barrett, C.W.: Counterexample-guided quantiﬁer instan- tiation for synthesis in SMT. In: CAV (2), LNCS, vol. 9207, pp. 198–216. Springer (2015) 36. Reynolds, A., King, T., Kuncak, V.: Solving quantiﬁed linear arithmetic by counterexample-guided instan- tiation. Formal Methods Syst. Des. (2017). https://doi.org/10.1007/s10703-017-0290-y 37. Reynolds, A., Viswanathan, A., Barbosa, H., Tinelli, C., Barrett, C.: Datatypes with shared selectors. In: D. Galmiche, S. Schulz, R. Sebastiani (eds.) International Joint Conference on Automated Reasoning (IJCAR), Lecture Notes in Computer Science, vol. 10900, pp. 591–608. Springer (2018) 123 Synthesising Programs with Non-trivial Constants Page 25 of 25 19 38. Rosen, E.: An existential fragment of second order logic. Arch. Math. Log. 38(4–5), 217–234 (1999) 39. Silva, J.P.M., Lynce, I., Malik, S.: Conﬂict-driven clause learning SAT solvers. In: Handbook of Satisﬁ- ability, pp. 131–153 (2009). https://doi.org/10.3233/978-1-58603-929-5-131 40. Solar-Lezama, A.: Program sketching. STTT 15(5–6), 475–495 (2013) 41. Solar-Lezama, A., Rabbah, R.M., Bodík, R., Ebcioglu, K.: Programming by sketching for bit-streaming programs. In: PLDI, pp. 281–294. ACM (2005) 42. Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for ﬁnite programs. In: ASPLOS, pp. 404–415. ACM (2006) 43. Strichman, O.: On solving Presburger and linear arithmetic with SAT. In: FMCAD, LNCS, vol. 2517, pp. 160–170. Springer (2002) 44. SyGuS: Syntax-guided synthesis competition. http://www.sygus.org/. Accessed 14 Oct 2019 Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional afﬁliations. Authors and Aﬃliations 1 2 3 4 Alessandro Abate · Haniel Barbosa · Clark Barrett · Cristina David · 5 6 7 8 Pascal Kesseli · Daniel Kroening · Elizabeth Polgreen · Andrew Reynolds · Cesare Tinelli Alessandro Abate alessandro.abate@cs.ox.ac.uk Haniel Barbosa hbarbosa@dcc.ufmg.br Clark Barrett barrett@cs.stanford.edu Pascal Kesseli kesseli.pascal@gmail.com Daniel Kroening daniel.kroening@magd.ox.ac.uk Elizabeth Polgreen elizabeth.polgreen@ed.ac.uk Andrew Reynolds andrew.j.reynolds@gmail.com Cesare Tinelli cesare-tinelli@uiowa.edu University of Oxford, Oxford, UK Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil Stanford University, Stanford, USA University of Bristol, Bristol, UK Lacework Ltd, Mountain View, CA, UK Amazon Inc., Oxford, UK University of Edinburgh, Edinburgh, UK The University of Iowa, Iowa City, USA
Journal of Automated Reasoning – Springer Journals
Published: Jun 1, 2023
Keywords: Program synthesis; Automated reasoning; Satisfiability modulo theories; Counterexample guided inductive synthesis
You can share this free article with as many people as you like with the url below! We hope you enjoy this feature!
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.