
Algorithms & Complexity Theory

Matei Popovici
Computer Science Department, POLITEHNICA University of Bucharest

January 9, 2015

 

0.1 Introduction

0.1.1 The Apollo example

The story. On the 11th of April 1970, the Apollo 13 mission takes off, with the intention of landing on the Moon. Two days into the mission, an explosion in the Service Module forces the crew to abort the mission and attempt a return to Earth. Facing extreme hardship, especially due to loss of power, the crew finally succeeds in returning to Earth. Both the crew and the support team from the Space Center were forced to find fast and simple solutions to apparently insurmountable problems.¹

Drawing inspiration from this scenario, we consider the following inspirational example. The space mission S takes off, having the Moon as destination. Some time after take-off, S notices a malfunction in the main engine, which fails to respond to command inputs. Trying to hack-control the engine, the following circuit is identified:

    ϕ ≡ (¬A ∨ B ∨ D) ∧ (¬B ∨ C ∨ D) ∧ (A ∨ ¬C ∨ D) ∧ (A ∨ ¬B ∨ ¬D) ∧ (B ∨ ¬C ∨ ¬D) ∧ (¬A ∨ C ∨ ¬D) ∧ (A ∨ B ∨ C) ∧ (¬A ∨ ¬B ∨ ¬C)

If ϕ can be made to output 1, then the engine could be manually started. S however notices that no apparent input for A, B, C and D makes the circuit yield 1. S requires advice from Control, as to how to proceed in this situation. The solution which Control must prompt is dependent on some key "background" information:

(i) The position of S allows only a 5-minute window for a decision to be made. After 5 minutes, a safe mission abort is no longer possible.²

(ii) Trying to find the solution "by hand" is unfeasible and likely to produce errors.

1. One example is the famous improvisation which made the Command Module's square CO2 filters operable in the Lunar Module, which required such filters of round shape.
2. This is highly similar to the real situation of Apollo 13: their position and malfunctions made a direct abort and return to Earth impossible. Instead, their return trajectory involved entering Moon orbit.

 

 


(iii) The actual problem to be solved consists in finding a certain "input" I, which assigns to each of A . . . D a value from {0, 1}, such that the circuit ϕ yields 1 (which we denote as ϕ(I) = 1). We call this problem S(ϕ).

(iv) To solve S, one needs to find an algorithm for computing ϕ(I), given an input I.

Computing ϕ(I). It is helpful to examine the structure of ϕ first, which shows a certain regularity. ϕ is of the form:

    (L_1^1 ∨ . . . ∨ L_{n_1}^1) ∧ . . . ∧ (L_1^k ∨ . . . ∨ L_{n_k}^k)

where each (L_1^i ∨ . . . ∨ L_{n_i}^i) is a clause consisting of n_i literals, and each literal L_j^i is of the form X or ¬X, where X is a variable (input).

Next, we identify a suitable representation from which our algorithm can benefit. Assume I is represented as a vector which, for convenience, is indexed by variable names instead of 0 . . . n − 1 (n is the number of variables). Assume ϕ is represented as a matrix, where each column ϕ[i] represents a vector of literals. The value ϕ[i, X] = 1 indicates that X appears as itself in clause i, ϕ[i, X] = 0 indicates that X appears in a literal ¬X, while ϕ[i, X] = ⊥ indicates that X is not present in clause ϕ[i]. The algorithm to solve our problem, entitled CHECK(ϕ, I), proceeds as follows:







a) set v = 0
b) for i = 1, k:
       reset v = 0
       for each variable X from I: set v = v + (ϕ[i, X] ⊕ I[X])
       if v = 0, stop and return 0; otherwise, continue;
c) return 1





The operation ⊕ is a customized XNOR, which behaves as follows: a ⊕ b is always 0 if either of a or b is ⊥. Hence, if a variable X does not appear in a clause i, then ϕ[i, X] = ⊥ will not influence the value of v. a ⊕ b is 1 if both operands are the same. Hence ϕ[i, X] ⊕ I[X] = 1 if X occurs as itself in clause i and is given value 1 in I, or if X occurs as ¬X in clause i and is given value 0 in I.

 

At the end of the inner loop, CHECK will have computed the value of some clause (L_1^i ∨ . . . ∨ L_{n_i}^i), which is 1 if at least some literal is 1. If some clause has the computed value 0, then ϕ is 0 for the input I. Finally, note that CHECK performs at most n · k computations, where n is the number of variables, and k is the number of clauses in ϕ.
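To make the pseudocode concrete, here is a minimal runnable sketch of CHECK in Python, under our own (hypothetical) encoding: each clause ϕ[i] is a dict mapping a variable name to 1 (the variable occurs as itself) or 0 (it occurs negated), and absent keys play the role of ⊥; the names xnor and check are ours, not part of the original pseudocode.

    def xnor(a, b):
        # Customized XNOR: a "not present" (None) operand forces 0;
        # otherwise the result is 1 iff both operands are equal.
        if a is None or b is None:
            return 0
        return 1 if a == b else 0

    def check(phi, assignment):
        for clause in phi:                       # one pass per clause
            v = 0
            for x, value in assignment.items():
                v += xnor(clause.get(x), value)  # clause.get(x) is None if X absent
            if v == 0:                           # no literal satisfies this clause
                return 0
        return 1

    # Example: the single clause (not-A or B or D) is satisfied when A = 0:
    print(check([{'A': 0, 'B': 1, 'D': 1}], {'A': 0, 'B': 0, 'C': 0, 'D': 0}))  # 1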





Computing S(ϕ). To solve S, it is sufficient to take all possible inputs I over the variables which occur in ϕ, and for each one, perform CHECK(ϕ, I). This can be achieved by viewing the vector I as a binary counter. Hence, the operation I++ is implemented as follows, where the variables X are iterated in some arbitrary order, fixed with respect to the algorithm:

a) for each variable X in I: if I[X] = 1, make I[X] = 0 and continue; otherwise make I[X] = 1 and stop;
b) overflow;

The operation I++ is said to overflow if instruction b) is reached, namely if all variables are set to 0 upon a traversal of all variables. Now, we implement FIND(ϕ):

a) set I to be the input where all variables are set to 0;
b) while overflow was not signalled: if CHECK(ϕ, I) = 1 then return 1, otherwise I++;
c) return 0

We now have a procedure for solving S, one which Control may use in order to assist the Mission. However, several issues arise:

(V) is FIND(ϕ) correct, that is, bug-free and returning an appropriate result?
(M) how long does FIND(ϕ) take to run, with respect to the size of the input (i.e. the size of the matrix ϕ)?
(C) is FIND(ϕ) an optimal algorithm? Are there ones which perform better?
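A sketch of FIND on top of check() above, with the same hypothetical encoding; increment plays the role of I++ and reports overflow:

    def increment(assignment, order):
        # Binary-counter increment over a fixed variable order;
        # returns True when the counter overflows (wraps back to all zeros).
        for x in order:
            if assignment[x] == 1:
                assignment[x] = 0        # carry propagates to the next variable
            else:
                assignment[x] = 1
                return False
        return True

    def find(phi, variables):
        assignment = {x: 0 for x in variables}   # start from the all-zero input
        while True:
            if check(phi, assignment) == 1:
                return 1
            if increment(assignment, variables):
                return 0                 # all 2^n assignments have been tried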

 

 


Next, we discuss each issue in more detail:

(V) denotes the Verification problem. Verification of computer programs (that is, of algorithm implementations) is essential nowadays, as we rely more and more on software to perform critical tasks: assist us in driving cars and flying planes, guiding missiles and performing surgeries. Verification is equally important for the mission.

(M) identifies a measurement problem, namely, in what way critical resources are consumed by an algorithm. Traditionally, these resources are time, i.e. the number of CPU operations, and space, i.e. computer memory. We note that, for our given formula, CHECK(ϕ, I) will run at most 8 · 4 = 32 CPU operations, where 8 is the number of clauses, and 4 is the number of variables of any input. FIND(ϕ) will build at most 2^n assignments I, where n is the number of variables. In our example, n = 4, thus we have 16 different ways of assigning A, B, C and D to values {0, 1}. For each, we run CHECK(ϕ, I), yielding a total of 16 · 32 = 512 CPU operations. For the sake of our example, let us assume Mission's computers ran one CPU instruction per second, which is not unlikely, given the hardware available in the '70s. Thus, in 5 minutes, we can perform 5 · 60 = 300 CPU operations! Note that FIND(ϕ) does not finish in sufficient time: if Control used this algorithm, it would waste Mission's time.

Finally, (C) designates a complexity problem, one specific to Complexity Theory: "Do efficient algorithms exist, for a given problem?". There are also sub-questions which spawn from the former:







• is there a better encoding of the input and the variables from I, which makes the algorithm run faster?
• is there a certain machine which allows performing computations in a more efficient (fast) way?

Unfortunately, the problem which we denoted by S is traditionally called SAT: the Satisfiability problem for boolean formulae. It has been known that efficient algorithms to solve SAT are unlikely to exist, no matter the machine of choice. Translated into simple terms, in the general case, we can't get substantially better than an exponential number of steps, with respect to the formula input. If Control had this knowledge, it would take no trouble in looking at specific algorithms. It would recognize the problem to be impossible to solve in the given time constraints, and would recommend Mission to return home. To recap, (V), (M) and (C) play an important role in decision making, both at the software level and beyond. Hard problems naturally (re)occur

 

in every field of science. Having at least a minimal knowledge of their nature is a critical step in attempting to solve them.

Disclaimer. SAT is probably one of the most studied problems in Computer Science. SAT solvers are key components of many algorithms which solve even harder problems: program verification, code optimisation, cryptography, only to name a few. While reading this, it is highly probable that the reader employs some technology which relies on SAT solvers, directly or indirectly. Are these solvers efficient, in the sense that they run in less than exponential time? For the general case, the answer is no. However, these solvers will run fast enough for most formulae. This doesn't really help Control, unless they are lucky enough to stumble upon a "nice" formula. The intuition behind SAT solvers is that they rely on an efficient graph-like structure to represent a formula. This structure is called an OBDD (Ordered Binary Decision Diagram). The efficiency of an OBDD is unfortunately dependent on finding a "suitable" ordering for the variables. The latter, in itself, is a hard problem. However, there are good heuristics which come close. The overall result is that SAT solvers perform well in "many" cases, and the "many" is measurable with controlled precision.

0.1.2 Which is harder?

Consider the following problems:

A telecommunications company T owns radios across the country, and needs to monitor all links, to record performance. Monitoring each radio is expensive, thus T would like a minimal number of k radios to be monitored. Is this possible?

An organism consists of a number of cells, and connections between cells. Some cells are good, while some are bad. The bad cells can be detected by examining their links: they form a complete mesh, that is, each cell is connected to all others. Are there (k) bad cells in the organism?

It is easy to notice that both problems can be cast into graph problems. Let G = (V, E) be an undirected graph. If we interpret each node as a radio, and each edge as a radio link, then solving the former problem boils down to finding a subset S ⊆ V such that, for all (a, b) ∈ E, at least one of the following holds: a ∈ S, b ∈ S. Hence, S "covers" all edges from E.


If we interpret nodes as cells, and edges as connections between cells, then the latter problem consists in finding a subset S ⊆ V (of size |S| = k) such that (a, b) ∈ E for each a, b ∈ S. Hence S is a clique (of size k).

One interesting question which may be raised is whether one problem is harder than the other. We note that a graph capturing the first problem can be transformed into one capturing the latter, and vice-versa. If we start from the "telecommunications" graph, we can build a "cell" graph by: (i) creating one cell per radio; (ii) if two radios do not share a link, then the corresponding cells will share a connection. Thus, if some subset S of radios covers all links, then all the cells corresponding to radios outside S must share connections, hence they are bad. We note that the transformation can be easily adjusted to work in the other direction. Thus, if one has an algorithm to solve the telecommunications problem, then, via a transformation, we can solve the cell problem, and vice-versa. This observation highlights several issues:

• If such a transformation exists between two problems, what does it say about the complexity of solving the problems?
• Is it always the case that transformations are bijective?
• What are the properties of the transformation, such that it is effective (can be used for problem solving)? For instance, if some problem A can be transformed to B in exponential time³, and A can be solved in polynomial time, then solving B via A yields an algorithm in exponential time.
• Can (appropriate) transformations be used to characterize problems which are equally hard?

In the previous section, we illustrated a problem which can be solved in exponential time. We claimed, without arguing, that this problem cannot be solved in polynomial time. Given some problem P, if we can appropriately transform P into the former problem, we can also argue that P cannot be solved in polynomial time. If this were possible, an algorithm for P could also solve our former problem. Developing a formal methodology from the above intuition is a critical tool for assessing problem hardness.

3. with respect to the size of the input
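The radios-to-cells transformation described above is just graph complementation. A minimal sketch, under our own encoding of graphs (a list of nodes and a set of frozenset edges):

    def complement_graph(nodes, edges):
        # Two radios WITHOUT a link become two cells WITH a connection.
        return {frozenset((a, b))
                for i, a in enumerate(nodes) for b in nodes[i + 1:]
                if frozenset((a, b)) not in edges}

If some subset S of the nodes covers every edge, the nodes outside S are pairwise non-adjacent, hence they form a clique in the complement graph.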

 

0.2 Computability Theory

0.2.1 Problems and problem instances

In the previous section, we have illustrated the problem SAT, as well as pseudocode which describes a solution in exponential time. We have seen that such a solution is infeasible in practice, and also that no (predictable) technological advance can help. The main question we asked (and also answered) is whether there exists a faster procedure to solve SAT. (We have conjectured that the answer is no, or, more precisely, not likely.) To generalize a little, we are interested in the following question: Can a given problem Q be solved in "efficient" time? For now, we go around the currently absent definition for "efficient", and note that such a question spawns another (which is actually more straightforward to ask): Can a given problem Q be "solved" at all? In this chapter, we shall focus on answering this question first, and to do so, we need to settle the following issues:

• What exactly is a "problem"?
• What does it mean to "solve" a problem?

Definition 0.2.1.1 (Abstract problem, problem instance) A problem instance is a mathematical object of which we ask a question and expect an answer. An (abstract) problem is a mapping P : I → O where I is a set of problem instances of which we ask the same question, and O is a set of answers. P assigns to each problem instance i ∈ I the answer P(i) ∈ O.

It is often the case that the answers we seek are also mathematical objects. For instance, the vector sorting problem must be answered by a sorted vector. However, many other problems prompt yes/no answers. Whenever O = {0, 1} we say that P is a decision problem. Many other problems can be cast into decision problems. The vector sorting problem may be seen as a decision problem, if we simply ask whether the problem instance (i.e. the vector) is sorted. The original sorting problem and its decision counterpart may not seem equivalent in terms of hardness. For instance, sorting

 


is solved in polynomial time n log n (using "standard" algorithms), while deciding whether a vector is sorted can be done in linear time. We shall see that, from the point of view of Complexity Theory, both problems are equally hard.

Definition 0.2.1.1 may seem abstract and unusable for the following reason: the set I is hard to characterize. One solution may be to assign types to problem instances. For example, graph may be a problem instance type. However, such a choice forces us to reason about problems separately, based on the type of their problem instances. Also, types themselves form an infinite set, which is also difficult to characterize. Another approach is to level out problem instances, starting from the following key observations: (i) each i ∈ I must be, in some sense, finite. For instance, vectors have a finite length, hence a finite number of elements. Graphs (of which we ask our questions) also have a finite set of nodes, hence a finite set of edges, etc. (ii) I must be countable (but not necessarily finite). For instance, the problem P : R × R → {0, 1}, where P(x, y) returns 1 iff x and y are equal, makes no sense from the point of view of computability theory. Assume we would like to answer P(π, √2). Simply storing π and √2, which takes infinite space, is impossible on machines, and also takes us back to point (i). These observations suggest that problem instances can be represented via a finite encoding, which may be assumed to be uniform over all possible mathematical objects we consider.

Definition 0.2.1.2 (Encoding problem instances) Let Σ be a finite set which we call alphabet. A one-letter word is a member of Σ. A two-letter word is any member of Σ × Σ = Σ². For instance, if Σ = {a, b, . . .}, then (a, a) ∈ Σ² is a two-letter word. An i-letter word is a member of Σⁱ. We denote by:

    Σ* = {ε} ∪ Σ ∪ Σ² ∪ . . . ∪ Σⁱ ∪ . . .

the set of finite words which can be formed over Σ. ε is a special word which we call the empty word. Instead of writing, e.g. (a, b, b, a, a) for a 5-letter word, we simply write abbaa. Concatenation of two words is defined as usual.

Remark 0.2.1.1 We shall henceforth consider that a problem P : I → O has the following property: if I is infinite, then I ≃ Σ* (I is isomorphic with Σ*). Thus, each problem instance i can be represented as a finite word enc(i) ∈ Σ*, for some Σ.

We shall postpone, for now, the question of choosing the appropriate Σ for our problem (the above remark simply states that such a Σ must exist).

 

Definition 0.2.1.2 and Remark 0.2.1.1 can be easily recognized in practice. A programmer always employs the same predefined mechanisms of his programming language (the available datatypes) to represent his program inputs. Moreover, these objects ultimately become streams of bits when they are actually processed by the machine. Making one step further, we can observe the following property of alphabets, which conforms with (ii):

Proposition 0.2.1.1 For any finite Σ, Σ* is countably infinite.

Proof: We show Σ* ≃ N. We build a bijective function h which assigns to each word a unique natural. We assign 0 to ε. Assume |Σ| = n. We assign to each one-letter word the numbers from 1 to n. Next, we assign to each word w = w′x with k ≥ 2 letters the number h(w) = n · h(w′) + h(x). If n = 2, we easily recognise that each binary word is assigned its natural equivalent. □
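The bijection h can be sketched directly in Python (the helper names are ours):

    def h(word, sigma):
        # Letters of sigma get ranks 1..n; h(empty word) = 0;
        # h(w'x) = n * h(w') + h(x).
        n = len(sigma)
        rank = {c: i + 1 for i, c in enumerate(sigma)}
        value = 0
        for c in word:
            value = n * value + rank[c]
        return value

    # Over sigma = "01": h("") = 0, h("0") = 1, h("1") = 2, h("00") = 3, ...
    # every word receives a distinct natural number, with no gaps.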

Hence, we have the following diagram:

    i ∈ I ↔ enc(i) ∈ Σ* ↔ h(enc(i)) ∈ N

Consequently, we can view a problem instance as a natural number, without losing the ability to uniquely identify the instance at hand. Thus:

Definition 0.2.1.3 (Problem) A problem is a function f : N → N. If some n encodes a problem input, then f(n) encodes its answer. A decision problem is a function f : N → {0, 1}.

To conclude: when trying to solve concrete problems, the encoding issue is fundamental, and it is dependent on the type of problem instances we tackle. From the perspective of Computability Theory, which deals with problems in general, the encoding is unessential, and can be abstracted without "loss of information" by a natural number.

0.2.2 Algorithms as Turing Machines

Algorithms are usually described as pseudo-code, and intended as abstractions over concrete programming language operations. The level of abstraction is usually not rigorously specified, and is decided in an ad-hoc manner by the writer. From the author's experience, pseudo-code is often dependent on (some future) implementation, and only abstracts from language syntax, possibly including data initialization and subsequent handling. Thus, some pseudo-code can be easily implemented in different languages only to the

 


extent to which the languages are the same, or at least follow the same programming principles.

The above observation is not intended as a criticism towards pseudocode and pseudocode writing. It is indeed difficult, for instance, to write pseudocode which does not seem vague, and which can be naturally implemented in an imperative language (using assignments and iterations) as well as in a purely functional language (where iterations are possible only through recursion).

As before, we require a means for leveling out different programming styles and programming languages, in order to come up with a uniform, straightforward and simple definition for an algorithm. The key observation here is that programming languages, especially the newest and most popular, are quite restrictive w.r.t. what the programmer can do. This may seem counter-intuitive at first. Consider typed languages, for instance. Enforcing each variable to have a type is obviously a restriction, and has a definite purpose: it helps the programmer write cleaner code, which is less likely to crash at runtime. However, this issue is irrelevant from the point of view of Computability Theory. If we try to search for less restrictive languages, we find the assembly languages. The restrictions are minimal here (as well as the "programming structure"). The formal definition for an algorithm which we propose can be seen as an abstract assembly language, where all technical aspects are put aside. We call such a language the Turing Machine.

Definition 0.2.2.1 (Deterministic Turing Machine) A Deterministic Turing Machine (abbreviated DTM) is a tuple M = (K, F, Σ, δ, s0) where:

• Σ = {a, b, c, . . .} is a finite set of symbols which we call alphabet;
• K is a set of states, and F ⊆ K is a set of accepting/final states;
• δ : K × Σ → K × Σ × {L, H, R} is a transition function which assigns to each state s ∈ K and c ∈ Σ the triple δ(s, c) = (s′, c′, pos);
• s0 ∈ K is an initial state.

The Turing Machine has a tape which contains infinitely many cells in both directions, and on each tape cell we have a symbol from Σ. The Turing Machine has a tape head, which is able to read the symbol from the current cell. Also, the Turing Machine is always in a given state. Initially (before the machine has started) the state is s0. From a given state s, the Turing Machine reads the symbol c from the current cell, and performs a transition. The transition is given by δ(s, c) = (s′, c′, pos). Performing the transition means that the TM moves from state s to s′, overrides the symbol c with c′ on the tape cell and: (i) if pos = L, moves the tape head on the next cell to the left, (ii) if pos = R, moves the tape head on the next cell to the right and (iii) if pos = H, leaves the tape head on the current cell.

The Turing Machine will perform transitions according to δ. Whenever the TM reaches an accepting/final state, we say it halts. If a TM reaches a non-accepting state where no other transitions are possible, we say it clings/hangs.

• the input of a Turing Machine is a finite word which is contained in its otherwise empty tape;
• the output of a TM is the contents of the tape (not including empty cells) after the Machine has halted. We also write M(w) to refer to the output of M, given input w.

Example 0.2.2.1 (Turing Machine) Consider the alphabet Σ = {#, >, 0, 1}, the set of states K = {s0, s1, s2}, the set of final states F = {s2} and the transition function:

    δ(s0, 0) = (s0, 0, R)    δ(s0, 1) = (s0, 1, R)
    δ(s0, #) = (s1, #, L)    δ(s1, 1) = (s1, 0, L)
    δ(s1, 0) = (s2, 1, H)    δ(s1, >) = (s2, 1, H)

The Turing Machine M = (K, F, Σ, δ, s0) reads a number encoded in binary on the tape, and increments it by 1. The symbol # encodes the empty cell.⁴ Initially, the tape head is positioned at the most significant bit of the number. The Machine first goes over all bits, from left to right. When the first empty cell is detected, the machine goes into state s1, and starts flipping 1s to 0s, until the first 0 (or the initial position, marked by >) is detected. Finally, the machine places 1 on this current cell, and enters its final state.

The behaviour of the transition function can be more intuitively represented as in Figure 2.2.1. Each node represents a state, and each edge, a transition. The label on each edge is of the form c/c′, pos, where c is the symbol read from the current tape cell, c′ is the symbol written on the current tape cell and pos is a tape head position. The label should be read as: the machine replaces c with c′ on the current tape cell and moves in the direction indicated by pos.

4. We shall use # to refer to the empty cell throughout the text.

 

[Figure: the state diagram of the machine. An arrow marked "start" points to s0; s0 loops on c/c, R for c ∈ {0, 1}; an edge s0 → s1 is labelled #/#, L; s1 loops on 1/0, L; two edges s1 → s2 are labelled 0/1, H and >/1, H.]

Figure 2.2.1: The binary increment Turing Machine

Let us consider that, initially, on the tape we have >0111 (the representation of the number 7). The evolution of the tape is shown below. Each line shows the TM configuration at step i, that is, the tape and current state after transition i. For convenience, we have chosen to show two empty cells in each direction only. Also, the bracketed cell indicates the position of the tape head.

    Transition no    Tape            Current state
    0                ##>[0]111##     s0
    1                ##>0[1]11##     s0
    2                ##>01[1]1##     s0
    3                ##>011[1]##     s0
    4                ##>0111[#]#     s0
    5                ##>011[1]##     s1
    6                ##>01[1]0##     s1
    7                ##>0[1]00##     s1
    8                ##>[0]000##     s1
    9                ##>[1]000##     s2
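As a companion to the example, here is a minimal Python simulator for such a machine; the encoding (a dict for δ, integer head positions) is our own choice, not part of the definition:

    from collections import defaultdict

    def run_tm(delta, state, finals, tape, head):
        # delta maps (state, symbol) -> (state', symbol', move), move in L/H/R.
        cells = defaultdict(lambda: '#', enumerate(tape))  # blank cells hold '#'
        while state not in finals:
            key = (state, cells[head])
            if key not in delta:
                return None                    # the machine clings/hangs
            state, cells[head], move = delta[key]
            head += {'L': -1, 'H': 0, 'R': 1}[move]
        return ''.join(cells[i] for i in sorted(cells)).strip('#')

    delta = {('s0', '0'): ('s0', '0', 'R'), ('s0', '1'): ('s0', '1', 'R'),
             ('s0', '#'): ('s1', '#', 'L'), ('s1', '1'): ('s1', '0', 'L'),
             ('s1', '0'): ('s2', '1', 'H'), ('s1', '>'): ('s2', '1', 'H')}

    print(run_tm(delta, 's0', {'s2'}, '>0111', head=1))  # prints >1000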

In order to better understand the Turing Machine, it is useful to establish some similarities with, e.g., assembly languages. As specified in Definition 0.2.2.1, a Turing Machine M specifies a clearly defined behaviour, which is actually captured by δ. Thus, M is quite similar to a specific program, performing a definite task. If programs (algorithms) are abstracted by Turing Machines, then what is the abstraction for the programming language? The answer is, again, the Turing Machine. This implies that a Turing Machine acting as a programming language can be fed another Turing Machine acting as a program, and execute it. In the following Proposition, we show how Turing Machines can be encoded as words:

Proposition 0.2.2.1 (TMs as words) Any Turing Machine M = (K, F, Σ, δ, s0) can be encoded as a word over Σ. We write enc(M) to refer to this

 

word.

Proof: (sketch) Intuitively, we encode states and positions as integers n ∈ N, transitions as pairs of integers, etc., and subsequently "convert" each integer to its word counterpart in Σ*, cf. Proposition 0.2.1.1.

Let NonFin = K \ (F ∪ {s0}) be the set of non-final states, excluding the initial one. We encode each state in NonFin as an integer in {1, 2, . . . , |NonFin|} and each final state as an integer in {|NonFin| + 1, . . . , |NonFin| + |F|}. We encode the initial state s0 as |NonFin| + |F| + 1, and L, H, R as |NonFin| + |F| + i with i ∈ {2, 3, 4}. Each integer from the above is represented as a word using ⌈log_{|Σ|}(|NonFin| + |F| + 4)⌉ symbols. Each transition δ(s, c) = (s′, c′, pos) is encoded as:

    enc(s)#c#enc(s′)#c′#enc(pos)

where enc(·) is the encoding described above. The entire δ is encoded as a sequence of encoded transitions, separated by #. The encoding of M is:

    enc(M) = enc(|NonFin|)#enc(|F|)#enc(δ) □

Thus, enc(M) is a word, which can be fed to another Turing Machine. The latter should have the ability to execute (or to simulate) M. This is indeed possible:

Proposition 0.2.2.2 (The Universal Turing Machine) There exists a TM U which, for any TM M and every word w ∈ Σ*, takes enc(M) and w as input and outputs 1 whenever M(w) = 1 and 0 whenever M(w) = 0. We call U the Universal Turing Machine, and say that U simulates M.



Proof: Let M be a TM and w = c1c2 . . . cn be a word built from the alphabet of M. We build the Universal Turing Machine U as follows:

• The input of U is enc(M)#enc(s0)#c1#c2 . . . cn. Note that enc(s0) encodes the initial state of M while c1 is the first symbol from w. The portion of the tape enc(s0)#c1#c2 . . . cn will be used to mark the current configuration of M, namely the current state of M (initially s0), the contents of M's tape, and M's current head position. More generally, this portion of the tape is of the form enc(si)#u#v, with u, v ∈ Σ* and si being the current state of M. The last symbol of u marks the current symbol, while v is the word to the right of the head. Initially, the current symbol is the first one, namely c1.

 


• U will scan the current state of M (initially s0), then it will move to the current symbol from w, and finally will move to the portion of enc(M) where transitions are encoded. Once a valid transition is found, it will execute it:
  1. U will replace the recorded current state with the new one, according to the transition;
  2. U will overwrite the current symbol in w, according to the transition;
  3. U will move the current-symbol marker, according to pos from the transition;
• U will repeat this process until an accepting state of M is detected, or until no transition can be performed. □

Propositions 0.2.2.2 and 0.2.2.1 show that TMs have the capability to characterize both algorithms, as well as the computational framework to execute them. One question remains: what can TMs actually compute? Can they be used to sort vectors, solve SAT, etc.? The answer, which is positive, is given by the following hypothesis:

Conjecture 0.2.2.1 (Church-Turing) Any problem which can be solved with the Turing Machine is "universally solvable".

The term "universally solvable" cannot be given a precise mathematical definition. We only know solvability w.r.t. concrete means, e.g. computers and programming languages, etc. It can be (and has been) shown that the Turing Machine can solve any problem which known programming languages can solve.⁵ The Turing Machine, in itself, describes a model of computation based on side-effects: each transition may modify the tape in some way. Computation can be described differently, for instance as function application, or as term rewriting. However, all other known computational models are equivalent to the Turing Machine, in the sense that they solve precisely the same problems. This observation prompted the aforementioned conjecture. It is strongly believed to hold (as evidence suggests), but it cannot be formally proved.

5. To be fair to the TM, one would formulate this statement as: "all programming languages are Turing-complete, i.e. they can solve everything the TM can solve".

 

0.2.3 Decidability

The existence of the Universal Turing Machine U inevitably leads to interesting questions. Assume M is a Turing Machine and w is a word. We use the convention that enc(M) ∗ w represents the input of U: U expects the encoding of a TM, followed by the special separator symbol ∗, and then M's input w.

(?) Does U halt for all inputs?

If the answer is positive, then U can be used to tell whether any machine halts, for a given input. We already have some reasons to believe we cannot answer positively to (?), if we examine the proof of Proposition 0.2.2.2. Actually, (?) is a decision problem, one that is quite interesting and useful. As before, we try to lift our setting to a more general one: can any problem be solved by some Turing Machine? The following propositions indicate that it is not likely the case:

Proposition 0.2.3.1 The set TM of Turing Machines is countably infinite.

Proof: The proof follows immediately from Proposition 0.2.2.1. Any Turing Machine can be uniquely encoded by a string, hence the set of Turing Machines is isomorphic to a subset of Σ*, which in turn is countably infinite, for any Σ. □

Proposition 0.2.3.2 The set Hom(N, N) of functions f : N → N is uncountably infinite.



 ∈

f n   :   n   N. Rows Rows describe describe inputs inputs   k   N. Each Each matrix content content   mi,j   is the value of   f  j (i) (hence, the expected output for input  input   i, from function f  function  f   jj ). In the Figure   0.2.3, 0.2.3, we have have illustr illustrated ated our matrix. matrix. We no now w devise devise a ∗ problem f  problem  f  as follows:

 ∈

 ∈

f ∗ (x) =

 1 0

iff  f   f x(x) = 0 iff  f   f x(x) = 1

Since   f ∗ H om Since  om((N, {0, 1}) it must also have a number assigned to it:   f ∗ = f    for some   α   N. Then   f ∗ (α) = 1 if   f  (α) = 0. 0. But   f  (α) =   f ∗ (α). Contradict Cont radiction. ion. On the other hand  hand   f ∗ (α) = 0 if   f  (α) = 1. As befo before re we we obtain a contradiction.   



α

 ∈

α

α

α

 


          f0   f1   f2   . . .   fn   . . .
    0     1    1    0    . . .   0    . . .
    1     0    1    1    . . .   0    . . .
    2     1    0    1    . . .   1    . . .
    . . .
    n     1    1    0    . . .   1    . . .
    . . .

Figure 2.3.2: An example of the matrix from the proof of Proposition 0.2.3.2. The values m_{i,j} have been filled out purely for the illustration.

Propositions 0.2.3.2 and 0.2.3.1 tell us that there are infinitely more functions (decision problems) than means of computing them (Turing Machines). Our next step is to look at solvable and unsolvable problems, and devise a method for separating the first from the latter. In other words, we are looking for a tool which allows us to identify those problems which are solvable, and those which are not. We start by observing that Turing Machines may never halt. We write M(w) = ⊥ to designate that M loops infinitely for input w. Also, we write n_w ∈ N to refer to the number which corresponds to w, according to Proposition 0.2.1.1. Next, we refine the notion of problem solving:

Definition 0.2.3.1 (Decision, acceptance) Let M be a Turing Machine and f ∈ Hom(N, {0, 1}). We say that:

• M decides f, iff for all n_w ∈ N: M(w) = 1 whenever f(n_w) = 1, and M(w) = 0 whenever f(n_w) = 0.
• M accepts f, iff for all n_w ∈ N: M(w) = 1 iff f(n_w) = 1, and M(w) = ⊥ iff f(n_w) = 0.

Note that, in contrast with acceptance, decision is, intuitively, a stronger means of computing a function (i.e. solving a problem). In the latter case (decision), the TM at hand can provide both a yes and a no answer to any problem instance, while in the former (acceptance), the TM can only provide an answer of yes. If the answer to the problem instance at hand is no, the TM will not halt. Based on the two types of problem solving, we can classify problems (functions) as follows:

Definition 0.2.3.2   Let   f 

 

18

•   f  f    is  recursive    recursive   ( decidable ( decidable)) i  ff  there   there exists a TM   TM   M  M    which decides   f . The set of recursive functions is  R  =  { f 

om(N, {0, 1})  |   f is recursive   }   ∈∈ H om(

•   f  f    is   recursively enumerable  enumerable   ( semi-decidable ( semi-decidable)) i  ff  there   there exists a TM  M  M  which  which accepts   f . The set of recursive-enumerable functions is 

  ∈∈ H om om((N, {0, 1})  |   f is recursively-enumerable   }

RE   = { f 

Now, let us turn our attention to question ( ?), which we shall formulate as a problem: f h (n

enc(M )   w

)=

 1

iff   M (w) halts iff   M (w) =



0

Hence, the input of   f  f  is  is a natural number which encodes a Turing Machine M  chine  M  and  and an input word w word  w.. The first question we ask, is whether f  whether f h R.

 ∈

 6∈ R.

Proposition Propositi on 0.2.3.3 0.2.3.3   f h

 ∈

by  M h  the Turing Machine which decides Proof:   Assume f  Assume  f h R  and denote by M  f h . We build the Turing Machine  Machine   D, as follows: D(enc enc((M )) )) =

 ⊥   i  M   M  (enc enc((M )   enc enc((M )) )) = 1 ff 

1

h

iff  M h (enc enc((M )   enc enc((M )) )) = 0

The existence of the Universal Turing Machine guarantees that   D   can indeed be built, since  since   D  simulates  simulates M   M h . We note that  that   M h (enc enc((M )   enc enc((M )))) decides if the TM M  TM  M  halts  halts with “itself  “itself  ”   ” as input (namely  enc  enc((M ))). ). h enc Assume  Assume   Dnot (enc enc( (D))for=input enc 1. Henc He ncee( D M  enc((D D( ),enc ,enc( (D is, machine chine D  D  does halt input  enc( ). (Hence D Hence (enc enc( (D))))==0, that . Contradiction. Assume   D(enc enc((D)) = . Hence ence   M h (enc enc((D),enc ,enc((D)) = 1, and thus D(enc enc((D)) halts. Contradiction.    We note that the construction of   of   D   mimics mimics the techni technique que which which we applied for the proof of Proposition 0.2.3.2 Proposition  0.2.3.2,, which is called diagonalization  called  diagonalization 



 ⊥

Exercise 0.2.3.1 Exercise 0.2.3.1   Apply Apply the diagon diagonali alizat zation ion techn techniqu iquee fr from om the pr pro oof of  Proposition  0.2.3.2, in   0.2.3.2,  in order to prove Proposition  0.2.3.3    0.2.3.3 .

 ∈ RE .

Proposition Propositi on 0.2.3.4 0.2.3.4   f h

 

0.2.

COMPU COMPUT TABILITY THEOR THEORY  Y 

 

19

Proof:  We build a Turing Machine M  Machine  M h  which accepts f  accepts  f h . Essentially, M  Essentially,  M h   is the Universal Turing Machine.   M h (enc enc((M )   w) simulates  simulates   M , and if   M (w) halts, then it outputs 1. If   M (w) does not halt, M  halt,  M h (enc enc((M )   w) = .    Propositions 0.2.3.3 Propositions  0.2.3.3 and  and 0.2.3.4  0.2.3.4 produce  produce a classification for f  for f h . The question which we shall answer, is how to classify any problem  f ,  f , by establishing membership in R in  R and  and RE   RE , respectively. We start with a simple proposition:



Proposition Propositi on 0.2.3.5 0.2.3.5   R ( RE .

 ⊆

  ∈∈

Proof:   R  R  RE  E   is straightforward from Definition 0.2.3.2. Definition  0.2.3.2.   Let  Let   f   R,  R , and 0 0 M ff   be the TM which decides M  decides  M f  TM M  such that M  that M  (w) = 1 f . We build the TM M  0 0 iff   M ff  (w) = 1 and  and   M  (w) =   iff   M f  simulates   M  M    but enters f (w) = 0.   M  simulates 0 into an infinite loop whenever   M f  accepts f   f    hence f  hence  f  RE . f (w ) = 0.   M  accepts R =   RE   RE   has already been shown by Propositions   0.2.3.3   and and   0.2.3.4. 0.2.3.4.    but  f h R. f h RE   but f  Thus,   R   and  and   RE  should “ scale ” for solvability:   R RE  should be interpreted as a “scale  membership is complete solvability,   RE   RE   membership is partial solvability, while non-membership in  in   RE   is “complete  complete ”” unsolvability.

 ⊥

 ∈

 6

  ∈∈

 6∈

Remark 0.2.3.1  We note that  R and   R  and  RE  are  RE  are not the only sets of functions  which are used used in Computabili Computability ty Theory. Theory. It has been been shown that there there ar are  e  “ degrees “ degrees” ” of unsolvability, of “ higher “ higher level” level” than  R  R  and   and  RE   RE .. These degrees  are intuitively obtained as follows: We assume we live in a world where   where   f h is decidable (recursive). Now, as before, we ask which problems are recursive  and which ar aree recursive recursively-en ly-enumer umerable. able. It turns out that, also in this ideal  ideal  case, there still exist recursive and recursively-enumerable problems, as well  as some which are neither. This could be imagined as “ undecidability “ undecidability level 1”. Now, Now, we take some probl problem em which which is in   RE   RE   on level 1, and repeat the  samee assump sam assumptio tion: n: it is de decid cidabl able. e. Again, gain, under under thi thiss assump assumptio tion, n, we find  pr problem oblemss in  in    R,   RE   RE   and outs outsid idee th thee tw two, o, wh whic ich h fo form rm up “ undecidability “ undecidability level 2”. 2”. This process process can be repeate repeated  d  ad  ad infinitum. infinitum. Returning to our simpler classification, we must observe an interesting feature of recursively-enumerable functions, which is also the reason they are called this way.

 ∈

Proposition 0.2.3.6 Proposition 0.2.3.6   A function   function   f    H om( om(N, {0, 1})   is recursi recursively vely enumerable i  ff  there   there exists a Turing Machine which can  enumerate/generate   enumerate/generate w all elements in   Af    =   {w   N   |   f (n ) = 1}. Intuit Intuitive ively, ly,   Af    is the set of  inputs of   f  f  for  for which the answer at hand is   is   yes. yes.

 ∈

 

20



Proof:   =   Suppose   f  f    is recursively-enumerable and   M  M    accepts   f . We ∗ specify the TM genera generatin tingg write   wi  to refer to the   ith word from   Σ . We specify Af  by the following pseudocode: Algorithm 1: GEN() 1:  GEN()



static   Af   = ; static  k  = 0; while True  while  True    do for 0 for  0 i k   do run   M (wi ); run if   if   M (wi )  halts before   k  steps and   i Af   = A f  {wi }; return   wi return end end k  =  = k  k +  + 1; end

≤ ≤

 ∪

6∈ A   then f 

The value of   k   from the   for   has a two-fol two-fold d usage. usage. First, First, it is used to explore all inputs  inputs   wi   : 0  i  k.  k . Sec Second ond,, it is used as a time-li time-limit mit for  for   M . precisely   k   step steps. s. If   M (wi ) = 1 in at most run   M (wi ), for precisely  For each  each   wi  we run  k  steps, then  then   wi  is added to  to   Af , and then returned (written on the tape). Also, w Also,  w i  is stored for a future execution of GEN. If  M  If  M ((wi ) = 1 for some w some  w i , then there must exist a k a  k  : k  :  k i  such that M  that  M ((wi ) halts after k after  k  steps. Thus, such a  a   k  will eventually be reached. = Assume Assume we have the Turing Turing Machine Machine GEN which generates generates   Af . We construct a Turing Machine M  Machine  M  which  which accepts  accepts   f .   M  M  works  works as follows:

 ≤  ≤





Algorithm 2: M  2:  M ((w)



Af   = ,  n  = 0; while   w Af   do while w  =  = GE  GEN  N ((); ); Af   = A f  {w} end return 1 return  1 M  M    simply uses GEN to generate elements in   Af . If  If    w   Af , if will eventu eve ntually ally be generated, generated, and   M  M    will will ou outp tput ut 1. Othe Otherw rwis isee   M  M    will loop. Thus M  Thus  M    accepts f  accepts  f ..    Proposition   0.2.3.6  is useful since, in many cases, it is easier to find a generator for f  for  f ,, instead of a Turing Machine which accepts  f .  f .

6∈
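GEN's dovetailing idea can be sketched in Python; word(i) and accepts_in(w, k) are hypothetical helpers standing for "the i-th word over Σ*" and "M(w) outputs 1 within k steps", respectively:

    def generate(word, accepts_in):
        produced = set()
        k = 0
        while True:
            for i in range(k + 1):             # explore inputs w_0 .. w_k
                w = word(i)
                if w not in produced and accepts_in(w, k):
                    produced.add(w)
                    yield w                     # one more element of A_f
            k += 1                              # larger step budget next round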



Finally, in what follows, we shall take a few decision problems, and apply a reduction technique, in order to prove they are not decidable.

Halting on all inputs. Let:

    f_all(n_{enc(M)}) = 1   iff M halts for all inputs
    f_all(n_{enc(M)}) = 0   otherwise

The technique we use to show f_all ∉ R is called a reduction (from f_h). It proceeds as follows. We assume f_all ∈ R. Starting from the TM M_all which decides f_all, we build a TM which decides f_h. Thus, if f_all is decidable then f_h is decidable, which leads to a contradiction.

First, for each fixed TM M and fixed input w ∈ Σ*, we build the TM Π_{M,w}(ω) = "replace ω by w and then simulate M(w)". It is easy to see that (∀ω ∈ Σ*: Π_{M,w}(ω) halts) iff M(w) halts. Now, we build the TM M_h which decides f_h. The input of M_h is enc(M) ∗ w. We construct Π_{M,w} and run M_all(enc(Π_{M,w})). By assumption, M_all must always halt. If the output is 1, then Π_{M,w}(ω) halts for all inputs, hence M(w) halts. We output 1. If the output is 0, then Π_{M,w}(ω) does not halt for all inputs, hence M(w) does not halt. We output 0. We have built a reduction from f_h to f_all: using the TM which decides f_all, we have constructed a machine which decides f_h. Since f_h is not recursive, we obtain a contradiction.

Definition 0.2.3.3 (Turing-reducibility) Let f_A, f_B ∈ Hom(N, {0, 1}). We say f_A is Turing reducible to f_B, and write f_A ≤_T f_B, iff there exists a decidable transformation T ∈ Hom(N, N) such that f_A(n) = 1 iff f_B(T(n)) = 1.

Remark 0.2.3.2 (Reducibility) We note that the transformation T must be computable, in the sense that a Turing Machine should be able to compute T, for any possible valid input. When proving f_h ≤_T f_all, we have taken n_{enc(M) ∗ w} (an instance of f_h) and shown that it could be transformed into n_{enc(Π_{M,w})} (an instance of f_all), such that f_h(n_{enc(M) ∗ w}) = 1 iff f_all(n_{enc(Π_{M,w})}) = 1. A Turing Machine can easily perform the transformation of enc(M) ∗ w to enc(Π_{M,w}), since it involves adding some states and transitions which precede the start-state of M; hence T is computable.

Halting on 111. We define:

    f_111(n_{enc(M)}) = 1   iff M(111) halts
    f_111(n_{enc(M)}) = 0   otherwise

 

We reduce f_h to f_111. Assume M_111 decides f_111. Given a Turing Machine M and word w, we construct the machine:

    Π_{M,w}(ω) = if ω = 111 then M(w) else loop

We observe that (i) the transformation from enc(M) ∗ w to enc(Π_{M,w}) is decidable, since it involves adding precisely three states to M: these states check the input ω, and if it is 111, replace it with w and run M; (ii) M_111(enc(Π_{M,w})) = 1 iff M(w) halts. The reduction is complete. f_111 ∉ R.
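In programming terms, the construction of Π_{M,w} is just a wrapper; as a Python sketch (run(M, w) is a hypothetical simulator that may loop forever):

    def make_pi(run, M, w):
        def pi(omega):
            if omega == "111":
                return run(M, w)   # halts iff M(w) halts
            while True:            # on any other input, loop forever
                pass
        return pi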

Halting on some input. We define:

    f_any(n_{enc(M)}) = 1   iff M(w) halts for some w ∈ Σ*
    f_any(n_{enc(M)}) = 0   otherwise

We reduce f_111 to f_any. We assume f_any is decided by M_any. We construct:

    Π_M(ω) = replace ω by 111 and run M(111)

Now, M_any(enc(Π_M)) = 1 iff M(111) halts, hence we can use M_any to build a machine which decides f_111. Contradiction. f_any ∉ R.

Machine halt equivalence. We define:

    f_eq(n_{enc(M1) ∗ enc(M2)}) = 1   iff for all w ∈ Σ*: M1(w) halts iff M2(w) halts
    f_eq(n_{enc(M1) ∗ enc(M2)}) = 0   otherwise

We reduce f_all to f_eq. Let M_triv be a one-state Turing Machine which halts on every input, and M_eq be the Turing Machine which decides f_eq. Then M_eq(enc(M) ∗ enc(M_triv)) = 1 iff M halts on all inputs. We have shown we can use M_eq in order to build a machine which decides f_all. Contradiction. f_eq ∉ R.

So far, we have used reductions in order to establish problem non-membership in R. There are other properties of R and RE which can be of use for this task. First we define:

Definition 0.2.3.4 (Complement of a problem) Let f ∈ Hom(N, {0, 1}). We denote by f̄ the problem:

    f̄(n) = 1   iff f(n) = 0
    f̄(n) = 0   iff f(n) = 1

We call f̄ the complement of f.

 


For instance, the complement of f_h is the problem which asks if a Turing Machine M does not halt for input w. We also note that the complement of f̄ is f itself. Next, we define the class:

    coRE = {f ∈ Hom(N, {0, 1}) | f̄ ∈ RE}

coRE contains the set of all problems whose complement is in RE. We establish that:

Proposition 0.2.3.7 RE ∩ coRE = R.

Proof: Assume f ∈ RE ∩ coRE. Hence, there exists a Turing Machine M which accepts f and a Turing Machine M̄ which accepts f̄. We build the Turing Machine:

    M*(w) = for i ∈ N:
        run M(w) for i steps. If M(w) = 1, return 1.
        Otherwise: run M̄(w) for i steps. If M̄(w) = 1, return 0.

Since M and M̄ will always halt when the expected result is 1, they can be used together to decide f. Hence f ∈ R. The converse inclusion, R ⊆ RE ∩ coRE, follows from Proposition 0.2.3.5 together with Proposition 0.2.3.8 below: if f ∈ R then f̄ ∈ R ⊆ RE, hence f ∈ coRE. □
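The interleaved simulation in the proof can be sketched as follows; halts_with_1(machine, w, i) is a hypothetical step-bounded runner reporting whether the machine outputs 1 on w within i steps:

    from itertools import count

    def decide(m, m_bar, w, halts_with_1):
        for i in count(1):                 # growing step budgets i = 1, 2, ...
            if halts_with_1(m, w, i):
                return 1                   # the acceptor of f answered yes
            if halts_with_1(m_bar, w, i):
                return 0                   # the acceptor of f-bar answered yes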

Proposition 0.2.3.8 f ∈ R iff f̄ ∈ R.

Proof: The proposition follows immediately since the Turing Machine which decides f can be used to decide f̄, by simply switching its output from 0 to 1 and 1 to 0. The same holds for the other direction. □

We conclude this chapter with a very powerful result, which states that an entire category/type of problems does not belong in R.

Theorem 0.2.3.1 (Rice)   Let C ⊆ RE be a non-trivial collection of problems (that is, ∅ ≠ C ≠ RE; this condition is needed by the proof below). Given a Turing Machine M, we ask: "Is the problem accepted by M in C?". Answering this question is not in R.

Proof:   We consider, without loss of generality, that the trivial problem f(n) = 0 is not in C. Since C is non-empty, suppose f* ∈ C, and since f* is recursively enumerable, let M* be the Turing Machine which accepts f*. We apply a reduction from a variant of f_111, namely f_x: f_x asks if a Turing Machine halts for input x. We assume we can decide membership in C by some Turing Machine. Based on the latter, we construct a Turing Machine which decides f_x (i.e. solves the halting problem for a particular input). Let M_x be the Turing Machine which accepts f_x. Let:

Π_w(ω) = if M_x(w) then M*(ω) else loop

If f_{Π_w} is the problem accepted by Π_w, we show that:

f_{Π_w} ∈ C iff M_x(w) halts

(⇒) Suppose f_{Π_w} ∈ C. Then Π_w(ω) cannot loop for every input ω ∈ Σ*. If it did, then f_{Π_w} would be the trivial problem always returning 0, which we have assumed is not in C. Thus, M_x(w) halts.

(⇐) Suppose M_x(w) halts. Then the behaviour of Π_w(ω) is precisely that of M*(ω): Π_w(ω) returns 1 whenever M*(ω) returns 1, and Π_w(ω) = ⊥ whenever M*(ω) = ⊥. Since f* ∈ C, then also f_{Π_w} ∈ C.

Hence, a decision procedure for membership in C would yield a decision procedure for f_x, which is a contradiction. □

In Theorem 0.2.3.1, the set C should be interpreted as a property of problems, and subsequently of the Turing Machines which accept them. Checking whether some Turing Machine satisfies the given property is undecidable. Consider the property informally described as: the set of Turing Machines (or computer programs) that behave as viruses. Deciding whether a Turing Machine behaves as a virus (i.e. belongs to the former set) is undecidable, via Rice's Theorem.
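The machine Π_w at the heart of the proof can be transcribed almost literally; as before, Python callables stand in for the machines M_x and M* named in the proof (a sketch only, not a formal construction):

    def make_pi_w(M_x, M_star, w):
        """Pi_w(omega) = if M_x(w) then M_star(omega) else loop."""
        def pi(omega):
            if M_x(w):                # this call itself diverges whenever
                return M_star(omega)  # M_x does not halt on w
            while True:               # deliberate divergence ("else loop")
                pass
        return pi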





0.3   Complexity Theory

0.3.1   Measuring time and space

In Computability Theory, we have classified problems (e.g. in the classes R and RE) based on a Turing Machine's ability to decide/accept them. In order to classify problems based on hardness, we need to account for the number of steps (time) and tape cells (space) which are employed by a Turing Machine. The amount of resources (time/space) spent by a Turing Machine M may be expressed as functions:

T_M, S_M : Σ* → N

where T_M(w) (resp. S_M(w)) is the number of steps performed (resp. tape cells used) by M, when running on input w. This definition suffers from unnecessary overhead, which makes time and space analysis difficult. We formulate some examples to illustrate why this is the case:

Alg(n)
    while n < 100
        n = n + 1
    return 1

We note that Alg runs 100 steps for n = 0, while only one step for n ≥ 100. However, in practice, it is often considered that each input is as likely to occur as any other.⁶ For this reason, we shall adopt the following convention:

(∗) We always consider the most expensive/unfavourable case, given inputs of a certain type.

In our previous example, we consider the running time of Alg as being 100, since this is the most expensive case.

⁶ This is not often the case. There are numerous algorithms which rely on some probability that the input is of some particular type. For instance, efficient SAT solvers rely on a particular ordering of variables, when interpretations are generated and verified. On certain orderings and certain formulae, the algorithm runs in polynomial time. The key to the efficiency of SAT solvers is that programmers estimate an efficient ordering, based on some expectancy regarding the input formula. The algorithm may be exponentially costly for some formulae, but run in close-to-polynomial time for most of the inputs.

Consider the following example:

Sum(v, n)
    s = 0, i = 0
    while i < n
        s = s + v[i]
        i = i + 1
    return s

Unlike Alg, Sum does not have a universal upper limit on its running time. The number of steps Sum executes depends on the number of elements of v, namely n, and is equal to 2n + 3, if we consider each variable initialisation and the return statement as computing steps. Thus, we observe that:

(∗∗) The running time (resp. consumed space) of an algorithm will grow as the size of the input grows.

We can now merge (∗) and (∗∗) into a definition:

Definition 0.3.1.1 (Running time of a TM)   The running time of a Turing Machine M is given by T_M : N → N iff: ∀w ∈ Σ*, the number of transitions performed by M is at most T_M(|w|).

Remark 0.3.1.1 (Consumed space for a TM)   A naive definition for the consumed space of a Turing Machine would state that S_M(|w|) is the number of tape cells which M employs. This definition is imprecise. Consider the Turing Machine which receives a binary word as input and computes whether the word represents a power of 2. Aside from reading its input, the machine consumes no space. Thus, we might refine our definition into: "S_M(|w|) is the number of tape writes which M performs". This definition is also imprecise. Consider the binary counter Turing Machine from the former chapter. It performs a number of writes proportional to the number of consecutive 1's found at the end of the string. However, the counter does not use additional space: it only processes the input. Thus, the consumed space S_M(|w|) is the number of written cells, except those from the input. Consider a Turing Machine which receives n numbers encoded as binary words, each having at most 4 bits, and which computes the sum of the numbers, modulo 2⁴. Apart from reading the 4n bits and the n − 1 word separators, the machine employs another 4 cells to hold a temporary sum. Thus, the consumed space for this machine is 4.⁷

A formal definition for the consumed space of a TM is outside the scope of this course, since it involves multi-tape Turing Machines. The basic idea is to separate the input from the rest of the space used for computation. Thus, when assessing the consumed space of an algorithm, we shall never account for the space consumed by the input.

⁷ We can also build another machine which simply uses the first number to hold the temporary sum, and thus use no additional space.

Recall that, as in Computability Theory, our primary agenda is to produce a classification of problems. To this end, it makes sense to first introduce a classification of Turing Machine running times.

Asymptotic notations

Remark 0.3.1.2 (Running times vs. arbitrary functions)   In the previous section, we have defined running times of a Turing Machine as functions T : N → N, and we have seen that they are often monotonically increasing (n ≤ m ⟹ T(n) ≤ T(m)). While monotonicity is common among the running times of conventional algorithms, it is not hard to find examples (more-or-less realistic) where it does not hold. For instance, an algorithm may simply return if its input exceeds a given size. Thus, we shall not, in general, assume that running times are monotonic. Furthermore, we shall extend our classification to arbitrary functions f : R → R, since there is no technical reason to consider only functions over naturals. In support of this, we shall also add that asymptotic notations are

useful in other fields outside complexity theory, where the assumption that functions are defined over natural numbers only is not justified.

Definition 0.3.1.2 (Θ (theta) notation)   Let g : R → R. Then Θ(g(n)) is the class of functions:

Θ(g(n)) = { f : R → R | ∃c_1, c_2 ∈ R⁺, ∃n_0 ∈ N such that ∀n ≥ n_0 : c_1 · g(n) ≤ f(n) ≤ c_2 · g(n) }

Thus, Θ(f(n)) is the class of all functions with the same asymptotic growth as f(n). We can easily observe that, for all continuous g ∈ Hom(R, R) such that g ∈ Θ(f(n)), we have lim_{n→∞} f(n)/g(n) = c, where c ≠ 0. There is an infinite number of classes Θ(f(n)), one for each function f. However, if g(n) ∈ Θ(f(n)), then Θ(g(n)) = Θ(f(n)). It makes sense to consider classes which describe functions with inferior/superior asymptotic growth:

Definition 0.3.1.3 (O, Ω notations)   Let g : R → R. Then:

O(g(n)) = { f : R → R | ∃c ∈ R⁺, ∃n_0 ∈ N such that ∀n ≥ n_0 : 0 ≤ f(n) ≤ c · g(n) }

Ω(g(n)) = { f : R → R | ∃c ∈ R⁺, ∃n_0 ∈ N such that ∀n ≥ n_0 : 0 ≤ c · g(n) ≤ f(n) }

Note that g ∈ O(f(n)) ⟹ O(g(n)) ⊆ O(f(n)), while g ∈ Ω(f(n)) ⟹ Ω(g(n)) ⊆ Ω(f(n)). Finally, Ω(f(n)) ∩ O(f(n)) = Θ(f(n)). Each of the above propositions can be easily proved using the respective definitions of the notations.

The O and Ω notations offer relaxed bounds on asymptotic function growth. Thus, g ∈ O(f(n)) should be read as: the function g grows asymptotically at most as much as f. It makes sense to also consider strict bounds:

Definition 0.3.1.4 (o, ω notations)

o(g(n)) = { f : R → R | ∀c ∈ R⁺, ∃n_0 ∈ N such that ∀n ≥ n_0 : 0 ≤ f(n) ≤ c · g(n) }

ω(g(n)) = { f : R → R | ∀c ∈ R⁺, ∃n_0 ∈ N such that ∀n ≥ n_0 : 0 ≤ c · g(n) ≤ f(n) }

Thus, g ∈ o(f(n)) should be read: g grows asymptotically strictly less than f. We have o(f(n)) ∩ ω(f(n)) = ∅, O(f(n)) ∩ Ω(f(n)) = Θ(f(n)), and ω(f(n)) ∪ Θ(f(n)) ⊆ Ω(f(n)).
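As a concrete illustration of Definition 0.3.1.2 (a worked example of our own, not from the original text), one can exhibit explicit witnesses showing that 2n² + n + 1 ∈ Θ(n²):

    \[
      2n^2 \;\le\; 2n^2 + n + 1 \;\le\; 2n^2 + n^2 + n^2 = 4n^2
      \qquad \text{for all } n \ge 1,
    \]

so the constants c_1 = 2, c_2 = 4 and the threshold n_0 = 1 satisfy the definition (using n ≤ n² and 1 ≤ n² for n ≥ 1).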





Exercise 0.3.1.1
    If f(n) ∈ Ω(n²) and g(n) ∈ O(n³), then f(n)/g(n) ∈ . . .
    If f(n) ∈ o(n²) and g(n) ∈ Θ(n³), then f(n) · g(n) ∈ . . .
    If f(n) ∈ Θ(n³) and g(n) ∈ o(n²), then f(n) − g(n) ∈ . . .

Exercise 0.3.1.2   Prove or disprove the following implications:
    f(n) = O(log n) ⟹ 2^{f(n)} = O(n)
    f(n) = O(n²) and g(n) = O(n) ⟹ f(g(n)) = O(n³)
    f(n) = O(n) and g(n) = 1 + f(n) ⟹ g(n) = Ω(log n)

Syntactic sugars

This section follows closely Lecture 2 from [1]. Quite often, asymptotic notations are used to refer to arbitrary functions having certain properties related to their order of growth. For instance, in:

⌈f(x)⌉ = f(x) + O(1)

applying "rounding" to f(x) may be expressed as the original f(x) to which we add a function bounded by a constant. Similarly:

1/(1 − x) = 1 + x + x² + x³ + Ω(x⁴),   for −1 < x < 1

The above notation allows us to "formally disregard" the terms from the expansion, by replacing them with an asymptotic notation which characterises their order of growth. One should make a distinction between the usage of asymptotic notations in arithmetic expressions, such as the ones previously illustrated, and in equations. Consider the following example:

f(x) = O(1/x)

which should be read: there exists a function h ∈ O(1/x) such that f(x) = h(x). Similarly:

f(x) = O(log x) + O(1/x)

should be read: there exist functions h ∈ O(1/x) and w ∈ O(log x) such that f(x) = w(x) + h(x). In equations such as:

O(x) = O(log x) + O(1/x)

the equality is not symmetric, and should be read from left to right: for any function f ∈ O(x), there exist functions h ∈ O(1/x) and w ∈ O(log x) such that f(x) = w(x) + h(x). In order to avoid mistakes, the following algorithmic rule should be applied. When reading an equation of the form:

left = right

• each occurrence of an asymptotic notation in left should be replaced by a universally quantified function belonging to the corresponding class;

• each occurrence of an asymptotic notation in right should be replaced by an existentially quantified function from the corresponding class.

0.3.2   Running time in complexity theory

Using asymptotic notations, we can distinguish between running times (of algorithms) with different asymptotic growths. Experience has shown that it is unfeasible to develop a theory which uses asymptotic notations in order to classify problems based on their difficulty. Thus, in complexity theory, we make an even stronger assumption: the exponent of a polynomial function is unimportant. Recall that, with asymptotic notations, we do not differentiate between n² and 2n² + n + 1 (and denote either of the two by Θ(n²)). In Complexity Theory, we do not distinguish between, e.g., n² and n³, and thus we write n^{O(1)}, referring to a polynomial of arbitrary degree.

Before introducing a classification of problems, there is a question which must be addressed: How does the encoding of a problem instance affect the running time of the subsequent algorithm?

To see why this issue is important, consider the encoding of numbers using a single digit (e.g. IIIIIIII encodes the number 8). A Turing Machine M which starts with the tape:

> I I I I I I # #

and increments the represented number by shifting the head to the first empty cell, where it places I, will perform a number of steps which is linear with respect to the size of the input. Thus, the running time of M is O(n), where n = |w| and w is M's input. The Turing Machine which uses the binary alphabet, and encodes numbers as binary words, will also run in linear time w.r.t. the size of the input, but in this case, there is an exponential gap between the two representations.

The representation of a natural number x consumes n = ⌈log x⌉ cells in the second machine, and x cells in the first; note that x = 2ⁿ. This is one of the rare cases [2] where a bad choice of representation may lead to an exponential increase in the number of steps. In what follows, we assume problem instances are encoded in some default way: e.g. graphs are represented as adjacency matrices or as adjacency lists, etc. When appropriate representations are chosen, the computational gap between them is at most polynomial. As an example, let us compare a matrix representation of a graph with that of adjacency lists. Assume the graph is directed, contains n nodes and only one edge (u, v). The matrix representation will consume n² positions (out of which n² − 1 are equal to 0), while the list representation will consume only one position (corresponding to the unique element from the adjacency list of u). However, the gap between the two representations is polynomial. Thus, from the point of view of Complexity Theory, it is irrelevant whether we choose matrices or adjacency lists to represent graphs. This observation is highlighted by the following Proposition:

Proposition 0.3.2.1 (The encoding does not matter)   Let f : N → {0, 1} be a problem which is decided by a Turing Machine M in time T, where M is defined over alphabet Σ. Then, there exists a Turing Machine M′, defined over the alphabet Σ′ = {0, 1, #, >}, which decides f and runs in time O(log|Σ|) · T(n).

Proof: (sketch)   We build M′ = (K′, F′, Σ′, δ′, s′_0) from M as follows:

• Σ′ = {0, 1, #, >}. We encode each symbol different from # (the empty cell) and > (the marker of the beginning of the input) as a word w ∈ Σ′* with |w| = ⌈log|Σ|⌉. We use k to refer to the length |w| of the word w. We write enc_Σ(x), with x ∈ Σ, to refer to the encoding of the symbol x ∈ Σ.

• For each state s ∈ K, we build 2^{k+1} − 1 states q_1, . . . , q_{2^{k+1}−1} ∈ K′, organized as a full binary tree of height k. The purpose of the tree is to recognize a word enc_Σ(x) of length k from the tape. Thus, the unique state at the root of the tree, namely q_1, is responsible for recognising the first bit: if it is 0, M′ will transition to q_2, and if it is 1, to q_3. q_2 and q_3 must each recognize the second bit. After their transitions, we shall be in one of the states q_4 to q_7, which give us information about the first two bits of the word. In general, the states from level i recognize the first i bits of the encoded symbol enc_Σ(x). The states from the last level k are 2^k in number, and recognize the last bit of the encoded symbol. Thus, each of the 2^k leaf-states in the tree corresponds to one possible symbol x ∈ Σ which is encoded as enc_Σ(x). We connect all these states by transitions, as described above.

• For each transition δ(s, x) = (s′, x′, pos) of M, the machine M′ must: (i) recognize x, (ii) overwrite x with x′, (iii) move according to pos and go to state s′. Thus:

  – (i) is done by the procedure described at the previous point;

  – for (ii), we use k states to go back (k cells) to the beginning of enc_Σ(x) and write enc_Σ(x′), cell by cell. We connect the state corresponding to enc_Σ(x) in the tree to the first of these k states;

  – for (iii), if pos = L/R, we use another k states to go either left or right; if pos = H, we need not use these states. Finally, we make a transition to the root of the state tree corresponding to s′.

For each transition δ(s, x) = (s′, x′, pos) of M, M′ performs k transitions for reading the encoded symbol x, k transitions for writing x′, and possibly k transitions for moving the tape head. Thus, for all w ∈ Σ′*, the number of transitions performed by M′ is at most 3k · T(|w|). Hence, the running time of M′ is O(k) · T(n). □

The proof of Proposition 0.3.2.1 shows that any Turing Machine using an arbitrary alphabet Σ can be transformed into one using the binary alphabet. The overhead of this transformation is logarithmic: O(⌈log|Σ|⌉). Thus, if the original Turing Machine runs in some polynomial time T(n), then the transformed TM will run in time O(⌈log|Σ|⌉) · T(n), which is bounded by a polynomial. Similarly, if the original TM runs in supra-polynomial time, the transformed TM will also run in supra-polynomial time. In what follows, we shall assume all Turing Machines use the binary alphabet Σ_b = {0, 1, #, >}.
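The first bullet of the proof is easy to mirror in a few lines of Python (an illustrative sketch; the function name and representation are our own):

    from math import ceil, log2

    def binary_encoding(sigma):
        """Assign every symbol of sigma, except '#' and '>', a binary
        code word of fixed width k = ceil(log2(number of symbols)).
        Assumes sigma contains at least one encodable symbol."""
        symbols = [s for s in sigma if s not in ('#', '>')]
        k = max(1, ceil(log2(len(symbols))))
        return k, {s: format(i, '0%db' % k) for i, s in enumerate(symbols)}

    # binary_encoding(['a', 'b', 'c', 'd', '#', '>'])
    #   -> (2, {'a': '00', 'b': '01', 'c': '10', 'd': '11'})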

0.3.3   Complexity classes

In the previous section, we have stated that, in complexity theory, we shall make no distinction between polynomial running times with different asymptotic growths (e.g. between n² and n³). With this in mind, we construct a classification of problems. First, we say that f is decidable in time T(n) iff there exists a Turing Machine M which decides f, and whose running time is T. We interchangeably use the terms decidable and solvable, since, in this chapter, there is no ambiguity on what "solvability" means, and it cannot be mistaken for acceptability. All considered problems in this section are members of R. The following definition characterizes problems with a specific running time:

DTIME(T(n)) = { f : N → {0, 1} | f is decidable in time O(T(n)) }

Note that, unlike asymptotic notations, DTIME(T(n)) is a class of problems, not of running times. Also, note that our characterization does not provide a strict upper bound. Hence, e.g., DTIME(n) ⊆ DTIME(n²). In words: a problem which is decidable in linear time is also decidable in quadratic time. Next, we introduce the class:

PTIME = ∪_{d ∈ N} DTIME(n^d)

PTIME is often abbreviated P. It is the class of all problems which are decidable in polynomial time. Also, note that if some problem f is decidable in log(n) time, then f ∈ DTIME(log(n)) ⊆ DTIME(n) ⊆ P. Hence, even problems which are solvable in sub-linear time belong to the class P. Further on, we introduce the class:

EXPTIME = ∪_{d ∈ N} DTIME(2^{n^d})

which contains all the problems which are decidable in exponential time. Naturally, we have:

P ⊆ EXPTIME ⊆ R





There are two interesting questions which can be raised at this point:

1. Is the inclusion P ⊆ EXPTIME strict?

2. Are all problems in EXPTIME \ P⁸ "equal", in terms of difficulty?

⁸ Later in this chapter, we shall see that EXPTIME \ P is a set whose members are currently unknown.

In the following section, we shall focus on the latter question.

Nondeterminism and Nondeterministic Turing Machines

We recall the problem SAT, which takes as input a boolean formula ψ in Conjunctive Normal Form (CNF). More precisely:

ψ = C_1 ∧ C_2 ∧ . . . ∧ C_n

where, for each i : 1 ≤ i ≤ n, we have

C_i = L_{i1} ∨ L_{i2} ∨ . . . ∨ L_{i m_i}

and, for each j : 1 ≤ j ≤ m_i, we have L_{ij} = x or L_{ij} = ¬x, where x is a variable.

Recall that SAT can be solved in exponential time, hence SAT ∈ EXPTIME. The major source of complexity consists in generating all possible interpretations, on which a verification is subsequently done. In order to answer question 2 from the previous section, suppose that SAT were solvable in polynomial time. Could we find problems (possibly related to SAT) which are still solvable in exponential time under this assumption? The answer is yes. Let γ be the formula:

γ = ∀x_1 ∀x_2 . . . ∀x_k ψ

where ψ is a formula in CNF containing the variables x_1, . . . , x_k, and k ∈ N is arbitrarily fixed. Checking whether γ is satisfiable is the problem ∀SAT. An algorithm for ∀SAT must build all combinations of 0/1 values for each x_i with i : 1 ≤ i ≤ k, and, for each one, must solve an instance of the SAT problem. In total, we have 2^k combinations, and since k is part of the input, the algorithm runs in exponential time, provided that we have an algorithm for SAT which runs in polynomial time.
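The enumeration argument can be sketched in Python; here check is a stand-in for the assumed polynomial-time SAT routine, applied to ψ under a fixed assignment (all names are illustrative):

    from itertools import product

    def forall_sat(psi, k, check):
        """Decide 'for all x1..xk : psi' naively, by trying every 0/1
        assignment. Even if check(psi, assignment) runs in polynomial
        time, the loop alone performs 2^k iterations."""
        for bits in product([0, 1], repeat=k):
            assignment = {i + 1: bits[i] for i in range(k)}  # x1..xk
            if not check(psi, assignment):
                return 0
        return 1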

In order to study the difficulty of problems, in Complexity Theory we generalise the above-presented approach, in order to determine degrees of hardness. To do so, we need a general version of our assumption "SAT is solvable in polynomial time". In other words, we need a theoretical tool to make exponential search (seem) polynomial. This tool will have no relation to reality, and no implications for real problem solving. It should not be understood as a technique for deciding hard problems. Its purpose is purely theoretical: it allows us to abstract one source of complexity (that of SAT in our example) in order to explore others (e.g. ∀SAT). The tool that we mentioned is the Nondeterministic Turing Machine:

Definition 0.3.3.1 (Non-deterministic TM)   A non-deterministic Turing machine (NTM for short) is a tuple M = (K, F, Σ, δ, s_0) over alphabet Σ, with K, Σ and s_0 defined as before, F = {s_yes, s_no}, and δ ⊆ K × Σ × K × Σ × {L, H, R} is a transition relation.

A NTM terminates iff it reaches a final state, hence a state in F. A NTM M decides a function f : N → {0, 1} iff, for all w ∈ {0, 1}*: f(n_w) = 0 ⟹ M(w) reaches state s_no on all possible sequences of transitions, and f(n_w) = 1 ⟹ M(w) reaches state s_yes on at least one sequence of transitions. We say the running time of a NTM M is T iff, for all w ∈ Σ*, all sequences of transitions of M(w) contain at most T(|w|) steps.

We start with a few technical observations. First, note that the NTM is specifically tailored for decision problems. It has only two final states, which correspond to yes/no answers. Also, the machine does not produce an output, and the usage of the tape is merely for internal computations. In essence, these "design choices" for the NTM are purely for convenience, and alternatives are possible.

Whereas the conventional Turing Machine assigns, for each combination of state and symbol, a unique next-state, overriding symbol and head movement, a nondeterministic machine assigns a collection of such elements. The current configuration of a conventional Turing Machine was characterized by the current state, the current contents of the tape, and the head position. A configuration of the nondeterministic machine corresponds to a set of conventional TM configurations. The intuition is that the NTM can simultaneously process a set of conventional configurations, in one single step. While the execution of a Turing Machine can be represented as a sequence, that of the NTM can be represented as a tree. A path in the tree corresponds to one sequence of transitions which the NTM performs.

Now, notice the conditions under which a NTM decides a function: if at least one sequence of transitions leads to s_yes, we can interpret the answer of the NTM as yes. Conversely, if all sequences of transitions lead to s_no, then the machine returns no. Finally, when accounting for the running time of a NTM, we do not count all performed transitions (as it would seem reasonable), but only the length of the longest transition sequence performed by the machine. The intuition is that all members of the current configuration are processed in parallel, during a single step. We illustrate the NTM in the following example:

Example 0.3.3.1 (Nondeterministic Turing Machine)   We build the NTM M_SAT which solves the SAT problem discussed previously. First, we assume the existence of M_chk(I, ψ), which takes an interpretation and a formula, both encoded as a unique string, and checks if I ⊨ ψ. M_chk is a conventional TM, thus upon termination it leaves 0 or 1 on the tape.

Step 1:   M_SAT computes the number of variables from ψ (henceforth referred to as n), and prepends the encoding of ψ with the encoding of n in unary (as a sequence of 1's). This step takes O(n) transitions.

Step 2:   During the former step, M_SAT has created a context for generating interpretations. In this step, M_SAT goes over each cell from the encoding of n, and non-deterministically places 0 or 1 in that cell. Thus, after one such transition, there are 2 possible conventional configurations: in the first, bit 0 is placed on the first cell of the encoding of n; in the second, bit 1 is placed in the same position. After i transitions, we have 2^i possible configurations. At the end of this step, we have 2^n possible configurations, and in each one, we have a binary word of length n at the beginning of the tape, which corresponds to one possible interpretation. All sequences of transitions have the same length, thus the execution of this part of M_SAT takes O(n).

Step 3:   At the end of each sequence illustrated above, we run M_chk(I, ψ), where I and ψ are already conveniently placed on the tape. This step takes O(n · m), where m is the maximal number of literals in ψ.





If we add up the running times of the three steps, we obtain O(n · m).

An important issue is whether the NTM has more expressive power than the conventional TM:

Proposition 0.3.3.1   Every function which is decidable by an NTM in polynomial running time is also decidable by a TM which runs in exponential time.

The proof is left as an exercise. Intuitively, we can simulate a NTM with a classic TM, by a backtracking procedure. The proposition shows that the NTM only offers a gain in speed, and not in expressive power: it solves precisely the same problems which the conventional TM solves.

Convention for describing NTMs in pseudocode.   In the previous chapters, we often resorted to traditional pseudocode in order to describe algorithms, that is, Turing Machines. It is occasionally useful to be able to do the same thing for NTMs. With this in mind, we introduce some notational conventions. The instruction:

v = choice(A)

where v is a variable and A is a set of values, behaves as follows:

• the current (non-deterministic) configuration of the NTM shall contain |A| conventional configurations;

• each conventional configuration corresponds to a distinct value a ∈ A, and it should be interpreted that v = a in that particular configuration;

• the running time of the instruction is O(1).

We also note that it is not possible to achieve some form of "communication" between conventional configurations. Thus, it is intuitive to think that the processing (execution of a transition) of a conventional configuration is done independently of all other conventional configurations. We add two additional instructions: success and fail. They correspond to transitions into the states s_yes and s_no, respectively.

We illustrate NTM pseudocode by re-writing the SAT algorithm described above. We adopt the same representational conventions from the first Chapter, and also re-use the procedure CHECK.

Example 0.3.3.2 (Pseudocode)

SolveSAT(ϕ):
    Let n be the number of variables in ϕ.
    Let I be a vector with n components, initialised with 0.
    for i = 0, n − 1:
        I[i] = choice({0, 1})
    if CHECK(ϕ, I) = 0
        fail
    else
        success

As illustrated in the previous example, the NTM has the power of taming down the complexity which results from the search over an exponential number of candidates (in our example, interpretations). Note that, in the NTM, the main source of complexity is given by the verification procedure M_chk(I, ψ). By using the NTM, we have managed to isolate the source of complexity of SAT: the exponential search over possible interpretations. We have seen that there are other types of exponential explosion.
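Returning to Proposition 0.3.3.1, the pseudocode above also suggests how a conventional machine can simulate choice: explore the branches of the computation tree one by one, by backtracking. A Python sketch of this simulation, with CHECK passed in as the parameter check:

    from itertools import product

    def solve_sat_deterministic(phi, n, check):
        """Deterministic simulation of SolveSAT: enumerate all 2^n
        branches of the NTM's computation tree, one per decision
        vector; answer 1 iff some branch succeeds."""
        for bits in product([0, 1], repeat=n):
            if check(phi, list(bits)):  # this branch reaches s_yes
                return 1
        return 0                        # every branch reaches s_no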

 

We are thus in a position to refine our classification, by introducing new complexity classes:

NTIME(T(n)) = { f : N → {0, 1} | f is decidable by a NTM in time O(T(n)) }

and

NPTIME = ∪_{d ∈ N} NTIME(n^d)

We usually abbreviate the class NPTIME by NP. Note that SAT ∈ NP; however, it seems unlikely that ∀SAT ∈ NP. We shall discuss this issue in the next section. Let us relate NP with our other classes. First, note that NP ⊆ EXPTIME. This result is essentially given by Proposition 0.3.3.1: every problem solved by a NTM can be solved in exponential time by a TM. Also, we trivially have P ⊆ NP. The concise argument is that the NTM is a generalization of the TM, "minus" some technical details. Thus:

P ⊆ NP ⊆ EXPTIME ⊆ R

The fundamental property of problems from NP is that "solution candidates can be verified in polynomial time". An analogy with solving crosswords is possible. Generating all possible solutions (by "lexicographically" generating all possible words) is obviously exponential. But since verifying whether a solution is correct can be done in polynomial time w.r.t. the size of the crossword, the problem is in NP. We shall soon see that all algorithms which solve hard problems from NP can be split into an exponential "generating" procedure and a polynomial "verification" procedure.
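For SAT, such a polynomial-time verifier is easy to exhibit. In the sketch below (the representation is our own choice), a formula is a list of clauses, each clause a list of signed variable indices, DIMACS-style:

    def verify_sat(phi, assignment):
        """Check a candidate interpretation in one linear pass:
        every clause must contain at least one satisfied literal."""
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in phi
        )

    # verify_sat([[1, -2], [2, 3]], {1: True, 2: True, 3: False}) -> True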

0.3.4   Hard and complete problems for a class

Recall that proving f ∉ R for a given problem f was done using contraposition. Essentially, the proof relied on finding a Turing reduction f* ≤_T f from a problem f* for which f* ∉ R is known. First, we note that R could easily be replaced with any complexity class X. Thus, a more general proof scheme could be described as:

f* ∉ X,  f* ≤_X f   ⟹   f ∉ X

where f* and X are given in advance. We observe that we have replaced ≤_T by ≤_X, in order to highlight that the reduction ≤_X depends on the choice

 

of X. We shall later see that we cannot always use Turing Reductions, and there are X's for which more restrictions on the reduction must be in place.

Returning to the proof scheme, we make a second note: to prove f ∉ R, we must first find some f* ∉ R. But if this very fact is shown by the same technique, then we need another problem f** ∉ R, and so on. This situation is similar to: "Which came first, hens or eggs?". For the case of R, this issue was settled by the proof f_h ∉ R using diagonalization. This introduced an initial undecidable problem, and the aforementioned technique could be employed.

Now, let us turn our attention to X = NP. First, let us examine the properties which the reduction should have, in order to make our technique work in this particular case. Let f* ∉ NP. We need to show that f ∉ NP. The first part consists of assuming f ∈ NP. Next, we need a reduction T : I_{f*} → I_f, where I_{f*} and I_f are the inputs of the corresponding problems.⁹

⁹ In order to make explicit the direction of the transformation, we ignore the fact that both I_{f*} and I_f are (subsets of) naturals.

We recall that the reduction needs to satisfy:

∀i ∈ I_{f*} :  f*(i) = 1  ⟺  f(T(i)) = 1

In words: "we can solve all instances i of I_{f*} by solving instances T(i) of I_f". This is done as follows:

a. receive i ∈ I_{f*};

b. compute T(i) ∈ I_f;

c. run the NTM M_f on T(i) (M_f must exist, since f ∈ NP) and return its answer.
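Steps a., b., c. amount to the following generic procedure (a sketch; T and M_f are hypothetical stand-ins for the transformation and for the machine deciding f):

    def solve_f_star(i, T, M_f):
        """Solve an instance i of f* through f, using f*(i) = f(T(i))."""
        j = T(i)       # step b.; as argued below, T must run in
                       # polynomial time for the argument to go through
        return M_f(j)  # step c.: return the answer of the machine for f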

After a careful look, we observe that the conclusion f* ∈ NP (which is our objective, in order to complete the contrapositive argument) is not immediate. If, for instance, the computation of T(i) takes exponential time, then the proof does not work: the NTM which performs a., b., c. runs in (non-deterministic) exponential time. Thus, the restriction that T is decidable is insufficient. We further need that T is computable in polynomial time. We write f* ≤_p f iff there exists a polynomial-time transformation which allows solving f* via f, as illustrated by the scheme above.

We now turn our attention to the "hen and eggs" issue. We need an initial problem which is known not to be in NP. Unfortunately, such a

 


problem is not known, although there have been major (unsuccessful) efforts in trying to find one. The same holds for the class P. Hence, the issue P ⊊ NP is currently open; it is generally believed to be true, but all proof attempts have failed. The good news is that our effort in classifying problems is not fruitless. Transformations:

f* ≤_p f        f* ≤_T f

establish a relation between problems, which is denoted as hardness. No matter the reduction type (polynomial or Turing), f is at least as hard as f*: with the machine M_f (together with computing a transformation), we can solve all instances of f*. It may be possible that T is bijective, hence each input of f* is uniquely mapped to an input of f and vice-versa. Then, f and f* are equally hard. However, this is not generally the case, hence the term "at least as hard". We can naturally extend hardness to complexity classes:

Definition 0.3.4.1   A problem f is called NP-hard iff, for all f′ ∈ NP, f′ ≤_p f.

Thus, a problem is hard w.r.t. a class iff it is at least as hard as any problem in the class at hand. Note that hardness can be defined w.r.t. any complexity class, and not just NP, provided that the appropriate type of transformation is employed.

Definition 0.3.4.2   A problem f is called NP-complete iff it is NP-hard and f ∈ NP.

Informally, NP-complete problems are the hardest problems in NP. In the more general case, complete problems w.r.t. a class are the hardest of that class. It is likely that if f is NP-complete, then f ∉ P; however, this is not a proven fact. Thus, instead of trying to disprove membership of a class (f ∉ P), in complexity theory we prove completeness for the immediate upper (or greater) class (f is NP-complete). The intuition is that class membership provides an upper bound, while hardness provides a lower bound for the difficulty of a problem. We illustrate this by an abstract example. Recall:

P ⊆ NP ⊆ EXPTIME

Let f be a problem. Suppose we find an algorithm for f which runs in exponential time; however, we cannot find one which runs in polynomial time on a NTM. At this point, we have f ∈ EXPTIME. Suppose we show that f is P-hard; thus, f can be used to solve any problem in P. We now know that f can be solved exponentially, and it is unlikely that f can be solved in sub-polynomial (e.g. logarithmic) time. Thus, the likely variants are: f may be solved in polynomial time, (i) by a conventional TM (f ∈ P) or (ii) by a NTM (f ∈ NP), or (iii) f may be solved in exponential time, again by a conventional TM (f ∈ EXPTIME). In the best case, f is polynomially solvable; in the worst, it is exponentially solvable.

Suppose now we also find that f is NP-hard. We cannot rule out f ∈ P by a proof, but Complexity Theory predicts that such a membership is not likely. Hence, the feasible variants remain (ii) and (iii). Finally, if we manage to improve our exponential algorithm for f and turn it into a non-deterministic polynomial one, then f ∈ NP and, hence, f is NP-complete. Case (iii) remains of course true, but it does not carry useful information. At this point, we have an exact characterisation of the difficulty of f.

Proving NP-completeness.   For a problem f to be NP-complete, it must satisfy two conditions. The first, f ∈ NP, is shown by finding a NTM which decides f in polynomial time. For the second part (f is NP-hard), we can employ precisely the "reduction-finding" technique illustrated at the beginning of this section.

Proposition 0.3.4.1   A problem f is NP-hard iff there exists a problem g which is NP-hard, such that g ≤_p f.

Proof:   Suppose g ≤_p f and g is NP-hard. Hence, for all h ∈ NP, h ≤_p g. By transitivity, we also have that h ≤_p f. It follows that f is NP-hard. □

In the former proof, we have made use of the transitivity of ≤_p without showing it. We now state several properties of ≤_p, including transitivity, and leave the proofs as exercises.

Proposition 0.3.4.2   ≤_p is reflexive and transitive.

Proposition 0.3.4.3   The set of NP-hard problems is closed under ≤_p.

Proposition 0.3.4.4   The set of NP-complete problems, together with ≤_p, is an equivalence class.

 


Proposition 0.3.4.5   Assume f is NP-complete. If f ∈ P, then P = NP.

The former proposition, whose proof follows immediately from the underlying definitions, makes the case for the common belief that P ≠ NP: if some efficient algorithm could be found for some NP-complete problem, then all problems in NP could be solved in polynomial time. The P = NP issue can also be given another intuitive interpretation: "The verification of a solution candidate is as difficult as generating it", or, alternatively: "Verifying a given proof P for A is as difficult as finding a proof for A". Finally, to better understand the implications of P = NP, consider several facts which would arguably be true, in case the former equality held:

• We can provide a solution to the astronauts' problem (see the first chapter).

• Partial program correctness can be solved efficiently. Techniques such as model checking can be applied to a wide range of applications (including operating system kernels). Bugs are almost entirely removed; Windows bluescreens no longer happen.

• Generation of exponentially many training sets would make tasks such as voice recognition, computer vision and natural language processing computationally easy.

• Mathematical proofs (of, say, 100 pages) can be generated efficiently. Computers can be used to find proofs for some open problems.

• We can use exponential search to find passwords, or to break encryption keys, in polynomial time. Internet privacy is no longer possible using encryption (e.g. using SSH). Internet commerce and banking are no longer possible. Safe communication is no longer possible (at all levels). Any computer-controlled facility (public, military, etc.) which is connected to the Internet has considerable potential of being compromised.

Remark 0.3.4.1 (Practical applications of reductions)   As illustrated before, reductions of the type ≤_p are a theoretical tool which is useful for proving NP-hardness. Reductions also have practical applications. For instance, many NP-complete problems are solved in practice by employing SAT solvers, which, as discussed in the former chapters, may be quite fast in the general case. Thus, a specific problem instance is cast (via an appropriate transformation)

 

into a formula ϕ, such that ϕ is satisfiable iff the answer to the instance is yes.

SAT: the first NP-complete problem.   We observe that the "hen and eggs" issue still holds in our scenario. To apply our technique, we need an initial NP-hard problem in the first place. This problem is provided by Cook's Theorem, which proves that SAT is NP-complete. The technique of the proof relies on building, for each NTM M (and hence, for each problem in NP), a formula ϕ_M which is satisfiable iff there exists a sequence in M's computation tree leading to success.

Bibliography

[1] A. J. Hildebrand. Asymptotic methods in analysis. Math 595AMA, http://www.math.uiuc.edu/~hildebr/595ama/, 2009.

[2] Christos H. Papadimitriou. Computational Complexity. Addison-Wesley, Reading, Massachusetts, 1994.
