This is the first tutorial in my series on Symbolic Execution.

Before we really dive into symbolic execution I want to define some terms that I will be using.

A basic block is a sequence of instructions that will _always_ execute together (excluding some 'exception' such as a divide by zero might trigger some exception handling behavior etc etc, don't consider that a branch.) Basically, there is no branching between the set of instructions. So for example, the below snippet of code is ONE basic block:

Code:

`push eax`

mov eax, 10

call bobs_shitty_subroutine

sub eax, 3

int 3

Code:

`push eax`

sub eax, 2

jz is_zero

sub eax, eax

push eax

is_zero:

pop eax

int3

A constraint is essentially a limit or rule that is observed when we assert a particular fact. So for example, if we assert that x > 3: we can say that there is a constraint on x that says it must be greater than 3.

Additionally, if we assert that fact that x + y = 4, there is a constraint on x that, when added to y it must be 4. Likewise, we also have a constraint on y of similar nature.

From a collection of constraints, we can derive possible solutions. For example, if we observe the following facts about x:

1. x is an integer

2. x > 0

3. x < 10

We can derive from these constraints a set of possible solutions, which are {1,2,3,4,5,6,7,8,9}

A variable without many or sufficient constraints can have a virtually infinite number of solutions (I say virtually because in instance of, for example, a 32 bit integer, there are ofcourse bounds)

Before we get started, you will need to get some things setup on your system. If you already have Python 3 setup and know how to create a venv and install libraries with pip, then you don't need to read this. Just create a python3 venv and install angr (pip install angr)

For the rest of anyone reading this:

1. I am going to be using debian linux in these examples So you need to have access to either a linux machine or (if on windows) you can enable the Ubuntu Subsystem for Windows (learn how to do that here: https://docs.microsoft.com/en-us/win.../install-win10 )

Now, open your terminal window and type the following:

0. "sudo apt update"

0.5. "sudo apt install build-essential"

1. "sudo apt install python3 python3-venv python3-dev" < This will install python3 & python3 venv

2. "mkdir ~/phones/"

3. "cd ~/phones/"

4. "python3 -m venv venv"

5. "source ./venv/bin/activate"

And thats it; I suggest you read up on venvs and how they work here, it is very important as a python programmer to understand venvs: https://docs.python.org/3/library/venv.html

6. Now install the "angr" package. Use the command "pip install angr"

7. Annd you are all setup good job. Remember, if you return to your terminal and don't see (venv) prefix, you need to reactivate the venv before you run any of your python files (step 5.) (Read up on venv using the provided link above if you want to know why...)

As a precursor to this section, I suggest you read: https://docs.angr.io/core-concepts/solver

Remember in high-school or university when you did algebra? You were given a system of equations and were then asked to solve for possible solutions or constraints on the input. For example

x + y = 10

x - y = 8

This is a simple system of equations, we can solve it to identify that {y = 1, x = 9}

This is basically all that a Z3 engine does - but it supports various types that computers commonly work with. They are all derived from bit-vectors, which is just a fancy name for "collection of bits." Anyways, you don't really need to work with the Z3 solver directly in most cases since our "system of equations" will be the code we "symbolically execute" rather than mathematical equations.

How do we solve systems of equations in programming? Well, if you've ever tried, you probably used a matrix with some identities, calculating the inverse etc.

Now, I will show you how the cool kids do it using a Z3 solver.

Lets explore the Z3 solver a little bit more using claripy, which is an abstraction layer for Z3 solvers, it comes with angr.

From the terminal you setup in the previous step (make sure there is a venv prefix) spin up a python interactive terminal by typing the command "python"

Next, run the following code:

Code:

`import claripy`

x = claripy.BVS("x", 32)

y = claripy.BVS("y", 32)

s = claripy.Solver()

s.add(x + y == claripy.BVV(10, 32))

s.add(x - y == claripy.BVV(8, 32))

s.batch_eval([x, y], 100)

The next thing we do is instantiate a solver. A solver will basically take (as input) a series of constraints and spit out possible solutions. In this case, we add two constraints. We say x + y is equal to a 32bit BVV (Bit Vector Value) of 10. And we add another constraint as well.

Next, we ask the solver to evaluate x & y - which means, try to solve these variables given the provided constraints. Where does "100" come from? 100 is the number of possible solutions we want the solver to find. The reason we specify a limit is that in some instances, the number of solutions could be infinite.

In this case, there are two solutions for x & y. Now, you math instructor will probably disagree which is fine, but they would be mistaken. Since we are dealing with Bit Vectors, we must acknowledge the effect of wrap-around. If you take a 32 bit unsigned int that is equal to 0xFFFFFFFF and you add 1, what do you get? 0, with an overflow flag set in a CPU register somewhere ofcourse.

The solutions claripy found are:

Code:

`[`

(9, 1),

(2147483657, 2147483649)

]

A symbolic executor BUILDS onto a Z3 solver. A symbolic execution engine will derive constraints from the code it executes. Each step along the way, defining more and more constraints.

Imagine we have the following

Code:

`mov eax, ecx`

add eax, 10

sub eax, 3

cmp eax, 0

je block_b:

block_a:

...

ret

block_b

...

ret

I will end this first section with an important description snippet of information; please tune in for the next tutorial in the series.

When people write programs, the encode the desired behavior INTO code by using tools like a compiler. We can essentially decompose a PROGRAM into a series of possible STATES. Symbolic execution lets us try and examine how we get INTO a state. What are the REAL constraints of entering a state? If you enter a state in unexpected circumstances, you will more or less unlock an EXPLOIT or learn lots about how a program works without having to read the code or sift through obfuscation.

When we look at the system of equations:

x + y = 7

x - y = 4

We can try to solve for these variables using the following code:

Code:

`import claripy`

x = claripy.BVS("x", 32)

y = claripy.BVS("y", 32)

s = claripy.Solver()

s.add(x + y == claripy.BVV(7, 32))

s.add(x - y == claripy.BVV(4, 32))

s.batch_eval([x, y], 100)

Code:

`claripy.errors.UnsatError: unsat`

Code:

`#import turtle libary/create turtle window`

import turtle

turtle = turtle.Turtle()

#list of all commands to draw 'MATT'

turtle.up()

turtle.backward(300)

turtle.down()

turtle.left(90)

turtle.forward(150)

turtle.right(150)

turtle.forward(100)

turtle.left(130)

turtle.forward(100)

turtle.right(150)

turtle.forward(150)

turtle.up()

turtle.left(90)

turtle.forward(30)

turtle.left(70)

turtle.down()

turtle.forward(150)

turtle.right(150)

turtle.forward(150)

turtle.up()

turtle.backward(75)

turtle.right(100)

turtle.down()

turtle.forward(50)

turtle.up()

turtle.home()

turtle.backward(60)

turtle.left(90)

turtle.forward(30)

turtle.down()

turtle.forward(130)

turtle.up()

turtle.right(90)

turtle.backward(30)

turtle.down()

turtle.forward(70)

turtle.up()

turtle.forward(30)

turtle.down()

turtle.forward(70)

turtle.up()

turtle.backward(40)

turtle.right(90)

turtle.down()

turtle.forward(130)

Firstly, you need to make sure you have Python installed and an IDE (a program that simulates a coding environment). Google is your friend here, below I'll name off some IDE's that you can use:

- IDLE (I personally use this for college)

- PyCharm (I hear its a favorite but never tried it)

- Visual Studio (I have it and it functions well, visually appealing to)

Once you have that taken care of, you are ready to start coding Python! But where should you start? Well that all depends on your reasoning, I started coding Python because of college but others may want it as a skill or hobby. Below I've linked some useful resources when it comes to learning Python:

https://stackoverflow.com/

https://www.w3schools.com/python/

https://www.codecademy.com/catalog/language/python

https://docs.python.org/3/

I've personally used all of these to help aid in my coding exercises/projects and understanding certain topics. Of course, Youtube has some very useful tutorials as well and I would never recommend not using them!

Now let's change pace here and say you have started coding Python, maybe you have made your first program! My first Python project was to code a program that calculated the area of a circle given any radius. Below is that code:

Code:

`variable_pi = 3.14 #value of pi rounded for simplicity`

variable_radius = float(input('What is the radius of the circle? ')) #input from user regarding circle radius

variable_area = (variable_pi * variable_radius **2) #area formula of a circle

print ('The area of the circle with given radius', variable_radius, 'is:', variable_area) #print statement that calls on our my radius and area variables

- Variables

- Comments

- Syntax

- Data types (strings, ints, floats)

Just to name a few and of course the ones used in my beginner Python program. Everything program in Python will have these and it's a good foundation to know what these are.

Variables: This right here is a variable. It is used to store a value of sorts, in this instance the float value of Pi.

Code:

`variable_pi = 3.14`

Code:

`#input from user regarding circle radius`

Data types: In Python there are multiple data types. The first 3 you'll most likely come across are strings, integers and floats. I look at strings just like text or words. They are used to build sentences, etc. Integers are whole numbers and are used for mathematical functions in a program. Lastly, floats are decimals and have the same purpose as ints do in Python, just that they aren't a whole number.

In short, these are some of the basic concepts in Python and most users will start here and work up. I myself am still a beginner but peer help has gotten me far so why not share this info here. Hope this helps someone and let me know if I missed something! ]]>

Code:

`import functools`

input_File = input('Name of file: ') #ask user for name of file

file = open(input_File, 'r') #open file

file = file.read() #read file

file = file.split() #put numbers into list

file = list(map(float, file)) #convert list into floats

file_Sum = functools.reduce(lambda x, y: x + y, list(file)) #grabs sum of file

print(file_Sum / len(file)) #average function using file sum from reduce fuction

Code:

`variable_hourlyWage = float(input('What is your hourly wage? ')) #hourly wage input`

variable_regHours = float(input('How many regular hours this week? ')) #regular hours input

variable_overtimeHours = float(input('How many over time hours this week? ')) #overtime hours input

variable_overtimePay = variable_hourlyWage * 1.5 #formula for overtime pay

variable_weeklyPay = (variable_hourlyWage * variable_regHours) + (variable_overtimeHours * variable_overtimePay) #formula for weekly pay

print('You have made', '$' + str(variable_weeklyPay))

Code:

`count = 0 #amount of numbers inputted`

vSum = 0.0 #variable to track sum

average = 0.0 #variable to track average

data = input('Enter a number or press enter to quit:')

while data != '': #while data is actively being inputted it calculates the sum and count total

number = float(data)

count += 1

vSum += number

data = input('Enter a number or press enter to quit:')

average = vSum/count

if data == '': #once the enter key is hit, process stops and prints sum and average

print('The sum is', vSum)

print('The average is', average)

Code:

`plainMessage = input('Enter a message: ') #user inputs text`

distance = int(input('Enter the distance value: ')) #user inputs distance value

encryptText = ''

for ch in plainMessage: #caesar cipher formula

ordValue = ord(ch)

cipherValue = ordValue + distance

encryptText += chr(cipherValue)

file = open('encrypt.txt', 'w') #cipher text to .txt file

file.write(encryptText)

file.close()

with open('encrypt.txt') as file: #.txt file read/copied to another .txt file

copyText = file.read()

file = open('copyfile.txt', 'w')

file.write(copyText)

file.close()