Hi!
In this tutorial I'll show you how to make a scripting language.
I've commented the source code as much as I could. If you don't understand any part, feel free to ask.
How it works:
First, we'll take the code from the user.
We'll scan and tokenize the code.
We'll decide what the language will be able to control.
We'll parse the code (check whether the syntax is followed).
Finally, we'll execute the code.
What is a scripting language?
Originally posted by Wikipedia:
A scripting language, script language or extension language is a programming language that allows control of one or more software applications. "Scripts" are distinct from the core code of the application, as they are usually written in a different language and are often created or at least modified by the end-user.
1: Making the Scanner and Tokenizer:
Let's take a look at what scanners and tokenizers are:
Originally posted by Wikipedia:
The first stage, the scanner, is usually based on a finite state machine. It has encoded within it information on the possible sequences of characters that can be contained within any of the tokens it handles (individual instances of these character sequences are known as lexemes). For instance, an integer token may contain any sequence of numerical digit characters. In many cases, the first non-white space character can be used to deduce the kind of token that follows and subsequent input characters are then processed one at a time until reaching a character that is not in the set of characters acceptable for that token (this is known as the maximal munch rule, or longest match rule). In some languages the lexeme creation rules are more complicated and may involve backtracking over previously read characters.
Originally posted by Wikipedia:
Tokenization is the process of demarcating and possibly classifying sections of a string of input characters. The resulting tokens are then passed on to some other form of processing. The process can be considered a sub-task of parsing input.
Take, for example, the following string.
The quick brown fox jumps over the lazy dog.
Unlike humans, a computer cannot intuitively 'see' that there are 9 words. To a computer this is only a series of 44 characters.
A process of tokenization could be used to split the sentence into word tokens. Although the following example is given as XML there are many ways to represent tokenized input:
Code:
<sentence>
<word>The</word>
<word>quick</word>
<word>brown</word>
<word>fox</word>
<word>jumps</word>
<word>over</word>
<word>the</word>
<word>lazy</word>
<word>dog</word>
</sentence>
In simple terms, the scanner loops through every character of the code (one line at a time).
It passes each character to the tokenizer, which groups the characters into tokens according to the patterns we define.
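To make the idea concrete, here is a minimal sketch of a scanner and tokenizer working together on the fox sentence from the quote above. It is not the code we build in this tutorial; the module name "ScannerSketch" and the function "SplitIntoWords" are made up for illustration only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module ScannerSketch
    ' Hypothetical helper (not part of the tutorial project):
    ' splits a sentence into word tokens.
    Public Function SplitIntoWords(ByVal text As String) As List(Of String)
        Dim tokens As New List(Of String)
        Dim current As New StringBuilder()
        ' Scanner: visit every character exactly once.
        For Each c As Char In text
            If Char.IsLetterOrDigit(c) Then
                ' Tokenizer: keep extending the current token.
                current.Append(c)
            ElseIf current.Length > 0 Then
                ' Any other character ends the current token.
                tokens.Add(current.ToString())
                current.Clear()
            End If
        Next
        ' Do not lose a token that runs up to the end of the text.
        If current.Length > 0 Then tokens.Add(current.ToString())
        Return tokens
    End Function

    Sub Main()
        Dim words As List(Of String) = SplitIntoWords("The quick brown fox jumps over the lazy dog.")
        Console.WriteLine(words.Count)              ' prints 9
        Console.WriteLine(String.Join("|", words))  ' prints The|quick|brown|fox|jumps|over|the|lazy|dog
    End Sub
End Module
```

The real tokenizer below follows the same pattern, but it also has to deal with quoted text and symbols.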
OK, let's make it:
Create a new project.
Create a module. Name it "Globals" and add a public variable "ln" of type Integer to it (it holds the current line number for error messages), as shown below:
Code:
Public ln As Integer = 1
Next, create a class and name it "Compiler".
The scanner is really just two lines of code, so we will merge the scanner and the tokenizer into one function.
Inside the Compiler class, create a function named "TokenizeLine" that takes "line" as a String, as shown below:
Code:
Public Function TokenizeLine(ByVal line As String) As String
End Function
Inside of the TokenizeLine function:
We are going to perform scanning of the line and then we'll join the characters according to a pattern.
Now put the following code inside the TokenizeLine function.
Code:
'Variable that will store the sequence of characters (words).
Dim FinalString As String = ""
'Determines whether user has opened / closed a double quote.
Dim IsQuotes As Boolean = False
'Loop through the line. Scan all the characters and join them.
For i As Integer = 1 To line.Length
'Variable that will hold the current character.
Dim chr As String = Mid(line, i, 1)
'If the current character is a double quote.
If chr = """" Then
'If a quote is already open then close it, otherwise open it. (We keep track of this with the IsQuotes variable.)
If IsQuotes Then
IsQuotes = False
Else
IsQuotes = True
End If
End If
'We don't need to check anything inside the quotes, so add the current character to Final String variable and continue the loop (skip further checking).
If IsQuotes Then
FinalString += chr
Continue For
End If
'If current character is a letter (alphabet) then
If Char.IsLetter(chr) Then
'Append the alphabet to the FinalString
FinalString += chr
'Similarly do the same if a digit is found.
ElseIf Char.IsDigit(chr) Then
FinalString += chr
'If a whitespace is found (space) then just append a newline, so that we can split the word later.
ElseIf Char.IsWhiteSpace(chr) Then
FinalString += vbCrLf
'If any of the following character is found, add a new line and then the character, so we can distinguish the characters easily.
ElseIf chr = "(" Or chr = ")" Or chr = "+" Or chr = "-" Or chr = "*" Or chr = "/" Or chr = "=" Or chr = "," Or chr = "&" Or chr = "'" Then
FinalString += vbCrLf & chr & vbCrLf
'A closing quote ends up here (IsQuotes was just set back to False above), so append it to finish the quoted text. From now on every character is checked again.
ElseIf chr = """" Then
If Not IsQuotes Then
FinalString += """"
End If
'Any other character is not supported by the language, so raise an error that reports the character, the line and the position.
Else
Throw New Exception("Unknown character '" & chr & "' at line: " & ln & vbCrLf & "Position = " & i)
End If
'Loop
Next
'Phew, out of the loop...
'We used new lines to separate the words, so FinalString can still contain empty pieces (for example where a space sits next to a symbol). Time to clean up.
'Split FinalString into a one-dimensional array on both line-break characters and drop the empty pieces.
Dim Temp1() As String = FinalString.Split(New Char() {ChrW(13), ChrW(10)}, StringSplitOptions.RemoveEmptyEntries)
'Join the tokens again with a single carriage return between them. ParseLine later splits the result on that character.
Dim ValToReturn As String = String.Join(vbCr, Temp1)
Return ValToReturn.Trim()
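If you want to see what the tokenizer produces for a real line, here is a compact sketch that applies the same rules (quotes, letters and digits, whitespace, symbols) but returns a List(Of String) instead of one joined string. This is an alternative shape I am assuming for illustration; the names "TokenizerSketch", "Tokenize" and "Flush" are hypothetical and are not in the attached project.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module TokenizerSketch
    ' The same symbol set the tutorial's tokenizer treats as separate tokens.
    Private Const Symbols As String = "()+-*/=,&'"

    ' Hypothetical variant of TokenizeLine that returns a list of tokens.
    Public Function Tokenize(ByVal line As String) As List(Of String)
        Dim tokens As New List(Of String)
        Dim current As New StringBuilder()
        Dim inQuotes As Boolean = False
        For Each c As Char In line
            If c = """"c Then
                ' A double quote toggles string mode and stays part of the token.
                inQuotes = Not inQuotes
                current.Append(c)
            ElseIf inQuotes OrElse Char.IsLetterOrDigit(c) Then
                ' Inside quotes everything is kept; outside, letters and digits build a word.
                current.Append(c)
            ElseIf Char.IsWhiteSpace(c) Then
                Flush(tokens, current)
            ElseIf Symbols.IndexOf(c) >= 0 Then
                ' A symbol ends the current word and is a token of its own.
                Flush(tokens, current)
                tokens.Add(c.ToString())
            Else
                Throw New Exception("Unknown character '" & c & "'")
            End If
        Next
        Flush(tokens, current)
        Return tokens
    End Function

    ' Moves the word collected so far (if any) into the token list.
    Private Sub Flush(ByVal tokens As List(Of String), ByVal current As StringBuilder)
        If current.Length > 0 Then
            tokens.Add(current.ToString())
            current.Clear()
        End If
    End Sub

    Sub Main()
        ' Prints five lines: Set, Label1, Text, To, "Hello World"
        For Each t As String In Tokenize("Set Label1 Text To ""Hello World""")
            Console.WriteLine(t)
        Next
    End Sub
End Module
```

Returning a list saves the split step later on, but the string version above is what the rest of this tutorial uses.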
Now we need to decide what we want the language to do.
You can do roughly 75-80% of what VB can do. It depends on you: the more you know about the host language (in this case, VB), the more powerful your language will be!
In this tutorial, I will show you how to customize a form and its controls. Plus, we will do some cool stuff with the OS. This is just an example; you can do anything you want.
Parsing the code:
This is the stage where we define our syntax and check whether the user follows it. This same stage also performs execution.
As you can see, I've named the class "Compiler", but it's actually an interpreter. We can still get the magic of a compiler (checking the code without running it) just by using a single Boolean variable.
Now, inside the "Compiler" class, create a function named "ParseLine" that takes "line" as a String, as shown below:
Code:
Public Function ParseLine(ByVal line As String)
End Function
Now we will split the contents of the line and will check the syntax.
Inside of ParseLine function:
Inside the function enter this line:
Code:
'This creates the tokens.
'Tokens are the individual words / sequences of characters returned by the tokenizer. We will later go through the tokens to actually check the syntax.
'The tokenizer separates the tokens with a carriage return, so we split on exactly that character.
Dim tokens As String() = TokenizeLine(line).ToString().Split(ChrW(13))
OK, leave the function for now.
Now we need to check the syntax.
We will do it by recognizing the tokens. For example, if I want to turn off the monitor using the language, I first check whether the first word of the line is "Turn". Then I check whether the word "off" comes after "Turn". If it doesn't, I report a syntax error and stop compiling, and so on.
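Here is a small sketch of that token-by-token check. I am assuming the statement is written as "Turn Off Monitor"; the real Monitor function in the attached project may use a different syntax. The names "SyntaxCheckSketch", "CheckTurnStatement" and "SameWord" are hypothetical.

```vbnet
Imports System

Module SyntaxCheckSketch
    ' Hypothetical checker for an assumed "Turn Off Monitor" statement.
    ' Returns an empty string when there is nothing to report,
    ' otherwise a syntax error message.
    Public Function CheckTurnStatement(ByVal tokens() As String) As String
        If tokens.Length = 0 OrElse Not SameWord(tokens(0), "Turn") Then
            ' Not a Turn statement; some other handler may claim this line.
            Return ""
        End If
        If tokens.Length <> 3 Then
            Return "Turn statement must be written as: Turn Off Monitor"
        End If
        If Not SameWord(tokens(1), "Off") Then
            Return "Keyword 'Off' expected after 'Turn'."
        End If
        If Not SameWord(tokens(2), "Monitor") Then
            Return "'Monitor' expected after 'Off'."
        End If
        Return ""
    End Function

    ' Case-insensitive comparison, so TURN, turn and Turn are all accepted.
    Private Function SameWord(ByVal a As String, ByVal b As String) As Boolean
        Return String.Equals(a, b, StringComparison.OrdinalIgnoreCase)
    End Function

    Sub Main()
        ' A valid statement produces no message.
        Console.WriteLine(CheckTurnStatement(New String() {"TURN", "off", "Monitor"}) = "")
        ' A wrong second word is reported.
        Console.WriteLine(CheckTurnStatement(New String() {"Turn", "on", "Monitor"}))
    End Sub
End Module
```

Every statement handler in this tutorial follows the same shape: bail out quietly if the first token is not yours, then check the remaining tokens one by one.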
Go to the "Globals" module and write the following function:
Code:
'This makes our language case-insensitive!
'It means both of these are OK for setting the back color:
'Set BackColor To Red
'OR
'SET BACKColOr tO REd
Public Function CompareString(ByVal source As String, ByVal target As String) As Boolean
'The third argument (True) tells String.Compare to ignore case. A result of 0 means the strings are equal.
Return String.Compare(source, target, True) = 0
End Function
Now add a new procedure to the same "Globals" module:
Code:
'This adds any known error to the ListView (errmsgs on the Main form) so that we can make the user aware of how newb he is LOL. Just paste it as is; the IDE will complain until the Main form and its ListView exist, and we'll fix that later.
Public Sub AddError(ByVal err As String, ByVal instructions As String)
With Main.errmsgs
'First column: the current line number. Then the error text and the hint for fixing it.
.Items.Add(ln.ToString())
.Items(.Items.Count - 1).SubItems.Add(err)
.Items(.Items.Count - 1).SubItems.Add(instructions)
End With
End Sub
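The "Globals" module should already contain the line counter "ln" that we added at the start. If you skipped that step, add it now (declare it only once):
Code:
Public ln As Integer = 1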
Create a new function in the "Compiler" class named "SetControlText" that takes "tokens()" as input, as shown below.
This function is a template: you can add as many functions as you want in the same way.
Code:
Public Function SetControlText(ByVal tokens() As String)
'Since just two controls support displaying text, we will just check them to avoid hard work :D
'Syntax: Set <control> Text To "<Text>"
'Not a Set statement (or an empty line)? Then this function has nothing to do.
If tokens.Length = 0 OrElse Not CompareString("Set", tokens(0)) Then
Exit Function
End If
If tokens.Length < 5 Then
AddError("Invalid Set statement.", "Set statement must be called as: 'Set <control> Text To ""<Text>""'")
Exit Function
End If
If tokens.Length > 5 Then
AddError("Too many parameters for Set statement.", "Set statement must be called as: 'Set <control> Text To ""<Text>""'")
Exit Function
End If
'The control name is not a keyword, so we look it up as written instead of using CompareString.
'Now check whether the specified control exists or not.
If Not ControlledForm.Controls.ContainsKey(tokens(1)) Then
AddError("The specified control name '" & tokens(1) & "' does not exist.", "")
'Stop here, otherwise the execution step below would fail on the missing control.
Exit Function
End If
If Not CompareString("Text", tokens(2)) Then
AddError("Property 'Text' expected after 'Control Name'.", "Replace '" & tokens(2) & "' with 'Text'.")
Exit Function
End If
If Not CompareString("To", tokens(3)) Then
AddError("Keyword 'To' expected after 'Text'.", "Replace '" & tokens(3) & "' with 'To'.")
Exit Function
End If
'Check whether the text is enclosed in double quotes. It is an error if either quote is missing.
If Not tokens(4).StartsWith("""") OrElse Not tokens(4).EndsWith("""") Then
AddError("Text must be enclosed in double quotes.", "Insert double quotes at the start and end of '" & tokens(4) & "'.")
Exit Function
End If
'Execution... (skipped while isCompiling is True, i.e. when we only check the syntax)
If Not isCompiling Then
'Set the control's Text to the given value, without the double quotes.
ControlledForm.Controls(tokens(1)).Text = tokens(4).Replace("""", "")
End If
End Function
I've added 10-12 functions to the attached project. Play around with them to learn how to make new functions.
I've also included some API functions.
Now, inside the "ParseLine" function we made earlier, call the functions we created, such as "SetControlText", and pass them the tokens, as shown below:
Code:
SetControlText(tokens)
Show_Add_Control(tokens)
Beep_(tokens)
Flip3DVista(tokens)
Pause(tokens)
SetProgress(tokens)
[End](tokens)
ShellExe(tokens)
SetMousePosition(tokens)
SearchYoutube(tokens)
SearchImages(tokens)
FocusApp(tokens)
ToggleKeys(tokens)
Monitor(tokens)
Wallpaper(tokens)
OpticalDrives(tokens)
Say(tokens)
These are the names of the functions I created (included in the attached project).
Now, for execution, we need another function that calls these functions for every line. Add the following to the "Compiler" class:
Code:
Public Function ExecuteLine(ByVal line As String)
ParseLine(line)
End Function
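The attached project contains the real compile and execute buttons. As a rough idea of how the single Boolean can give you a "compile" step, here is a self-contained sketch. It is my own assumption of how such a driver could look, not the code from the project: "RunScript", "RunAllLines", "Errors" and "Output" are hypothetical, and the stand-in ExecuteLine only understands a made-up statement "Say <word>".

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module TwoPassSketch
    Public isCompiling As Boolean = False
    Public ln As Integer = 1
    Public Errors As New List(Of String)
    Public Output As New List(Of String)

    ' Stand-in for ExecuteLine: understands only "Say <word>" (assumed syntax).
    Public Sub ExecuteLine(ByVal line As String)
        Dim tokens() As String = line.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
        If tokens.Length = 0 Then Return
        If Not String.Equals(tokens(0), "Say", StringComparison.OrdinalIgnoreCase) OrElse tokens.Length <> 2 Then
            Errors.Add("Line " & ln & ": invalid statement.")
            Return
        End If
        ' Side effects only happen outside the checking pass.
        If Not isCompiling Then Output.Add(tokens(1))
    End Sub

    ' Checks the whole script first and executes it only if it is clean.
    Public Function RunScript(ByVal script As String) As Boolean
        Dim lines() As String = script.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.None)
        Errors.Clear()
        Output.Clear()
        ' Pass 1: check every line without executing anything.
        isCompiling = True
        RunAllLines(lines)
        If Errors.Count > 0 Then Return False
        ' Pass 2: no errors were found, so execute for real.
        isCompiling = False
        RunAllLines(lines)
        Return True
    End Function

    ' Feeds each line to ExecuteLine and keeps the line counter in step.
    Private Sub RunAllLines(ByVal lines() As String)
        ln = 1
        For Each line As String In lines
            ExecuteLine(line)
            ln += 1
        Next
    End Sub

    Sub Main()
        Console.WriteLine(RunScript("Say hello" & vbLf & "Say world")) ' prints True
        Console.WriteLine(RunScript("Say hello" & vbLf & "Jump"))      ' prints False
        Console.WriteLine(Errors(0))                                   ' prints Line 2: invalid statement.
    End Sub
End Module
```

The point is that nothing is executed unless the whole script passes the check, which is exactly what a user expects from a "Compile" button.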
For the GUI, compilation and execution, please download and open the attached project, because my hands have stopped writing now!
Here is a screen shot of what we've created:
Error Handling:
This is only the test GUI; try to make a professional GUI, like:
This is the one I made for the programming language that I am creating.
Give control to the end-user in a stylish way.
I've shown you how to program your own programming language.
The only limit is your imagination.
Add stunning functions and let MPGH know how skilled you are.
It's up to you whether you create the most difficult language or the easiest one!
I haven't shown you how to make an executable out of this source code. Do some work on your side too. Here is a hint on how to make the exe:
Use CodeDom (the C# compiler).
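To get you started with that hint, here is a minimal CodeDom sketch. A few loud assumptions: it uses the "VisualBasic" provider instead of the C# one so that it matches the rest of this tutorial, it needs the classic .NET Framework (CodeDom compilation is not supported on newer .NET runtimes), and "BuildExe" is a hypothetical helper name. How you pack the user's script and your interpreter into the generated source is left to you.

```vbnet
Imports System
Imports System.CodeDom.Compiler
Imports Microsoft.VisualBasic

Module CodeDomSketch
    ' Hypothetical helper: compiles VB source text into an .exe file.
    ' Returns True when the compiler reported no errors.
    Public Function BuildExe(ByVal source As String, ByVal outputPath As String) As Boolean
        Using provider As CodeDomProvider = CodeDomProvider.CreateProvider("VisualBasic")
            Dim options As New CompilerParameters()
            options.GenerateExecutable = True      ' an .exe, not a .dll
            options.OutputAssembly = outputPath
            options.ReferencedAssemblies.Add("System.dll")
            Dim results As CompilerResults = provider.CompileAssemblyFromSource(options, source)
            ' Show whatever the compiler complained about.
            For Each compileError As CompilerError In results.Errors
                Console.WriteLine(compileError.ToString())
            Next
            Return Not results.Errors.HasErrors
        End Using
    End Function

    Sub Main()
        ' A tiny generated program, just to prove the pipeline works.
        Dim source As String =
            "Module Program" & vbCrLf &
            "    Sub Main()" & vbCrLf &
            "        System.Console.WriteLine(""Hello from the generated exe"")" & vbCrLf &
            "    End Sub" & vbCrLf &
            "End Module"
        Console.WriteLine(BuildExe(source, "Generated.exe"))
    End Sub
End Module
```

One possible approach (again, just an idea) is to generate source that embeds the script as a string and feeds it line by line to your Compiler class.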
You can create a fully functional object-oriented programming language using this method.
If you find any error/bug, let me know!
Virus Scan: Virustotal. MD5: cdeabbe0ae7f8eacec605e191b3e76ad Suspicious.Insight
Hope this helps
Greetz, MJLover