Problems that can be solved with programs come up everywhere in day-to-day life. Consider the problem:
Build a login system for an MMORPG in C++.
How would this problem, or any problem, even begin to be solved? First the problem needs to be fully analyzed and understood so all necessary features and requirements are accounted for. The requirements are the features, behaviors, or constraints of a software system. Once the problem is understood, expert knowledge is used for insight on how best to translate the requirements into software. The requirements can then be turned into an algorithm, which can then be programmed. Finally, once a program is written, any bugs can be resolved and the results compared against the requirements.
Build a login system for an MMORPG in C++.
To tackle this problem it must first be analyzed, requiring you to:
1. Understand the whole problem thoroughly.
2. Understand the requirements for the problem (e.g., whether a user will interact with the program, what libraries a program might need, what input devices the program might need, etc.).
3. Divide complex problems into smaller subproblems, which will require each subproblem to go through steps [1] and [2] (and potentially step [3]).
Understand the problem by asking questions to clarify the requirements for input, data processing, and output. The requirements are the features, behaviors, or constraints of a software system.
For input, requirements to our problem may look like:
Username is a string of text.
Password is a string of text.
For data processing, requirements to our problem may look like:
The username must be between 2 and 16 characters in length.
The username may only contain alpha-numeric characters.
The password must be between 8 and 32 characters in length.
The password may contain alpha-numeric characters, and any punctuation.
The password must contain a capital and a lowercase character, a number, and a punctuation mark.
Require username and password to post to server.
For output, requirements to our problem may look like:
Display fail message if error with credentials.
Move from login to game if successful login.
The key at this step is to fully understand what the program is going to accept, what it needs to do, and how it needs to format output. Once the problem is fully understood, it can most likely be broken up into smaller subproblems, such as:
Get a username and validate it.
Get a password and validate it.
Submit the credentials to the server and wait for a response.
Output error and redo process if anything fails.
Proceed to game if successful login.
These subproblems can be further analyzed for more requirements, which may lead to even more subproblems.
Once the problem has been thoroughly analyzed and all the base requirements are established, how to solve the problem programmatically must be researched.
To solve a problem, the limits on the range of input, processing, and output, called the domain, must be understood. This way the program is neither over- nor under-engineered. The kind of input to expect needs to be understood so that input fields can validate what they receive and reject anything that is not valid. Processing may be limited by how precise a calculation needs to be, what kind of data needs to be used, or any other processing constraint. Output may be limited in how precise displayed data is, what data may be displayed, or any other output limitation. The conditions and situations that occur at the extreme ends of expected operation are known as edge cases.
There seems to be an endless number of programming languages out there: C, C++, C#, Java, JavaScript, Python, and the list goes on. Many things must be considered when choosing what language to use to write a program. Where will the program live (downloaded to people's computers, on the internet, etc.)? What will the program do? How will it need to take in input or produce output? Any question that affects how the program will interact and process data affects what language should be chosen.
To choose a language consider:
What languages similar projects use. If there are no similar projects consider how others are using the languages being considered and weigh the pros and cons.
What libraries are written for a language and how a language performs. These two properties affect the amount and style of the code to be written. To review languages effectively the documentation of the language may need to be read.
Popularity of a language. If a more popular language can be used to solve a task then there will more likely be more resources and support for the language. Though, sometimes it is necessary to use less popular languages to solve niche tasks.
Similar problems can inform not only what programming languages may be effective in solving a problem, but they also inform what is already there, what needs to be solved, what features of languages are being used, what constraints need to be accounted for, etc. This provides a solid foundation while ensuring time is not wasted solving issues that have already been solved.
Programs demand a variety of hardware resources (CPUs, GPUs, RAM, secondary storage, etc.) from a computer, and the cost of those resources has no upper limit. Thus, it is important to consider what resources a computer has and how well those resources perform. If significant resources are not needed, their cost may not be worth paying, so lesser resources can be used. If significant resources are needed, their cost may be prohibitive, so creative measures may need to be taken.
Safety must also be considered, both for the software system and the hardware resources it uses, and for the users of the system. A program must be able to defend itself against malicious activity, and it must not starve the hardware it is running on of resources. Either of these things could cause a system to fail, causing downtime for further development and hardware implementation.
User safety is also critical when developing. If a user is using a product running code they are generally trusting the code to not harm them or anyone else. For example, someone using an electric saw won't want the saw to turn on when the safety switch is on. If code is running this system then the code needs to ensure that this mechanism does not fail.
Programmers are not always experts in every single facet of everything they program. For instance, just because someone is good at programming does not mean they can do astrophysics, but with the right explanation from an astrophysicist they can likely program a problem in astrophysics. This is how outside experts play a key role in developing programs.
Experts bridge the gap between a programmer's knowledge of programming and their lack of knowledge about the subject that needs to be programmed. Experts can bring in key outside knowledge to help a programmer fully understand a problem that needs to be programmed. Then, if any semantic errors arise when running a program, the programmer can consult with the expert on how to solve the error. This interdisciplinary cooperation allows for quick development of correct programs.
Algorithms are the core of every software system a computer uses from the moment it is powered on. Algorithms allow computers to follow clear steps that can be built upon over and over to create complex systems that control the world around us today. An algorithm is a clear and ordered set of steps (commands/instructions) that a computer can follow to solve a problem.
Algorithms can be thought of as the steps in a recipe. They start off with some starting ingredients (premises), follow a clear set of instructions, and result in something delicious (the program that gets to be used).
Algorithms are developed iteratively, meaning an algorithm may start primitive, be reviewed, and improved upon. For example, for a login system an algorithm may start as:
1. Read a username.
2. Read a password.
3. Login.
But there are more constraints and features that must be explored. What if the size of the username or password is restricted? Or if only certain characters could be used? The algorithm then evolves:
1. Read a username.
2. Read a password.
3. Validate the username: 2 ≤ size ≤ 16 and only alpha-numeric characters.
4. Validate the password: 8 ≤ size ≤ 32, only alpha-numeric characters and punctuation, and must contain a capital character, a lowercase character, a number, and a punctuation mark.
5. If there are any errors, display an error and go back to [1].
6. Login.
But this still isn't the whole picture. What is logging in? Where do the credentials get sent to? What might be received? What needs to happen once it is received? With all of this the algorithm can further evolve:
1. Read a username.
2. Read a password.
3. Validate the username: 2 ≤ size ≤ 16 and only alpha-numeric characters.
4. Validate the password: 8 ≤ size ≤ 32, only alpha-numeric characters and punctuation, and must contain a capital character, a lowercase character, a number, and a punctuation mark.
5. If there are any errors, display an error and go back to [1].
6. Send credentials to the server.
7. Receive response from the server.
8. If there are any errors from the server, display an error and go back to [1].
9. If successful:
   Close the login interface.
   Open the game interface.
Now the set of steps that needs to be taken to log in has been more clearly defined (although it could be refined even further). These steps can be followed in their exact order to achieve the goal that has been defined. Any errors that may occur as the steps are executed are defined and recovered from, and there is a clear ending. Each step follows from the previous step in a logically sound manner, which allows the algorithm to be clearly translated to code.
Writing a program is translating an algorithm in English to some computer language. There are three types of computer languages:
Machine Language - A computer-readable language composed entirely of 0s and 1s.
Low-Level (Assembly) Language - A computer language one level higher than machine language that provides human-readable instructions corresponding closely to the instructions a processor executes.
High-Level Language - A programming language which abstracts away the intricate details of how the computer handles bits and machine code.
Machines can only read instructions written in binary digits (0 and 1), also known as bits, which are handled in fixed-size groups. These instructions are called machine language, and the code written in them is called machine code or object code. The length of an instruction depends on the processor architecture: some architectures use fixed-length instructions (32 bits on 64-bit ARM, for example), while others, such as x86-64, use variable-length instructions. A machine instruction looks similar to the following:
1101 0110 1010 0011 1001 1111 0000 0101 1111 1111 1111 1010 1100 0011 0011 0000
That is just 1 instruction to do 1 extremely simple task, such as putting some data into a memory location. If programming involved stringing together instructions like this, it would be extremely difficult to program even the simplest of tasks. For this reason chip developers like Apple, Intel, and AMD provide low-level assembly languages to assist with programming their chips. Assembly languages provide the same simple, low-level instruction set that machine language does, except that assembly is human readable. Each assembly instruction corresponds closely to a machine instruction, making assembly a good middle ground between machine language and high-level languages (which combine assembly instructions to make complex instructions). Programs written in assembly languages are called assembly code. Assembly code looks similar to:
mov eax, 5   ; Load 5 into register EAX
mov ebx, 3   ; Load 3 into register EBX
add eax, ebx ; EAX = EAX + EBX
These assembly-level instructions are human readable and much shorter in length than machine instructions, but they still leave room for improvement. The problem with assembly instructions is that they are difficult to memorize and use, as they are very different from the way operations are taught in math classes. In math you can just add two numbers: 5 + 3. To make programming more approachable and closer to what is taught from grade school through high school, high-level languages have been developed. High-level languages combine multiple assembly-level instructions to create complex instructions. These complex instructions can be broken down (compiled) into assembly code that can then be assembled and run on a processor. Programs written in high-level languages are called source code.
There are many high-level languages to write source code in, to name a few: C, C++, C#, JavaScript, and Python. Although these are not all of the high-level languages, they cover a broad spectrum of uses. C/C++ is used when the most control over a computer is needed, and most software systems, including AAA games, include some C/C++ code if they are not entirely written in C/C++. C# is a general-purpose language used for desktop applications and is the primary scripting language of the Unity game engine. JavaScript is the language of the web, powering most websites visited every day. Python is a powerful, yet comparatively slow, language that people use for just about anything, though its most prevalent use case is data science and machine learning due to the availability of libraries written for those topics.
An example of source code written in C++:
int sum = 5 + 3;
The 3 lines of code needed to do the same task in assembly can be done in a single line of code that looks very similar to how people are used to seeing addition from math class. This familiarity breaks down the entry barrier to coding to allow more access to learning to code.
Source code and assembly code cannot be run directly on a CPU; only machine code can. Any code written in a language other than machine language must be translated to machine language.
For source code there are two primary ways to translate the program:
Compiler - A computer program that translates source code to machine code.
Interpreter - A computer program that directly executes programs, converting lines to machine code one at a time as the program runs.
The interpreter does not make a new file from the source code. To create machine code the interpreter reads a line of source code, translates it to machine code, runs the machine code on the CPU, then moves onto the next line. C++ is not interpreted, rather it is compiled.
In C++ compilation is the process of converting the entirety of a work of source code to assembly code which can then be assembled. Once the code is assembled it can be linked into a final executable file.
Prior to compilation the code must be preprocessed using a program called the preprocessor, a computer program that performs text-based transformations on the source code. The preprocessor has 4 main functions:
Remove comments (// and /* */)
Replace #include directives with the contents of the header file. For example:
#include <iostream>
using namespace std;

int main () {
    cout << "Hello\n";
    return 0;
}
After preprocessing, this gets expanded to:
# 1 "main.cpp"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 513 "<built-in>" 3
...
# 65 "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/iostream" 3
#pragma clang diagnostic pop
# 2 "main.cpp" 2
using namespace std;

int main () {
    cout << "Hello\n";
    return 0;
}
Replace macros created with #define with their definitions. For example:
#define PI 3.14
double area = PI * r * r; // PI is replaced with 3.14
Include or exclude sections of code based on conditional directives such as #ifdef. For example:
#ifdef DEBUG
std::cout << "Debug mode on";
#endif
To run just the preprocessor step on a file main.cpp, first set up a programming environment including installing g++, then run:
g++ -E main.cpp -o main.i
This will create a file called main.i. Inside of the main.i file will be the expanded version of the program. At the top will be all the code from the header files, and at the bottom will be the code for the program. Once all the necessary code is in the file, the code is ready to be compiled.
In C++ the compiler takes the preprocessed source code and converts it to assembly code. The compiler first makes sure there are no errors in the syntax. If there are any errors, it outputs them and doesn't convert to assembly code. If there are no errors in the syntax, the code is compiled into assembly code.
To run just the compiler step on a file main.i, first set up a programming environment including installing g++, then run:
g++ -S main.i -o main.s
This will create a file called main.s. Inside of the main.s file will be the assembly code of the program, ready to be assembled into machine code.
Syntax is the set of rules that tell which statements (instructions) are legal, or accepted by the programming language, and which are not. Syntax errors, also known as compiler errors, commonly arise when compiling a program. For example, if a semi-colon was forgotten at the end of a statement:
#include <iostream>

using namespace std;

int main()
{
  cout << "Hello World\n"
  return 0;
}
the following compiler error would show:
main.cpp:7:26: error: expected ';' after expression
    7 |   cout << "Hello World\n"
      |                          ^
      |                          ;
1 error generated.
In the error it shows main.cpp:7:26, which says that in the file main.cpp, on line 7, at character 26, there is an error. The error message detailing what is wrong is then displayed. If there are any syntax errors the program will fail to compile. When compilation fails, return to the Write a Program stage to fix the compiler errors.
It is important to note that 1 compiler error can trigger other compiler errors. When resolving compiler errors first scroll to the top of the errors and solve the 1st error, recompile and resolve any further errors (starting from the top of those errors).
Assembly code consists of simple instructions that translate directly to machine instructions, which can be run directly on the CPU. This is why C++ source code tends to be converted to assembly first: the assembly code can then be assembled with an assembler to create object code. An assembler is a computer program that translates assembly code to object code.
To run just the assembler step on a file main.s, first set up a programming environment including installing g++, then run:
g++ -c main.s -o main.o
This will create a file called main.o. This .o file is called an object file--a file containing object code. Inside of the main.o file will be the object code of the program, ready to be linked with any other object code.
When a program is split across multiple C++ source files, each source file is compiled and assembled into its own object file. These separate object files need to be linked together to create a final executable file. This is the job of the linker--a computer program that links object files and libraries into a final executable. When a simple single source file is compiled and assembled, only a single object file is created; nonetheless, the object file must still be linked.
To run just the linker step on a file main.o, first set up a programming environment including installing g++, then run:
g++ main.o
This will create a final executable file called a.out. Inside of the a.out file will be the final version of the code that is able to be executed on the CPU.
Running the individual steps of turning a program written in a high-level language into an executable can be beneficial for analyzing the output of each stage, but most of the time all 4 steps need to be run at once. To run all 4 steps on a file main.cpp, first set up a programming environment including installing g++, then run:
g++ main.cpp
Which automatically runs the equivalent of:
g++ -E main.cpp -o main.i
g++ -S main.i -o main.s
g++ -c main.s -o main.o
g++ main.o
This will preprocess, compile, assemble, and link the program into a final executable file--simply referred to as compiling the program.
If any compiler errors occur, the g++ main.cpp command will stop at the compile stage and output the compiler errors in a similar fashion to the compile step above. Upon successful compilation a file will be created named a.out. To rename this file, add -o <name> to the command that compiles the program:
g++ main.cpp -o <name>
Replace <name> with the desired name for the final executable. For example:
g++ main.cpp -o main
This will create a final executable file main, which is stored in secondary memory.
Once a program has been compiled to a final executable file on secondary memory it must be loaded into main memory, then it can be run on the CPU. To load the file into memory the operating system uses a loader--a computer program that loads an executable file into main memory, preparing it to run on the CPU. The CPU then runs the instructions, waiting for any input and producing any output.
To load and run any executable file in a terminal window, give the path to the directory the executable file is in followed by the name of the file. If the terminal is already at the directory the executable is in, then the path to the directory would be ./ where . refers to the current directory the terminal is in.
For example, to run an executable a.out that was compiled with g++ main.cpp, run the command:
./a.out
If the code from the Resolve any Syntax (Compiler) Errors section, with the missing semicolon added, were compiled and run using this method, the ./a.out command would produce the output:
Hello World
If the program was compiled with a different name, such as g++ main.cpp -o main creating main, then the command to run this executable would be:
./main
The program produces output which is displayed on an output device. This output can be as expected, but that is not always the case. During the execution phase you may get incorrect results known as semantic errors. Semantics are the rules which determine the meaning of a set of instructions. Semantic errors are also known as runtime errors because they are commonly found when running a program. As with compiler errors, it is important to note that 1 runtime error can trigger other runtime errors, so it is important to resolve the 1st error, recompile and run, and then resolve any further errors.
To fix runtime errors go back to the analysis/research/algorithm/programming stages and re-analyze whether the problem is being understood and translated into a high-level language properly.
Once all runtime errors are resolved then the program successfully solves the given problem.
Requirements - The features, behaviors, or constraints of a software system.
Edge Cases - Conditions and situations that occur at extreme ends of expected operation.
Algorithm - A clear and ordered set of steps (commands/instructions) that a computer can follow to solve a problem.
Machine Language - A computer-readable language composed entirely of 0s and 1s.
Low-Level (Assembly) Language - A computer language one level higher than machine language that provides human-readable instructions corresponding closely to the instructions a processor executes.
High-Level Language - A programming language which abstracts away the intricate details of how the computer handles bits and machine code.
Machine Code (or Object Code) - The code written in machine language.
Assembly Code - Programs written in assembly languages.
Source Code - Programs written in high-level languages.
Compiler - A computer program that translates source code to machine code.
Interpreter - A computer program that directly executes programs, converting lines to machine code one at a time as the program runs.
Preprocessor - A computer program that performs text-based transformations on the source code.
Syntax - The rules that tell which statements (instructions) are legal, or accepted by the programming language, and which are not.
Syntax Errors (Compiler Errors) - Errors that commonly arise when compiling a program, such as forgetting a semicolon or using incorrect punctuation.
Assembler - A computer program that translates assembly code to object code.
Object file - A file containing object code.
Linker - A computer program that links object files and libraries into a final executable.
Loader - A computer program that loads an executable file into main memory, preparing it to run on the CPU.
Semantics - Rules which determine the meaning of a set of instructions.
Semantic Errors (Runtime Errors) - Incorrect results that occur during execution due to incorrect logic or understanding; they are commonly found when running a program.
What is the first thing a programmer should do when presented with a problem?
How can misunderstanding the problem affect the success of the solution?
What are the risks of skipping the problem analysis phase?
How do you determine whether a problem has been analyzed thoroughly?
Why is it helpful to break large problems into subproblems?
What strategies can be used to uncover hidden requirements?
How does asking the right questions during analysis reduce debugging time later?
Why is research an important step after problem analysis and requirement gathering?
How can research assist in writing a program?
What kinds of things might be researched prior to programming?
What is the goal of research in the software development process?
How does research help prevent a program from being over- or under-engineered?
How does research impact the accuracy and efficiency of the final program?
What does it mean to understand the domain of a problem?
What types of constraints can exist for input, processing, and output?
What are the main factors to consider when choosing a programming language for a project?
Why might a programmer choose a popular language over a niche one?
What is the benefit of using a language with a strong library ecosystem?
What are some reasons a programmer might choose a less popular language?
Why should a programmer look for similar problems before building a new solution?
How can similar projects guide language choice?
What kinds of constraints or features can be identified from analyzing similar problems?
How does researching existing solutions help avoid redundant work?
What hardware resources must be considered when evaluating performance needs?
Why is it important to match program demands to available system resources?
What are the risks of designing software for hardware beyond budget constraints?
How can performance constraints influence program architecture or algorithm design?
What are the two types of safety concerns that must be considered in programming?
How can software harm a hardware system if not properly designed?
Why is user safety important when developing software for hardware devices?
Why are programmers not always able to fully understand the problem they are solving?
How can experts help bridge the gap between problem domain and implementation?
What role do experts play in resolving semantic (logic) errors in domain-specific code?
How does interdisciplinary collaboration improve the quality of software?
Give an example of a scenario where consulting with an expert would be essential.
What are the dangers of skipping research and jumping straight into coding?
What is an algorithm?
How can an algorithm be compared to a recipe?
Why must the steps in an algorithm be clearly ordered?
Why is it important that each step in an algorithm follows naturally from the previous step?
In what manner are algorithms developed?
What is meant for an algorithm to be developed iteratively?
Why might an initial algorithm be primitive or incomplete?
How can constraints (e.g., username length or allowed characters) lead to changes in an algorithm?
What are examples of features or considerations that can lead to algorithm revisions?
How does handling error conditions improve an algorithm?
What does it mean to recover from an error in an algorithm?
Why is defining the end result of an algorithm important?
What role does iteration play in the development of a robust algorithm?
How does a well-defined algorithm make writing code easier?
What makes an algorithm ready to be translated to code?
In your own words, describe how the login algorithm evolved in the example.
If you were designing an algorithm for a different task (e.g., registering a new user), what similar steps might be involved?
How could the login algorithm be improved even further beyond what’s shown?
What are the risks of jumping into code without first developing a detailed algorithm?
Think of a real-world process (e.g., making coffee). Write an algorithm for it using what you’ve learned about algorithm structure and refinement.
What are the 3 types of computer languages?
Which of the 3 types of computer languages is the easiest for programmers to understand?
Which of the 3 types of computer languages is the hardest for programmers to understand?
What is object code?
What is assembly code?
What is source code?
How does source code relate to assembly code?
How does source code relate to object code?
How does assembly code relate to object code?
Why is code not commonly written by programmers using machine languages?
Why might programmers tend to write programs using high-level languages rather than assembly languages?
Name 3 high-level languages.
What 2 methods can be used to translate source code to object code?
What is a compiler?
What is an interpreter?
What is the difference between a compiler and an interpreter?
Not including errors, what are the 4 steps of compiling a program?
What is the relationship between each step in the compilation pipeline?
What is the -o flag in g++ used for?
What is the -E flag in g++ used for?
What is the -S flag in g++ used for?
What is the -c flag in g++ used for?
What is the preprocessor?
What 4 functions does the preprocessor serve?
Write the command to preprocess a file named program.cpp to a preprocessed file program.i.
What is the compiler (just the single step, not the overall process)?
What must be done to source code prior to it being passed to the compiler?
What does the compiler do to a preprocessed program?
Write the command to run just the compiler on a preprocessed file program.i writing the generated assembly code to a file program.s.
What is the assembler?
What program is used to convert assembly code to machine code?
What steps must be taken prior to assembling source code?
Write the command to run the assembler on a compiled program program.s writing the generated object code to a file program.o.
What is an object file?
What is the linker?
What steps must be taken prior to linking source code?
Write the command to run the linker on an assembled program program.o generating a final executable file a.out.
What command is used to run all 4 steps of compiling a program main.cpp?
What are the 4 commands that are automatically run when the command g++ incredicoaster.cpp is run?
What is added to the command g++ soarin.cpp to create the final executable soarin rather than a.out?
Write the command to compile a program take_tickets.cpp to the final executable take_tickets.
What is the loader?
What program is used to load an executable into main memory so it can be run on the CPU?
Write the command to run an executable file a.out.
Write the command to run an executable file take_tickets.
What is syntax?
What is semantics?
What is the difference between syntax and semantics?
What is a compiler error?
What is another name for compiler errors?
What can cause a compiler error?
What step does a programmer go back to when a compiler error occurs?
What is a runtime error?
What is another name for a runtime error?
What steps does a programmer go back to when a runtime error occurs?
If there are multiple compiler or runtime errors, what order should they be solved in?
What is the benefit of solving one error at a time?