If you’ve worked in or studied information technology, you’ve probably heard about source code. Most textbooks provide a technical definition, but that definition is hardly practical on its own because programmers have developed more nuanced meanings of the term in practice.
I spoke with three programmers to really understand what source code is, and to flesh out key points professionals learn in the field that you might not find in formal training.
This article explains two key uses: (1) source code as human-written plain text code that a computer interprets, and (2) source code as the code used to produce a specific output in question. The code itself is no different; only the context in which the term is used changes.
The following is a summary of technical and practical information, and it’s intended for finance, data, and business analysts who want a more-than-basic introduction to the topic.
Don’t forget, you can get the free 67 data skills and concepts checklist to cover source code and other fundamentals.
Source code is any code, with or without comments, created by a human in a programming language typed in plain text, that has been translated into binary and executed by a computer, and that has produced a non-error output, whether desirable or not.
That’s a dense definition. It’s easier to consume as a list:
- Any code
- With or without comments
- Written by a human and readable by a human
- Written in plain text
- Has been translated into binary (object code) and executed by a computer
- Has produced a non-error output
- Its output is either desirable or requires modification
Each of these criteria is important, and some of them may be surprising. For example, you may have thought that system-generated code could be considered source code, but strictly speaking source code must be written by a human.
Moreover, you may not have realized that source code must be successfully executed at least once. If not, it’s just draft code.
Finally, it’s important to note that the execution must have produced a non-error output. In the case of an error, it’s still just a draft. Moreover, even if the non-error output is undesirable, we consider the text source code. It simply needs to be fixed.
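The line between draft code and source code under this definition can be sketched in Python. The helper name `runs_without_error` and the three sample snippets are made up for illustration; treat this as a rough sketch of the idea rather than a formal test.

```python
def runs_without_error(code_text: str) -> bool:
    """Return True if the text executes without raising an error."""
    try:
        # exec() translates the text and runs it in a fresh namespace.
        exec(code_text, {})
        return True
    except Exception:
        return False


# Fails when executed (division by zero): still a draft under the definition.
draft = "total = 10 / 0"

# Runs cleanly, but the average uses the wrong divisor.
# The output is undesirable, yet the text counts as source code to be fixed.
buggy = "average = (4 + 6) / 3"

# Runs cleanly and gives the intended result.
working = "average = (4 + 6) / 2"

for name, text in [("draft", draft), ("buggy", buggy), ("working", working)]:
    print(name, runs_without_error(text))
```

Only the first snippet fails the check. The second passes even though its result is wrong, which matches the point that undesirable output does not disqualify the text as source code.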
Source code is also known as computer code.
While the above definition is formal and complex, in practice the term “source code” is simpler. It refers to any code written by a programmer that produced a result people are talking about.
This usage disregards the non-error output criterion, and under it the code doesn’t have to be converted into binary by a computer. It could simply be created in a code simulator like those available on w3schools.
Types of Source Code
The types of source code are:
- Compiled source code
- Interpreted source code
- Computer (or operating system) source code
- Software program source code
- Software feature source code
To discuss types of source code, we need to talk about two concepts: types of translation, and implementation hierarchy.
Types of Translation. All source code must be translated to machine code (also known as binary: 1s and 0s). This translation can happen by compilation or interpretation. These define two types of source code.
Implementation Hierarchy. Source code can be used for outputs to structure a computer (highest level), a software program (mid level), or a software feature (lowest level). More on these below.
Types of Translation: Compiled vs. Interpreted
Compiled source code is translated in full into machine code by a program called a compiler before it runs. Interpreted source code is read by a secondary program, called the interpreter, which analyzes the code and carries it out one statement at a time while the program runs.
Which is more efficient: a compiler or an interpreter?
In practice, a compiled program will generally run faster than an interpreted one. An interpreter analyzes only one statement at a time, while a compiler treats the whole program at once. The interpreter can start on a single statement sooner, but the overall process is faster with a compiler, since translating each statement as it runs adds overhead.
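The difference between treating a whole program at once and one statement at a time can be sketched with Python’s built-in `compile()` and `exec()`. One assumption to keep in mind: Python translates to bytecode rather than machine binary, so this is only an analogy, and the three-line program is made up for illustration.

```python
program = """\
price = 20
quantity = 3
total = price * quantity
"""

# Compiler-style: translate the whole program once, then run the result.
whole = compile(program, "<whole program>", "exec")
compiled_scope = {}
exec(whole, compiled_scope)

# Interpreter-style: translate and run one statement at a time.
interpreted_scope = {}
for statement in program.splitlines():
    step = compile(statement, "<single statement>", "exec")
    exec(step, interpreted_scope)

print(compiled_scope["total"], interpreted_scope["total"])
```

Both routes arrive at the same total. The second one repeats the translation step for every statement, which is where the extra time goes on longer programs.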
Hierarchy 1: Computer (Operating System)
Source code written for the computer to implement its operating system is the highest level on the hierarchy. Conversely, it’s the lowest level of detail and requires the most effort for the smallest output.
Hierarchy 2: Software Program
Source code for a software program is the second highest in the hierarchy because, while it’s not as complex as an operating system, it’s more difficult to construct than a simple feature.
Hierarchy 3: Software Feature
Source code for software features is the simplest source code. It’s also the most common, and the one most programmers will tackle in their day-to-day work.
Examples on a Website
Source code is easy to find for webpages. Right-click anywhere on a page and select “View Page Source.” This will open a new tab with the source code that your browser reads to create the page you see.
Here’s an example using AnalystAnswers.com:
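As a rough illustration of what page source looks like and how a program reads it, here is a sketch in Python using the built-in `html.parser` module. The page below is made up for the example and is not the actual AnalystAnswers.com source; the class name `HeadingFinder` is also hypothetical.

```python
from html.parser import HTMLParser

# A made-up page, standing in for what "View Page Source" would display.
page_source = """\
<html>
  <head><title>Example Page</title></head>
  <body>
    <h1>Hello, analysts</h1>
    <p>This text is what a visitor sees.</p>
  </body>
</html>
"""


class HeadingFinder(HTMLParser):
    """Collect the text inside every <h1> tag."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.headings.append(data.strip())


finder = HeadingFinder()
finder.feed(page_source)
print(finder.headings)
```

The page source is plain text with tags around the content, and a program such as a browser walks through those tags to decide what to display.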
Example Source Code for Data Analysts: SQL
In SQL, source code can be either the creation of data tables or the querying of them. However, querying a database is not as powerful as creating because it does not have innovative value. Anyone can retrieve data, but not everyone can create desired data.
Here’s a short clip in which I create a database, data table, some dummy data, and a query to show it in tabular form.
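A comparable sequence of steps can be sketched with Python’s built-in `sqlite3` module, which runs SQL statements against a small database. The table name, columns, and values are invented for this sketch and are not taken from the clip.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Create a data table (hypothetical name and columns).
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# Insert some dummy data.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 1200.0), ("South", 850.5), ("West", 430.25)],
)

# Query the table and show the rows in tabular form.
rows = conn.execute("SELECT id, region, amount FROM sales ORDER BY id").fetchall()
print(f"{'id':<4}{'region':<10}{'amount':>10}")
for row_id, region, amount in rows:
    print(f"{row_id:<4}{region:<10}{amount:>10.2f}")

conn.close()
```

The `CREATE TABLE` and `INSERT` statements are the creation side of SQL source code, and the `SELECT` statement is the querying side.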
Source Code vs Object Code
While source code is human-written text, object code is computer-written numeric code made of 1s and 0s, or binary. Object code is what the computer reads in order to produce a desired outcome.
Source Code vs Open Source
We now know what source code is, but how does it fit into the open source conversation? Open source is nothing more than the concept that source code is available to the public for viewing and copying.
Open source might as well be called “open source code” or “public source code” because it’s simply the availability of source code to anyone. The best-known place to find open source code is GitHub. Anyone can visit the website to learn new skills or copy existing projects.
Can source code be reverse engineered?
Technically, source code can be reverse engineered from binary using a decompiler. The obvious constraint, however, is that you need the binary code to do so.
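One reason a decompiler can only approximate the original text is that translation discards details meant for humans, such as comments. Here is a small sketch using Python’s built-in `compile()`. It produces bytecode rather than machine binary, so it is an analogy only, and the sample line is made up.

```python
with_comments = "tax = 100 * 0.2  # 20% rate agreed with finance\n"
without_comments = "tax = 100 * 0.2\n"

# compile() stands in for the translation step; nothing is executed here.
translated_a = compile(with_comments, "<example>", "exec")
translated_b = compile(without_comments, "<example>", "exec")

# Compare the translated instructions of the two versions.
same_instructions = translated_a.co_code == translated_b.co_code
print(same_instructions)
```

If the two versions translate to the same instructions, nothing in the translated form records that a comment ever existed, so no tool working from that form can bring it back.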
If you’re looking to copy a great webpage or app using only the front-end code, you won’t be able to reverse engineer it from there. In this way, source code is protected by access rights.
Source Code Repository
A source code repository is the technical term for what’s commercially known as a source code hosting facility. It consists of file archives and documentation systems for source code, which are most useful for teams or companies that work on many different projects.
Source code repositories can be publicly or privately held, and most commercial solutions today include bug tracking features.
Can source code contain a virus?
Most viruses today do not attack source code. Why? Because designing a virus with the ability to be executed on a specific command would require in-depth knowledge of how the specific source code is constructed in the target’s files.
Every programmer writes code in their own way, so targeting a specific execution “in the style” of one programmer would be difficult. Instead, viruses usually attack programs that integrate macros, which are standardized and common programs executed with common inputs such as the click of a mouse or stroke of a key.
Can source code be copyrighted or patented?
To answer this question I consulted a GDPR expert and an intellectual property attorney. The answer is yes: under US law, source code is copyrighted immediately upon its creation. However, in order to enforce the copyright, the creator needs to secure a certificate of registration of a copyrightable work from the United States Copyright Office.
Applications can be filed online and generally take only about 4 months to process. The cost of an application generally ranges from $45 – $65 depending on the number of owners, the number of works, and whether or not the filer hired someone to create the work for them.
Can source code be a trade secret?
A trade secret is a concept and right that refers to any information a company or individual possesses or has created that is valuable because few people know about it. Source code is a perfect example of a trade secret.
According to the World Intellectual Property Organization, source code can only be considered a trade secret when it is:
- commercially valuable because it is secret,
- known only to a limited group of persons, and
- subject to reasonable steps taken by the rightful holder of the information to keep it secret, including the use of confidentiality agreements for business partners and employees.
Source code is what programmers create every day. In most cases, if people are talking about code, we can call it source code. However, technically it must be successfully executed at least once to be considered source code.
You can find the source code used to build websites by right-clicking on any page and selecting “View Page Source.” However, the page source is only the front end. The rest of the site’s code is protected, and you won’t be able to reverse engineer it without access to the object code (binary).
Some source code is available to the public as “open source” on sites such as GitHub.
If you’re worried about viruses, remember that they’re unlikely to attack source code and more likely to attack program macros. A virus must be executed by the code to have an effect, and it’s easier to find the executing snippet in a standardized macro than in unique source code written by a developer.
From a legal perspective, source code is copyrighted the moment it is written. But to enforce that copyright against infringers, you’ll have to go through the roughly 4-month process of registering with the United States Copyright Office, and you’ll need to pay $45 – $65 for the application.
If you found this article helpful, check out more free content on data, finance, and business analysis at the AnalystAnswers.com homepage.