Reverse engineering as a part of software engineering
The term reverse engineering refers to disassembling an object and thoroughly examining its composition or construction, in order to understand how it works and to duplicate or upgrade it.
Adopted from older industries, the practice of reverse engineering is now widely used in the software engineering sector.
In this digital-information age, reverse engineering has become a way to create compatible products that are cheaper than the existing ones, or even free in some cases, to modify software in unique ways, and to exchange knowledge that leads to better, more reliable, and more secure products.
It can be applied to various aspects of both software and hardware development: to understand how they behave under various conditions, to retrieve source code that was lost, to fix issues, to adapt existing software to new hardware, etc.
Reverse engineering is also widely used to identify malicious content in the code of a piece of software, such as viruses, or to expose security flaws (backdoors, viruses, misconfigurations) and address possible privacy issues.
Researchers can also reverse engineer malware to understand how it works, neutralize it, identify its likely author, and use the knowledge gained to update their virus databases and prepare mitigation measures for future malware attacks.
Legalities of Reverse Engineering
The law does not discourage taking apart products that are publicly available, in almost any technology that exists, including electronic, chemical, mechanical, and software engineering.
However, reverse engineering a prototype or the source code of a program before its release can result in legal consequences if it is proven.
Patent law is one of the laws that protect inventions: it prevents other parties from copying an invention, including by reverse engineering it.
In return, to obtain a patent the developer must disclose the technical details of the invention.
The Supreme Court of the United States, however, has established standards that prevent many software inventions from being eligible for patent protection in the first place.
Copyright law protects software from both direct copying and close paraphrasing. It covers details such as software graphics, interface design, and file structure and organization.
Copyright law also protects the software code from being reconstructed: a third party can break the law by copying the key elements of the original software, even if the copy does not reproduce the original code line by line.
However, the developer has to disclose the source code as part of the registration process. A downside of this law is that, when a case is opened about a potential copyright violation, the developer must prove both that the copier had access to the source code and that the copied code is similar to the original.
Reverse Engineering Tools Overview
Reverse engineering tools are a must for the “library” of a hacker, software developer, or security researcher.
Using reverse engineering, hackers can compromise security systems; reverse engineering programs allow them to manipulate data into a useful form, thanks to the development of digitizing devices.
To be done well, reverse engineering needs the appropriate knowledge combined with the proper tools.
There is a variety of reverse engineering tools, which can be divided into categories by use: tracking an application as it runs in real time, disassembling binary code into assembly code, viewing and editing binaries or the resources embedded in EXE files, and so on.
Disassemblers and Decompilers
Once a user knows what he or she is dealing with, meaning which programming language and compiler the target software uses, the analysis can start. The task of these tools is to analyze a compiled binary file and display the structure of its code in a form a human can understand.
By extracting strings, functions, libraries, etc., a user can learn which fragments of code reference them, which operating system functions the program uses, and which functions it exports.
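String extraction is usually the simplest of these steps. As a rough sketch of the idea (not how any particular disassembler implements it), the following Python function scans raw bytes for runs of printable ASCII characters. The sample bytes and the minimum length of 4 are assumptions chosen for illustration.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII characters at least min_length long."""
    # Bytes 0x20-0x7e are the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical sample: bytes that mimic a fragment of a compiled binary.
sample = b"\x00\x01kernel32.dll\x00\xff\xfeCreateFileA\x00ab\x00"
print(extract_strings(sample))
```

Names such as imported libraries and API functions often show up this way, which gives a first hint about what the program does before any disassembly.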
Disassemblers translate the machine code of the program into assembly language, for example x86 or x64 assembly.
Decompilers attempt to recreate high-level code from the code the program was compiled to.
By exposing CPU registers, hex dumps of programs, and more, debuggers help programmers track an application as it runs in real time, observe how certain instructions affect the contents of memory, edit assembly code in real time, and detect possible errors (“bugs”).
When the program is open source, debugging the high-level code is easy compared to when you don’t have access to the source code.
Dedicated debuggers that perform advanced analysis of binary application structures can solve this issue but, in the end, their use requires knowledge of low-level languages as well as of the workings of the processor for which the program was compiled.
Hex editors allow the developer to view or edit binaries in order to make corrections or fixes, according to software requirements.
A hex editor makes it possible to manipulate the fundamental binary data that makes up a computer file.
There is a variety of hex editors with different functions and applications, some of which can visually display the internal structure of a file.
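To make the hex editor view concrete, here is a minimal Python sketch that formats bytes the way such editors typically display them: an offset column, the byte values in hexadecimal, and the printable ASCII characters. The row width of 16 and the sample bytes are assumptions for illustration.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII, one row per `width` bytes."""
    pad = width * 3 - 1  # length of a full row of "xx xx xx ..." text
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as "." in the text column.
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        rows.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(rows)


# Hypothetical sample: the first bytes of a made-up file.
print(hex_dump(b"MZ\x90\x00Hello, reverse engineering!"))
```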
The resources of a Windows application, such as icons, images, localized texts, and version information, can be saved in PE files within the resources area.
Resource editors allow viewing and editing the resources that are embedded in the EXE file.
Because these resources are stored inside EXE or DLL files, a developer who needs to change some of the application's data can edit it with a hex editor, as long as its size remains unchanged.
When the size needs to change (bigger images, longer text) or new data has to be added, a resource editor is the right tool.
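The size constraint can be illustrated with a short Python sketch. It replaces a byte sequence only when the replacement fits in the original space, padding with zero bytes if it is shorter, so that nothing after it in the file moves. This is a simplified assumption about the data: real PE resources often store text as UTF-16 and carry length fields, which this sketch ignores.

```python
def patch_same_size(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace the first occurrence of `old` with `new` without changing the total size."""
    if len(new) > len(old):
        # A longer value would shift everything after it; that is the resource editor case.
        raise ValueError("replacement is longer than the original")
    offset = data.find(old)
    if offset == -1:
        raise ValueError("original bytes not found")
    padded = new.ljust(len(old), b"\x00")  # pad shorter values with zero bytes
    return data[:offset] + padded + data[offset + len(old):]


# Hypothetical sample: a version string embedded between other bytes.
image = b"\x00\x00Version 1.0\x00\x00"
patched = patch_same_size(image, b"Version 1.0", b"Version 2.0")
print(patched, len(patched) == len(image))
```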
Identifiers can help when we are not sure how the target program was created; they can distinguish features like section names, imported libraries, etc.
Using identifier analysis, which relies on a signature database to identify compilers, cryptographic libraries, security systems, etc., developers can decide what their next step should be (e.g. unpacking the application).
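A signature database can be pictured as a table of known byte patterns. The Python sketch below is a deliberately tiny version of the idea: it recognizes a few well-known file-format magic numbers and uses the "UPX!" marker as a heuristic hint that a file was packed with UPX. Real identifiers use far larger and more precise signature sets; the table here is an illustrative assumption.

```python
# A few well-known magic numbers found at the start of a file.
SIGNATURES = [
    (b"MZ", "PE/DOS executable (Windows EXE or DLL)"),
    (b"\x7fELF", "ELF executable"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit executable (little-endian)"),
    (b"PK\x03\x04", "ZIP archive (also used by APK and JAR files)"),
]


def identify(data: bytes) -> str:
    """Return a description of the file format based on its leading bytes."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic only: UPX-packed files normally contain the marker b'UPX!'."""
    return b"UPX!" in data


# Hypothetical sample bytes, not a real executable.
sample = b"MZ\x90\x00" + b"\x00" * 16 + b"UPX!"
print(identify(sample), looks_upx_packed(sample))
```

If the packer check is positive, unpacking would be a sensible next step before any disassembly.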
Virtual environments can be used to analyze unknown or suspicious software programs.
Running an unknown program in an uncontrolled environment can cause irreversible damage if the program runs a payload in the background, so a virtual environment is a must when dealing with unknown software: we can safely run and analyze it there first.
Reverse Engineering Hacking Tools
In 2017, WikiLeaks released Vault 7, a substantial collection of material about CIA cyber-activities, and through that series of documents Ghidra became publicly known.
In 2019, the NSA officially released the source code of this reverse engineering framework, which was developed in the US.
Its suite includes tools for analyzing code compiled for various platforms such as Windows, macOS, and Linux.
It can be run in either graphical or command-line mode. Its GUI is designed for less expert users, and the suite features an assembler, a disassembler, a decompiler, and support for many processor instruction sets and executable formats.
Its headless mode enables reverse engineering at scale, and Ghidra can also be run as a server to enable group collaboration when dealing with larger binaries.
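Headless mode is driven from the command line through the analyzeHeadless launcher in Ghidra's support directory. As a hedged sketch, the Python helper below only assembles such a command line; it does not run Ghidra. The install path, project location, project name, binary, and script name are all hypothetical placeholders, and on Windows the launcher is a .bat file.

```python
import os


def headless_command(ghidra_home, project_dir, project_name, binary, post_script=None):
    """Build an analyzeHeadless command that imports and analyzes one binary."""
    launcher = os.path.join(ghidra_home, "support", "analyzeHeadless")
    command = [launcher, project_dir, project_name, "-import", binary]
    if post_script:
        # -postScript runs the named script after analysis has finished.
        command += ["-postScript", post_script]
    return command


# Hypothetical paths and names, for illustration only.
print(headless_command("/opt/ghidra", "/tmp/projects", "demo", "sample.bin", "ListFunctions.py"))
```

Building the argument list separately makes it easy to loop over many binaries, which is where headless mode pays off.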
Basic programming experience, along with some knowledge of assembly language, is required.
When decompiling code, if you select a portion of the assembly, Ghidra automatically highlights the corresponding decompiled code in the decompiler window, which is a good way of understanding how the high-level code matches the disassembled code.
Getting Started with Ghidra
To run, Ghidra requires the Java SE Development Kit (JDK) 11 to be installed on the intended device; newer Ghidra releases may require a later JDK.
On opening the program, Ghidra shows a Tip of the Day, which gives new and even experienced users some new information about using the program.
Features of Ghidra
Ghidra comes with contextual help: hovering over most interface elements and pressing F1 opens a pop-up help window with more information about the element.
Organize project sections
Ghidra can organize the sections of your project's disassembly in various ways: right-click the folder of your project, select “Modularize By”, and choose between “Subroutine”, “Complexity Depth”, or “Dominance”.
The next window, under “Program Trees”, is “Symbol Tree”, which lets you view the imports, exports, functions, labels, classes, and namespaces of a binary file.
The “Listing” window is where you can see the reverse-engineered code.
Users can configure the listing fields by clicking the “Edit the listing fields” icon in the top right corner and then opening the “Instruction/Data” tab.
Any element of the listing interface may be changed, relocated, disabled, or removed.
Loading an executable
Ghidra supports drag and drop: a file can be loaded by dropping it into the project window, which launches a dialog box for selecting the format, the destination folder, and the name of the program.
A summary of the import results appears once the file is imported.
If the file has not been analyzed yet, a list of analyzers appears so that the user can enable the ones appropriate for the format of the file.
Modifying Display Elements
When reviewing the target file in CodeBrowser, Ghidra offers customizable display elements that can enhance readability, along with various other options; these are accessed by clicking Edit in the top menu and then selecting Tool Options.
Suggested environment changes:
Listing Display: Increase the font size and enable bold formatting for easier reading.
Listing Fields – Bytes Field: Change “Maximum Lines to Display” to 1 to simplify spacing between lines of assembly code.
Listing Fields – Cursor Text Highlight: Change “Mouse Button to Activate” to left.
This highlights all instances of the selected text when the left mouse button is clicked, similar to other disassemblers.
Listing Fields – EOL Comments Field: Check “Show Semicolon at Start of Each Line” to better separate the assembly text from inserted comments.
Listing Fields – Operands Field: Check “Add Space After Separator” for improved text readability.
View Decompiler Output
Ghidra comes with a built-in decompiler.
It displays a high-level language representation of the assembly code.
Highlighting one of the operators in the decompiler window also highlights the relevant assembly, giving the user a good idea of which groups of assembler instructions match which high-level instructions.
Ghidra includes support for writing Java and Python (via Jython) scripts to automate analysis.
To view built-in scripts, go to Window – Script Manager.
A user can add their own script by choosing the “create a new script” option in the top header menu of the Script Manager window.
Investigate a String Reference
Ghidra provides an overview of the strings embedded within a target file.
To open it, click Window – Defined Strings.
Clicking the row associated with a string populates the Listing window with the data at the corresponding address.
To identify references to a string, right-click the blue area in the Listing window and choose References – Show References to Address.
Function call Graph
The function call graph displays the relationships between functions and provides a high-level overview of function calls.
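Conceptually, a call graph is a directed graph whose nodes are functions and whose edges are calls. The Python sketch below models one as a dictionary and answers two typical questions: which functions are reachable from an entry point, and which functions call a given function. The function names are hypothetical and this is not Ghidra's API, only an illustration of the data structure.

```python
from collections import deque

# Hypothetical call relationships: each function maps to the functions it calls.
CALLS = {
    "main": ["parse_args", "run"],
    "run": ["read_input", "process"],
    "process": ["validate", "read_input"],
    "parse_args": [],
    "read_input": [],
    "validate": [],
}


def reachable_from(graph, start):
    """Return the functions reachable from `start`, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for callee in graph.get(current, []):
            if callee not in seen:
                seen.append(callee)
                queue.append(callee)
    return seen


def callers_of(graph, target):
    """Return the functions that call `target` directly, sorted by name."""
    return sorted(name for name, callees in graph.items() if target in callees)


print(reachable_from(CALLS, "main"))
print(callers_of(CALLS, "read_input"))
```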
Binary Ninja is a reverse engineering tool: a program analysis platform that includes a disassembler. The release described here is not a debugger or a decompiler.
It is available under a personal, commercial, or enterprise license.
The commercial version can use more cores on the machine it is running on, which makes analysis faster, and also supports a headless mode in which the user can create and automate scripts without loading the UI (with the personal license, the user can access the same APIs, but only through the UI).
Features of Binary Ninja
Works on macOS, Windows, or Linux.
Undo (Edit, Undo): users do not have to worry about accidentally losing useful information.
Smooth scrolling and zooming
For scanning through large functions.
Multi-threaded analysis of large binary files. As mentioned above, this is included only in the commercial edition.
An easy-to-write-to intermediate language (IL) powers automatic analysis across all architectures.
Inline assembly editing
For fast binary patches.
Faster than inline editing.
Born from the fast-paced CTF environment.
Simultaneous Live View
A user can write hex and immediately see the changes live in the disassembly.
It is well suited to:
• PE, Mach-O, and ELF file formats
• Raw binary executable code for x86, x64, ARMv7, ARMv8, MIPS, PPC, and others
• Firmware (some knowledge of assembly language is required)
It is not suited to things that are not binary code, such as:
• Web apps
• Virtual machines
Apktool is a tool for reverse engineering Android APK files.
It can be used to reverse engineer third-party, closed, binary Android apps; it can decode resources to nearly their original form and rebuild them.
It can be used for localizing, adding features or support for custom platforms, analyzing applications, and more.
• Disassembling resources to the nearly original form
• Rebuilding decoded resources back to binary APK/JAR
• Organizing and handling APKs that depend on framework resources
• Smali Debugging (Removed in 2.1.0 in favor of IdeaSmali)
• Helping with repetitive tasks
It can make working with an app easier because of its project-like file structure and its automation of repetitive tasks such as building the APK.
Example: decoding the Facebook Lite APK
Use the decode command (d) to decode the given APK file.
After inspecting or changing the decoded files, a user can rebuild them with the build command (b) by giving the folder that contains the decoded files and an output name for the APK (-o).
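The decode and build steps can be sketched as command lines. The Python helpers below, which assume the tool described here is Apktool and that it is installed as apktool on the PATH, only assemble the arguments for the d and b commands with the -o option. The APK and folder names are hypothetical.

```python
import subprocess


def decode_command(apk_path, output_dir):
    """Arguments for decoding an APK into a folder: apktool d <apk> -o <dir>."""
    return ["apktool", "d", apk_path, "-o", output_dir]


def build_command(decoded_dir, output_apk):
    """Arguments for rebuilding a decoded folder: apktool b <dir> -o <apk>."""
    return ["apktool", "b", decoded_dir, "-o", output_apk]


def run_apktool(command):
    """Run a command built above; raises CalledProcessError if apktool fails."""
    subprocess.run(command, check=True)


# Hypothetical file names, for illustration only; nothing is executed here.
print(" ".join(decode_command("facebook_lite.apk", "facebook_lite_decoded")))
print(" ".join(build_command("facebook_lite_decoded", "facebook_lite_rebuilt.apk")))
```

A rebuilt APK is unsigned, so it normally has to be signed again before a device will install it.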
JavaSnoop is a default tool in the reverse engineering category of Kali Linux and can intercept Java applications locally.
It can be started from the terminal with a simple command: root@kali:~# javasnoop
It is a tool developed by Aspect Security to help people intercept Java function calls (e.g. toString) in Java applications.
The tool lets the user attach to a process (like a debugger), intercept a Java function call, view and modify parameter values, and print the stack trace or save function calls.
The following features of the tool make testing easier for any kind of Java-based app.
• Allows easy interception of any method in the JVM
• Allows the editing of return values and parameters
• Allows custom Java to be inserted into any method
• Able to work on any type of Java application (J2SE, Applet, or Java Web Start)
• Able to work on already-running Java processes
• Does not require any target source code (original or decompiled)
JavaSnoop’s interface functionality
The first box
This is where a user can select the class or method to be hooked and intercepted.
The interface provides a button to add a new hook, so the user can add a method from a specific class available in the list.
The second box
This provides options for intercepting the method calls.
The user can set regular expression conditions for matching and intercepting traffic from the method calls.
On execution – Third box (bottom left)
The interface here helps in deciding what to do with a particular hook that the user selected in the first box.
The options include:
• Printing the parameters/stacktrace on to the console or a particular file
• Running custom scripts
• Tampering with parameters
• Tampering with a return value
• Pausing program
Fourth box (bottom right)
This area shows the output from the hooks and the decompiled classes from the target application.
Benefits of reverse engineering:
Using creativity and knowledge, researchers can innovate through reverse engineering by understanding, taking apart, modifying, or even creating new software that is better than before.
• Examining the structures and processes of an application can lead to product innovation
• Knowledge gained from other researchers’ work
• Reconstructing a product that is outdated
• The discovery of software vulnerabilities
• The development of applications that are more efficient and cheaper
Importance of reverse engineering and the dark side of it.
In the security world, researchers use reverse engineering to find security risks in programs, to understand malicious applications, etc.
Cyber-criminals can use the same technique to exploit security bugs in applications; the difference from security researchers is what they do with the vulnerability information.
A report published in April 2019 found that 97% of the 30 mobile financial apps tested lacked binary protection, making it possible to decompile the apps and review their source code.
Decompiling an app can expose all kinds of sensitive information, such as API URLs and API keys hardcoded into the app.
The URLs can lead attackers to nonstandard port numbers, development servers, private keys, and application file directories that developers and quality assurance engineers used for testing, and can ultimately let an attacker compromise entire applications.
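Security teams can run the same kind of search on their own decompiled apps before release. The Python sketch below is a simple illustration with deliberately loose regular expressions: it looks for URLs, for string assignments whose names suggest a key or secret, and for URLs with an explicit port other than 80 or 443. The patterns and the sample source are assumptions, and real scanners use much richer rules.

```python
import re
from urllib.parse import urlsplit

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
# Matches assignments such as API_KEY = "value"; the name list is an illustrative guess.
KEY_PATTERN = re.compile(
    r"\b(api[_-]?key|secret|token)\b\s*[:=]\s*[\"']([^\"']+)[\"']",
    re.IGNORECASE,
)


def find_urls(source):
    """Return every http or https URL found in the text."""
    return URL_PATTERN.findall(source)


def find_keys(source):
    """Return (name, value) pairs for assignments that look like hardcoded secrets."""
    return [(m.group(1), m.group(2)) for m in KEY_PATTERN.finditer(source)]


def nonstandard_port_urls(urls):
    """Return the URLs that name an explicit port other than 80 or 443."""
    return [u for u in urls if urlsplit(u).port not in (None, 80, 443)]


# Hypothetical decompiled snippet; the key and host are made up.
sample = '''
String API_KEY = "abc123-not-a-real-key";
String baseUrl = "https://dev.example.com:8443/api/v1";
'''
urls = find_urls(sample)
print(urls, find_keys(sample), nonstandard_port_urls(urls))
```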
Attackers use many anti-reverse-engineering techniques to prevent detection or to make it more difficult to recover the contents of their malicious apps.
Some of them are:
• Detection of virtual machines and sandboxes, which are used to study malware
• Crypters, which encrypt the executable files
• Exe-protectors, which compress the executable and add code to detect the presence of debuggers and eventually hide or corrupt the true structure of the executable
To counter reverse-engineering attacks, security teams need to know what tools are available, how they work, and lastly what attackers use for evading those tools.