YARA: writing rules to search for malware

Contents

  • Rule structure
  • Using YARA
  • Writing rules
  • Automation
  • Useful links

YARA is a popular and versatile tool, sometimes called the Swiss Army knife of malware researchers, though it is useful for other purposes as well. Today, in the spirit of building your own antivirus, we will show how to write rules and use YARA to search for files by their characteristic features.

YARA is an open source tool that helps researchers find and classify malicious samples and even conduct threat hunting. It performs signature analysis based on formal descriptions called YARA rules, which contain indicators of compromise for different types of malware.

The trick is that writing rules is easy and does not take long. That is why YARA is used by AlienVault, Avast, ESET, FireEye, Group-IB, Kaspersky, Trend Micro, VirusTotal, x64dbg and, in general, almost everyone who deals with malware analysis.

YARA rules can be applied not only to executable files but also to documents, libraries, drivers, and so on. They can also be used to scan network traffic, data stores, and memory dumps, and they can be built into other tools such as SIEM systems, anti-phishing products, IDS, and sandboxes.

Let’s figure out what these rules look like and how to compose them.

Rule structure

Rules are usually stored as plain text in a .yar file and consist of two sections:

  • the definitions section (strings) – contains malware-specific constants, hashes, hex fragments, links, and text strings;
  • the condition section (condition) – contains the conditions that decide whether the analyzed file matches.

It looks like this:

rule SomeMalwareName {
    meta:
        author = "AuthorName"
    strings:
        …
    condition:
        …
}

Using YARA

The minimum a rule needs is a name and a condition. For example, the rule below detects objects by their imphash alone (we use the test objects from the previous article):

import "pe"

rule MyLittleAgentTeslaRuleDetect {
    condition:
        pe.imphash() == "b21a7468eedc66a1ef417421057d3157" or
        pe.imphash() == "f34d5f2d4577ed6d9ceec516c1f5a744"
}

Let’s save the file as AT.yar and run it on the directory with Agent Tesla samples. Then let’s look at the result and make sure that the rule has fired on every Agent Tesla sample.

C:\>yara64.exe -r AT.yar AgentTesla
MyLittleAgentTeslaRuleDetect AgentTesla\AT11.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT12.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT13.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT14.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT16.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT2.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT18.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT15.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT10.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT3.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT6.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT7.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT1.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT9.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT4.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT17.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT5.exe
MyLittleAgentTeslaRuleDetect AgentTesla\AT8.exe

The rule fired on all 18 samples.
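To make the idea of an imphash a bit more concrete, here is a minimal Python sketch of how such a value is commonly derived. It assumes the import table has already been parsed into (library, function) pairs; the function name imphash_from_imports and the sample imports are our own illustration, not part of YARA, and real implementations also handle imports by ordinal and other edge cases that we skip here.

```python
import hashlib

# Extensions that are stripped from the library name before hashing.
_STRIPPED_EXTENSIONS = ("dll", "ocx", "sys")


def imphash_from_imports(imports):
    """Return an imphash-style MD5 for an ordered list of (library, function) pairs."""
    parts = []
    for library, function in imports:
        name = library.lower()
        base, dot, ext = name.rpartition(".")
        if dot and ext in _STRIPPED_EXTENSIONS:
            name = base
        parts.append(f"{name}.{function.lower()}")
    # The import order is kept as-is, so reordering the imports changes the hash.
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Hypothetical import list, for illustration only.
    sample = [("KERNEL32.dll", "CreateFileA"), ("USER32.dll", "MessageBoxW")]
    print(imphash_from_imports(sample))
```

Because the value depends only on which functions are imported and in what order, samples built from the same code base tend to share it, which is what lets two hashes cover a whole directory of samples.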

As you can see, YARA rules can import useful modules, and you can also write modules of your own. Here are some of the most commonly used ones:

  • pe – functions needed when working with Portable Executable objects, for example, imphash checksum, creation timestamp, location of sections;
  • hash – calculation of checksums and cryptographic hashes;
  • math – mathematical calculations such as arithmetic mean or entropy.

You can always find the full list in the official YARA documentation.
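If you want to precompute the values that the hash and math modules work with, the same numbers are easy to get in Python. The sketch below is our own illustration using only the standard library; md5_hex and shannon_entropy are hypothetical helper names, and we assume entropy is measured in bits per byte.

```python
import hashlib
import math
from collections import Counter


def md5_hex(data: bytes) -> str:
    """MD5 of the whole buffer, the kind of value compared against hash.md5(0, filesize)."""
    return hashlib.md5(data).hexdigest()


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    blob = bytes(range(256)) * 4
    print(md5_hex(blob), shannon_entropy(blob))
```

High entropy is often a hint that a file or section is packed or encrypted, so a value like this is usually combined with other conditions rather than used alone.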

Writing rules

In the optional definitions section you can use wildcards, much as in regular expressions. For hexadecimal strings, the following masks are available:

  • the ? symbol – any half-byte (nibble) in its place, so ?? stands for any whole byte;
  • ranges ([4-6] – any 4 to 6 bytes);
  • alternatives (logical OR – |).
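One way to get a feel for these masks is to translate them into an ordinary regular expression over bytes. The Python sketch below does that for a small subset of the syntax (fixed bytes, ? wildcards, bounded ranges, and alternatives). It is only an illustration under our own assumptions: hex_pattern_to_regex is a hypothetical helper, and this is not how YARA itself matches patterns.

```python
import re

# Tokens of a YARA-style hex string: ranges like [4-6], grouping symbols, byte masks.
_TOKEN = re.compile(r"\[\d+(?:-\d+)?\]|[()|]|[0-9A-Fa-f?]{2}")


def _byte_to_regex(token):
    high, low = token
    if high == "?" and low == "?":
        return "."  # ?? matches any byte
    if low == "?":
        base = int(high, 16) << 4  # 5? covers 0x50..0x5F
        return "[\\x%02x-\\x%02x]" % (base, base | 0x0F)
    if high == "?":
        nibble = int(low, 16)  # ?5 covers 0x05, 0x15, ..., 0xF5
        return "[" + "".join("\\x%02x" % ((h << 4) | nibble) for h in range(16)) + "]"
    return "\\x%02x" % int(token, 16)  # fixed byte


def hex_pattern_to_regex(pattern):
    """Compile a (subset of a) YARA hex string into a bytes regular expression."""
    parts = []
    for token in _TOKEN.findall(pattern):
        if token == "(":
            parts.append("(?:")
        elif token in (")", "|"):
            parts.append(token)
        elif token.startswith("["):
            bounds = token[1:-1].split("-")  # "4-6" -> ["4", "6"]
            parts.append(".{%s}" % ",".join(bounds))
        else:
            parts.append(_byte_to_regex(token))
    # DOTALL so that "." also matches the newline byte 0x0A.
    return re.compile("".join(parts).encode("ascii"), re.DOTALL)


if __name__ == "__main__":
    rx = hex_pattern_to_regex("{ 59 (E9 D8|FF) 1A 00 00 5? E8 [4-6] 94 8F }")
    print(bool(rx.search(b"\x59\xff\x1a\x00\x00\x5a\xe8AAAAA\x94\x8f")))
```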

Let’s look at another example, with several rules that show the variety of YARA’s detection capabilities:

import "hash"
rule MyExe_RUFUS {
    strings:
        $my_HEX_string = { E0 38 D2 21 32 4D 1B C1 79 EC 00 70 76 F5 62 B6 }
    condition:
        $my_HEX_string and hash.md5(0, filesize) == "d35936d329ac21ac3b59c525e2054078"
}
rule MyExe_Hercules4 {
    strings:
        $my_HEX_string = { 59 (E9 D8|FF) 1A 00 00 5? E8 [4-6] 94 8F }
    condition:
        $my_HEX_string
}
rule MyURL_Yandex {
    strings:
        $my_URL_string = ""  // the link to search for goes here
    condition:
        $my_URL_string
}
rule MyRar_SFX {
    strings:
        $winrar1 = "WINRAR.SFX" nocase wide ascii
        $winrar2 = ";The comment below contains SFX script commands" wide ascii
        $winrar3 = "Silent=1" wide ascii
        $winrar4 = { E8 65 64 00 00 E9 78 FE FF FF }
        //$winrar4 = "sfx" nocase
    condition:
        1 of them
}

Let’s analyze each of the rules:

  • MyExe_RUFUS – the rule is triggered if the specified byte sequence is found and the MD5 hash of the file matches;
  • MyExe_Hercules4 – an example of a rule that uses a mask over hexadecimal values. 59 (E9 D8|FF) means that byte 59 must be followed by either the two bytes E9 D8 or the single byte FF. 5? means any byte from 50 to 5F. E8 [4-6] 94 means that any 4 to 6 bytes may sit between bytes E8 and 94;
  • MyURL_Yandex – the rule is triggered when a string containing the link is found in the object;
  • MyRar_SFX – an example of a rule that searches for SFX archives. "sfx" nocase means that the string matches in any letter case, for example Sfx or SFX. The wide ascii modifiers mean that the string is searched for both as plain ASCII and in its wide form, with two bytes per character, which is typical for executable binaries.
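The effect of the nocase, ascii, and wide modifiers is easy to reproduce by hand. In the Python sketch below, string_variants and contains are our own hypothetical helpers: the wide form is modelled as the ASCII characters interleaved with zero bytes (UTF-16LE), and nocase as a comparison on lowercased data.

```python
def string_variants(text, as_ascii=True, as_wide=False):
    """Byte sequences that a text string with the given modifiers would cover."""
    variants = []
    if as_ascii:
        variants.append(text.encode("ascii"))
    if as_wide:
        # Two bytes per character: "Ab" becomes 41 00 62 00.
        variants.append(text.encode("utf-16-le"))
    return variants


def contains(data, text, nocase=False, as_ascii=True, as_wide=False):
    """True if any selected variant of text occurs in data."""
    if nocase:
        # bytes.lower() only touches ASCII letters, so the zero bytes stay intact.
        data, text = data.lower(), text.lower()
    return any(v in data for v in string_variants(text, as_ascii, as_wide))


if __name__ == "__main__":
    wide_blob = "WinRAR.sfx".encode("utf-16-le")
    print(contains(wide_blob, "WINRAR.SFX", nocase=True, as_wide=True))
```

This is why a rule without wide can miss strings stored in the resources of a Windows binary even though they look identical in a strings dump.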

Let’s run our rules from MyExe.yar against the test files in C:\SAMPLES and evaluate the result.

C:\>yara64.exe -r MyExe.yar C:\SAMPLES
MyExe_Hercules4 C:\SAMPLES\Hercules4.exe
MyExe_RUFUS C:\SAMPLES\rufus-2.17p.exe
MyURL_Yandex C:\SAMPLES\Dear_Recipient.pdf

The test PDF and DOCX documents have identical content, but YARA did not find the link string in the DOCX, because a DOCX file is a ZIP archive and its text is stored compressed. To work with such files you would need to add a custom module that handles archives, but we will not go into that here.
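The DOCX behaviour is easy to reproduce without YARA. The Python sketch below builds a small DOCX-like ZIP in memory and shows that a plain byte search over the archive misses a string that is clearly present once the members are decompressed. The helper names and the example.com link are hypothetical, and a real DOCX contains more parts than the single word/document.xml we create here.

```python
import io
import zipfile


def make_docx_like(text):
    """Build a minimal ZIP archive with one deflated member holding the text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", b"<w:t>" + text + b"</w:t>")
    return buffer.getvalue()


def raw_contains(blob, needle):
    """What a plain string search sees: the bytes of the archive as stored on disk."""
    return needle in blob


def zip_members_contain(blob, needle):
    """Decompress every member of the archive and search inside it."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return any(needle in archive.read(name) for name in archive.namelist())


if __name__ == "__main__":
    link = b"http://example.com/some/link"  # placeholder link for the demo
    blob = make_docx_like(link * 10)
    print(raw_contains(blob, link), zip_members_contain(blob, link))
```

A common workaround along these lines is to unpack archives first and then point YARA at the extracted files.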

The most important thing is to compose the condition correctly, since it decides whether a file matches. A condition usually combines the variables from the definitions section with Boolean operators such as or and and, parentheses, and constants. It can also use special keywords, for instance:

  • filesize – the size of the file;
  • at – requires a string to occur at a given offset;
  • 1 of them – at least one of the strings defined in the rule must be found;
  • any of them – the same as “1 of them”.

And so on; see the documentation for the full list.
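To show how these pieces fit together, here is a rough Python model of a condition such as "filesize < 1KB and $mz at 0 and 1 of them". The helpers scan_strings, at, and n_of_them are hypothetical names of our own; they only mimic the meaning of the keywords and say nothing about how YARA evaluates conditions internally.

```python
def scan_strings(data, strings):
    """Map each string name to the list of offsets where it occurs in data."""
    hits = {}
    for name, pattern in strings.items():
        offsets, start = [], data.find(pattern)
        while start != -1:
            offsets.append(start)
            start = data.find(pattern, start + 1)
        hits[name] = offsets
    return hits


def at(hits, name, offset):
    """Model of '$name at offset': the string must start exactly there."""
    return offset in hits.get(name, [])


def n_of_them(hits, n=1):
    """Model of 'n of them': at least n different strings were found."""
    return sum(1 for offsets in hits.values() if offsets) >= n


if __name__ == "__main__":
    data = b"MZ\x90\x00....Silent=1...."
    hits = scan_strings(data, {"$mz": b"MZ", "$a": b"Silent=1", "$b": b"WINRAR.SFX"})
    print(len(data) < 1024 and at(hits, "$mz", 0) and n_of_them(hits, 1))
```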

Automation

There are three ways to get rules: download ready-made ones (links below), write them yourself (the option for enthusiasts), or have a machine generate them. For the last option there are various generators, for example yabin, YaYaGen, yarGen, and BASS. They mostly build rules from unique strings found in malware fragments, which, as you can imagine, is not the most reliable approach.
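The general idea behind string-based generation can be shown in a few lines of Python. This is a deliberately naive sketch of our own and not the algorithm of any of the tools named above: it takes printable strings shared by all malware samples, drops those that also occur in known-good files, and renders what is left as a rule. All function names are hypothetical, and at least one malware sample is assumed.

```python
import re

# Runs of at least six printable ASCII characters.
_PRINTABLE = re.compile(rb"[\x20-\x7e]{6,}")


def extract_strings(blob):
    """Set of printable strings found in a binary blob."""
    return {m.decode("ascii") for m in _PRINTABLE.findall(blob)}


def candidate_strings(malware_samples, goodware_samples):
    """Strings present in every malware sample and in no goodware sample."""
    common = set.intersection(*(extract_strings(s) for s in malware_samples))
    for blob in goodware_samples:
        common -= extract_strings(blob)
    return sorted(common)


def render_rule(name, strings):
    """Render the chosen strings as the text of a YARA rule."""
    lines = [f"rule {name}", "{", "    strings:"]
    for index, text in enumerate(strings, 1):
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{index} = "{escaped}" ascii wide')
    lines += ["    condition:", "        all of them", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    malware = [b"\x00evil_config_key\x00hello world!\x02", b"\xffhello world!\x00evil_config_key"]
    goodware = [b"\x00hello world!\x00"]
    print(render_rule("Generated_Sample", candidate_strings(malware, goodware)))
```

The weakness is visible right away: an author only has to rename or encrypt a string for a rule like this to stop matching, which is why generated rules are best treated as a draft to review by hand.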

Useful links

Only one question remains: where do you get all these ready-made indicators of compromise (IoC) and rules? The answer: from the official repository (and its mirror) and from the many rule collections listed on GitHub.

By the way, you can combine all the rules into a single .yar file. The only requirement is that every rule has a unique name, since the name is what the program prints in its output.
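When rules come from several collections, name clashes are the first thing to check before merging. The Python sketch below is one possible way to do it, assuming the rule files are already loaded as text; rule_names and merge_rules are hypothetical helpers, and the regular expression is a simplification that would also pick up the word rule inside comments.

```python
import re
from collections import Counter

# A rule declaration at the start of a line, optionally private and/or global.
_RULE_NAME = re.compile(r"^\s*(?:(?:private|global)\s+)*rule\s+(\w+)", re.MULTILINE)


def rule_names(source):
    """Names of all rules declared in one rule file."""
    return _RULE_NAME.findall(source)


def merge_rules(sources):
    """Concatenate rule files, refusing to merge if any rule name repeats."""
    names = Counter(name for text in sources for name in rule_names(text))
    duplicates = sorted(name for name, count in names.items() if count > 1)
    if duplicates:
        raise ValueError("duplicate rule names: " + ", ".join(duplicates))
    return "\n\n".join(text.strip() for text in sources) + "\n"


if __name__ == "__main__":
    first = "rule A { condition: true }"
    second = "private rule B { condition: true }"
    print(merge_rules([first, second]))
```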


Written by Admin
