Quark-Engine Objects Introduction

Dalvik Bytecode Loader (quark.Evaluator.pyeval)


PyEval is inspired by the design of ceval.c in the CPython interpreter, which feeds Python bytecode instructions into an infinite loop and executes them through C code. We apply the same principle to our Dalvik Bytecode Loader: it takes Dalvik bytecode instructions and applies our custom instruction events to track whether two function calls operate on the same variable, which is the fifth crime stage. We have not implemented every bytecode instruction, and conditional jumps are not yet considered.


Take Android’s bytecode instruction invoke-direct as an example. When an invoke-direct instruction is passed to our Dalvik bytecode loader, it enters the PyEval main switch, executes the corresponding INVOKE_DIRECT function, and calls self._invoke(). The separate _invoke function exists for reuse: the invoke family contains many instructions, such as invoke-direct and invoke-virtual, but our program handles all of them the same way.
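The dispatch path described above can be sketched roughly as follows. This is a minimal illustration, not the actual quark.Evaluator.pyeval code: only the names INVOKE_DIRECT and _invoke come from the text, while the class name PyEvalSketch, the eval method, the calls list, and the list-shaped instruction are assumptions made for the example.

```python
class PyEvalSketch:
    """Toy loader that dispatches a Dalvik mnemonic to a handler method."""

    def __init__(self):
        # Hypothetical log of (mnemonic, registers, target) per invoke.
        self.calls = []

    def _invoke(self, instruction):
        # Shared logic for the whole invoke family.
        mnemonic, *registers, target = instruction
        self.calls.append((mnemonic, registers, target))

    def INVOKE_DIRECT(self, instruction):
        self._invoke(instruction)

    def INVOKE_VIRTUAL(self, instruction):
        self._invoke(instruction)

    def eval(self, instruction):
        # Map "invoke-direct" to the handler name "INVOKE_DIRECT".
        handler_name = instruction[0].upper().replace("-", "_")
        handler = getattr(self, handler_name, None)
        if handler is None:
            return False  # instruction not implemented: skip it
        handler(instruction)
        return True


loader = PyEvalSketch()
loader.eval(["invoke-direct", "v3",
             "Lcom/google/progress/APNOperator;->deleteAPN()Z"])
```

Looking up the handler by name stands in for the "main switch"; instructions without a handler are simply skipped, which mirrors the note that not every bytecode instruction is implemented.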



XRule is the object in Quark-Engine responsible for the five-stage inspection. Each APK initializes its own XRule object.


Explanation of each function

  • run:

    • This function is responsible for starting the five-stage inspection.
  • show_summary_report:

    • Show the summary report.
  • show_detail_report:

    • Show the detailed report.
  • find_previous_method:

    • Track the previous function of a specific function.
  • find_intersection:

    • Find whether there is an intersecting function between two given functions.
  • check_sequence:

    • Check whether the two functions are called in order.
  • check_parameter:

    • Check if the two functions operate on the same variable or not.

Five-stage inspection

Each function in this XRule object corresponds to part of the five-stage inspection. The stages are described below in order.

Level 1

Check whether all permission requirements match the given rules.

Level 2

Check whether the native API functions match the given rules (at least one of them appears).

Level 3

Check whether the native API functions all match the given rules (both appear).

Level 4

Check whether the native API functions all match the given rules (both appear in order).

Level 5

Check whether the native API functions all match the given rules (both appear in order and operate on the same variable).
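The five levels build on each other, so the result of an inspection can be read as "the highest level reached". The sketch below illustrates only that idea; the function highest_level and all of its parameters are hypothetical and are not part of XRule, and the subset test used for Level 1 is an assumption about what "match" means.

```python
def highest_level(required_permissions, apk_permissions,
                  api_found, in_order, same_variable):
    """Return the highest consecutive level (0 to 5) that passes.

    api_found is a pair of booleans, one per native API in the rule.
    """
    checks = [
        # Level 1: every permission the rule requires is requested by the APK.
        set(required_permissions) <= set(apk_permissions),
        # Level 2: at least one of the two APIs appears.
        any(api_found),
        # Level 3: both APIs appear.
        all(api_found),
        # Level 4: both appear, in order.
        all(api_found) and in_order,
        # Level 5: both appear, in order, on the same variable.
        all(api_found) and in_order and same_variable,
    ]
    level = 0
    for passed in checks:
        if not passed:
            break  # a failed level ends the inspection
        level += 1
    return level


level = highest_level(
    ["android.permission.SEND_SMS"],
    ["android.permission.SEND_SMS", "android.permission.INTERNET"],
    api_found=(True, True), in_order=True, same_variable=False,
)
print(level)  # 4
```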


Apkinfo is the object in Quark-Engine that stores APK information. It is based on the androguard module.


Explanation of each function

  • permissions:

    • Get all permissions required by the APK.
  • find_method:

    • Query a function by name in the APK.
  • upperfunc:

    • Query the upper-level function call for a given function name in the APK.
  • get_method_bytecode:

    • Returns the Android bytecode corresponding to the given function name.


Bytecodeobject is an object in Quark-Engine that stores information about a single bytecode command.


The bytecode generated by androguard looks like this:

mnemonic        registers  parameter
invoke-virtual  v3         Lcom/google/progress/APNOperator;->deleteAPN()Z

mnemonic: the human-readable form of an instruction, as opposed to the machine code.

registers: the registers used by each instruction in the smali bytecode. They are usually written in the form “v3”, “v4”.

parameter: the function or parameter used in the smali bytecode instruction.
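To make the three fields concrete, here is a small sketch that splits one line of the form shown above into mnemonic, registers, and parameter. It is not the Bytecodeobject implementation: BytecodeLine and parse_line are hypothetical names, and the parsing assumes a non-empty, whitespace-separated line in which registers look like “v” followed by digits.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: registers are written as "v" plus a number, e.g. "v3".
REGISTER = re.compile(r"^v\d+$")


@dataclass
class BytecodeLine:
    mnemonic: str
    registers: List[str]
    parameter: Optional[str]


def parse_line(line: str) -> BytecodeLine:
    tokens = line.split()
    mnemonic, rest = tokens[0], tokens[1:]
    registers = [t for t in rest if REGISTER.match(t)]
    others = [t for t in rest if not REGISTER.match(t)]
    # Everything that is not a register is treated as the parameter.
    parameter = " ".join(others) if others else None
    return BytecodeLine(mnemonic, registers, parameter)


parsed = parse_line(
    "invoke-virtual v3 Lcom/google/progress/APNOperator;->deleteAPN()Z"
)
print(parsed.mnemonic, parsed.registers, parsed.parameter)
```

An instruction with no parameter token leaves parameter as None.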


The ruleobject in Quark-Engine reads a rule in JSON format from a given JSON file, for example, sendLocation_SMS.


Explanation of each function

  • get_score:

    • Returns the weight score based on the five stages of malicious behavior.


The tableobject is used to track the usage of variables in the register. We want to know whether the same variable is used by the two APIs we have defined in our rule.


In the bytecode output generated by androguard, it is difficult to track how the same parameter is used across API calls by watching a single register, because the Dalvik machine often reuses registers. Therefore, we create the TableObject to store the RegisterObjects row by row.

Tableobject is composed of multiple registerobjects as below:


Take this bytecode instruction as an example:

We can know what the v2 register stores, which is used by the send function.

However, the “v2” register may be overwritten by other values later. As in the instruction above, the content of v2 becomes “gps” after executing const-string v2 'gps'.

Therefore, we must reverse the tracking approach: we use the function call name as the tracker and record the current contents of the register in the table.

Once we encounter a function call, such as the invoke-virtual instruction above, we check the parameters it uses and look up the current value of the v2 register. Assuming v2 holds the value “hello”, the function name is recorded, together with the parameter “hello”, in the called_by_func column.

You can take a look at this bytecode example. With this table record, we can track the parameter content of every function call.
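The reversed tracking idea can be sketched with plain dictionaries and tuples. This is only an illustration under assumptions: the track function, the tuple-shaped instructions, and the short function name "send" are stand-ins chosen for the example, and only const-string and the invoke family are handled.

```python
def track(instructions):
    """Record which value each register held when a function used it."""
    current = {}   # register name -> latest value written to it
    history = []   # rows of (register_name, value, called_by_func)
    for mnemonic, registers, parameter in instructions:
        if mnemonic == "const-string":
            # The register is overwritten with a new string constant.
            current[registers[0]] = parameter
        elif mnemonic.startswith("invoke"):
            # Snapshot the value of every argument register at call time.
            for reg in registers:
                history.append((reg, current.get(reg), parameter))
    return current, history


current, history = track([
    ("const-string", ["v2"], "hello"),
    ("invoke-virtual", ["v2"], "send"),
    ("const-string", ["v2"], "gps"),
])
print(current)  # {'v2': 'gps'}
print(history)  # [('v2', 'hello', 'send')]
```

Even though v2 ends up holding “gps”, the history row still shows that send was called with “hello”, which is the information a single-register view would lose.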

Explanation of each function

  • insert:

    • Insert RegisterObject into the nested list in the hashtable.
  • get_obj_list:

    • Return the list which contains the RegisterObject.
  • get_table:

    • Get the entire hash table.
  • pop:

    • Override the built-in pop behavior: return the top element of the stack, which is a RegisterObject, without deleting it.
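One way to picture the four functions above is a list of stacks, one per register index. The sketch below follows that reading; the class name TableSketch, the constructor argument, and the index parameters are assumptions and may not match the real TableObject signatures. Plain tuples stand in for RegisterObject.

```python
class TableSketch:
    """Hash table whose rows are stacks (lists), one per register index."""

    def __init__(self, size):
        self._table = [[] for _ in range(size)]

    def insert(self, index, obj):
        # Push the new state on top of that register's stack.
        self._table[index].append(obj)

    def get_obj_list(self, index):
        # The full history recorded for one register.
        return self._table[index]

    def get_table(self):
        return self._table

    def pop(self, index):
        # Unlike list.pop, this only peeks: the element stays in the table.
        return self._table[index][-1]


table = TableSketch(4)
table.insert(2, ("v2", "hello", None))
table.insert(2, ("v2", "gps", None))
print(table.pop(2))                # ('v2', 'gps', None)
print(len(table.get_obj_list(2)))  # 2
```

Keeping older entries instead of overwriting them is what lets a later lookup recover the value a register held at an earlier call.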


RegisterObject records the state of each register. Each initialized RegisterObject holds a register_name, a value, and a called_by_func.

register_name  value  called_by_func
“v3”           “GPS”  Lcom/google/progress/APNOperator;->deleteAPN()Z

register_name: the register name, such as “v3”, “v4”.

value: the value stored in the register.

called_by_func: the functions that are called with this register as a parameter.

Explanation of each function

  • hash_index:

    • Get the index number from the given VarabileObject; for example, “v34” returns 34.
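A minimal sketch of the three fields and the index computation follows. The class name RegisterSketch is hypothetical, and whether hash_index is a method or a property in the real class is not stated here, so the method form is an assumption.

```python
class RegisterSketch:
    """Holds the state of one register at one point in time."""

    def __init__(self, register_name, value, called_by_func=None):
        self.register_name = register_name
        self.value = value
        self.called_by_func = called_by_func

    def hash_index(self):
        # "v34" -> 34: drop the leading letter and parse the digits.
        return int(self.register_name[1:])


reg = RegisterSketch(
    "v3", "GPS", "Lcom/google/progress/APNOperator;->deleteAPN()Z"
)
print(reg.hash_index())                     # 3
print(RegisterSketch("v34", None).hash_index())  # 34
```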