PHP Analyser internal development information.
This page contains information useful to anyone working on the development of the PHP application analyser. For user information see
PHPAnalysis
Running the unit tests for the PHP Analyser
There are two sets of tests for the scanner code. The base Java class (PHPAnalysis.java) is part of the sMash PHP engine and has a set of PHPT unit tests, which can be found in the sMash PHP source tree. Most of the scanner code is written in PHP and is tested by a set of PHPUnit tests, which can be found in the Project Zero repository under MODULES/zero.analyzer.php.tests. Running these tests requires an installation of PHPUnit. The instructions for using PHPUnit are given
here
Internal structure of the PHP Analyser
The Java class which deals with obtaining analysis data for a script is called PHPAnalysis (com.ibm.p8.engine.library.analysis.PHPAnalysis). This class can be called directly from PHP code using the PHP Java bridge.
PHPAnalysis instantiates a second RuntimeManager instance, which analyses the script and returns the results in a form that the calling PHP code receives as PHP arrays.
The PHP code is straightforward: it does no more than organise the scanners' results and determine which of the functions that the application requires have not yet been implemented in PHP in sMash.
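The "which functions are missing" step can be sketched as a simple set difference. This is only an illustration, not the analyser's actual code: the helper name findUnimplementedFunctions, its parameters and the sample data are all assumptions, and function_exists stands in for whatever check the real code uses to decide whether the runtime implements a function.

```php
<?php
/**
 * Hypothetical helper: return the invoked functions that are neither
 * declared by the application itself nor provided by the runtime.
 *
 * $invoked    - shaped like the result of getInvokedFunctions()
 * $declared   - declared and conditional function names, merged
 * $isProvided - callback deciding whether the runtime implements a name
 */
function findUnimplementedFunctions(array $invoked, array $declared, $isProvided)
{
    // PHP function names are case-insensitive, so compare in lower case.
    $declaredLookup = array_flip(array_map('strtolower', $declared));

    $missing = array();
    foreach (array_unique(array_map('strtolower', $invoked)) as $name) {
        if (!isset($declaredLookup[$name]) && !call_user_func($isProvided, $name)) {
            $missing[] = $name;
        }
    }
    sort($missing);
    return $missing;
}

// Sample data for illustration only; 'frobnicate' is a made-up name.
$invoked  = array('strlen', 'x', 'frobnicate', 'strlen');
$declared = array('x', 'y', 'z');

print_r(findUnimplementedFunctions($invoked, $declared, 'function_exists'));
```

Passing the check as a callback keeps the sketch independent of how the list of implemented functions is actually obtained.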
The output from the PHPAnalysis.java class is less obvious and so is documented in detail below.
Summary of returned types
| method name | return type | example |
| getConditionalClasses | array() | 0=>classname1, 1=>classname2... |
| getDeclaredClasses | array() | 0=>classname1, 1=>classname2... |
| getConditionalMethods | array() | 0=>classname1::methodname1, 1=>classname1::methodname2.... |
| getDeclaredMethods | array() | 0=>classname1::methodname1, 1=>classname1::methodname2.... |
| getConditionalFunctions | array() | 0=>functionname1, 1=>functionname2... |
| getDeclaredFunctions | array() | 0=>functionname1, 1=>functionname2... |
| getInvokedMethods | array() | 0=>methodname1, 1=>methodname2.... |
| getInvokedClassStaticMethods | array() | 0=>classname1::methodname1, 1=>classname1::methodname2.... |
| getInvokedFunctions | array() | 0=>functionname1, 1=>functionname2.... |
| getSuperClasses | hash() | class1extends => class2, class3extends=>class4.... |
| getInterfaces | hash() | interface1extends=>array(0=>interface2), class1implements=>array(0=>interface1, 1=>interface2....), interface3extends=>array(0=>interface4)...... |
| getInvokedComplexClassStaticMethods | array() | 0=>"bar::Astclass_function_call (no token) [id:134] {line:13}AstTerminal_T_VARIABLE ("$a") [id:130] {line:13}" |
| getInvokedComplexFunctions | array() | 0=>"Astfunction_call (no token) [id:114] {line:11}AstTerminal_T_VARIABLE ("$a") [id:110] {line:11}" |
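As an illustration of working with these shapes, the flat "classname::methodname" lists can be regrouped by class. This is a minimal sketch: groupMethodsByClass is a hypothetical helper, and the sample array (including 'fish::eats') is made up to resemble the output of getDeclaredMethods.

```php
<?php
/**
 * Hypothetical helper: turn a flat list of "classname::methodname" strings
 * into a map of class name => list of method names.
 */
function groupMethodsByClass(array $qualifiedNames)
{
    $byClass = array();
    foreach ($qualifiedNames as $qualifiedName) {
        // Split on the first "::" only.
        $parts = explode('::', $qualifiedName, 2);
        if (count($parts) !== 2) {
            continue; // not in class::method form, ignore it
        }
        list($class, $method) = $parts;
        $byClass[$class][] = $method;
    }
    return $byClass;
}

// Sample data shaped like a getDeclaredMethods() result.
$declaredMethods = array('animal::breathes', 'fish::swims', 'fish::eats');
print_r(groupMethodsByClass($declaredMethods));
```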
Explanation of the relevant methods of PHPAnalysis
getConditionalClasses
A conditional class is one which is declared within a conditional statement in PHP. In the following PHP example there are two conditional classes:
<?php
$a = true;
if ($a) {
    class animal {
        public static function breathes() { echo "An animal breathes\n"; }
    }
}
animal::breathes();
$b = false;
if ($b) {
    class fish {
        public static function swims() { echo "A fish swims\n"; }
    }
}
if (class_exists('fish')) {
    fish::swims();
} else {
    echo "Class fish does not exist\n";
}
?>
The PHPAnalysis method getConditionalClasses returns an array of class names: animal, fish
getConditionalFunctions
A conditional function is one that is defined inside some form of conditional statement. For example:
$a = true;
if ($a) {
    function x() { echo "My name is x\n"; }
}
x();
$b = false;
if ($b) {
    function y() { echo "My name is y\n"; }
    $c = false;
    if ($c) {
        function z() { echo "My name is z\n"; }
    }
}
if (function_exists('y')) {
    y();
}
The PHPAnalysis method getConditionalFunctions returns an array of function names: x, y, z
getConditionalMethods
Somewhat confusingly, a conditional method is just a method that is declared within a conditional class. In the sample code for conditional classes (above), both breathes() and swims() are returned by getConditionalMethods().
getDeclaredClasses and getDeclaredMethods
These return lists of declared classes and their methods. For example:
class animal {
    public static function breathes() { echo "An animal breathes\n"; }
}
animal::breathes();
class fish {
    public function swims() { echo "A fish swims\n"; }
}
$f = new fish();
$f->swims();
Analysing this code returns both 'animal' and 'fish' as declared classes, and animal::breathes and fish::swims as declared methods.
Declared classes include interfaces, and declared methods include interface methods.
getDeclaredFunctions
This returns a list of declared functions that are neither conditional functions nor determined to be complex.
getInterfaces
In PHP, classes can implement several interfaces and interfaces can extend other interfaces. Consider the following sample code:
<?php
interface lifeform {
    function exists();
}
interface animal extends lifeform {
    function breathes();
}
interface legs {
    function numberoflegs();
}
class dog implements legs, animal {
    // Required because animal extends lifeform; without it PHP reports a fatal error.
    function exists() { echo "Dogs exist\n"; }
    function breathes() { echo "Dogs breathe\n"; }
    function numberoflegs() { echo "Dogs have 4 legs\n"; }
}
$d = new dog();
$d->breathes();
$d->numberoflegs();
?>
In this case the output from getInterfaces() is an array of arrays whose keys are a mixture of interface names and class names. The mixture is best illustrated like this:
'interface name' => array()
'class name' => array()
Each key of the outer array indexes a sub-array.
- If the key is a class name, the elements of its sub-array are the names of the interfaces that the class implements.
- If the key is an interface name, the single element of its sub-array is the name of the interface that it extends.
An empty array can only appear against an interface name, and only when that interface extends nothing. A class name is never followed by an empty array.
The var_dump'd output from analysis of the script looks like this:
array(4) {
  ["animal"]=>
  array(1) {
    [0]=>
    string(8) "lifeform"
  }
  ["dog"]=>
  array(2) {
    [0]=>
    string(4) "legs"
    [1]=>
    string(6) "animal"
  }
  ["lifeform"]=>
  array(0) {
  }
  ["legs"]=>
  array(0) {
  }
}
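Because the map only records direct relationships, finding every interface that a class ultimately satisfies means following the sub-arrays. The following is a minimal sketch of that walk, not part of the analyser: allInterfacesOf is a hypothetical helper, and the input array is copied from the var_dump output above.

```php
<?php
/**
 * Hypothetical helper: collect every interface reachable from $name by
 * following the sub-arrays of a getInterfaces()-shaped map.
 */
function allInterfacesOf($name, array $interfaceMap)
{
    $found = array();
    $pending = isset($interfaceMap[$name]) ? $interfaceMap[$name] : array();
    while (count($pending) > 0) {
        $current = array_shift($pending);
        if (isset($found[$current])) {
            continue; // already visited
        }
        $found[$current] = true;
        if (isset($interfaceMap[$current])) {
            // Queue the interfaces that this interface extends.
            foreach ($interfaceMap[$current] as $parent) {
                $pending[] = $parent;
            }
        }
    }
    return array_keys($found);
}

// The same data as the var_dump output for the sample script.
$interfaceMap = array(
    'animal'   => array('lifeform'),
    'dog'      => array('legs', 'animal'),
    'lifeform' => array(),
    'legs'     => array(),
);
print_r(allInterfacesOf('dog', $interfaceMap));
```

For 'dog' this yields legs, animal and lifeform, in the order in which they are discovered.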
getInvokedClassStaticMethods
This returns invoked static methods. For example, analysing the code:
class animal {
    public static function breathes() { echo "An animal breathes\n"; }
}
animal::breathes();
and var_dump'ing the array returned by getInvokedClassStaticMethods() gives:
array(1) {
  [0]=>
  string(16) "animal::breathes"
}
TODO: write some 'self' and 'parent' tests.
getInvokedComplexClassStaticMethods
This is more complicated. A complex method is a method that is called, but whose name the analyser cannot reliably determine.
For example:
class bar {
    public static function foo() {
        echo "executing foo\n";
    }
}
$a = 'foo';
bar::$a();
In this case the analyser can determine that a method belonging to class 'bar' has been called but cannot determine the name, only that it is whatever the variable $a contains at run time.
In this case getInvokedComplexClassStaticMethods() will return:
array(1) {
  [0]=>
  string(105) "bar::Astclass_function_call (no token) [id:134] {line:13}AstTerminal_T_VARIABLE ("$a") [id:130] {line:13}"
}
The best that the PHP code can do is to issue a Warning: giving the name of the file being analysed, the line number and the variable name.
getInvokedComplexFunctions
Here again the analyser can determine that a function has been called but cannot reliably determine its name.
function foo() {
    echo "executing foo\n";
}
$a = 'foo';
$a();
In this case getInvokedComplexFunctions() will return:
array(1) {
  [0]=>
  string(94) "Astfunction_call (no token) [id:114] {line:11}AstTerminal_T_VARIABLE ("$a") [id:110] {line:11}"
}
Again, the best strategy for the PHP code is to produce a Warning: with a line number.
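One way to build such a warning is to pull the class name (if any), the variable name and the line number out of the returned string with regular expressions. The sketch below assumes that every entry follows the format of the two samples shown on this page; describeComplexCall, the wording of the message and the file name example.php are all hypothetical.

```php
<?php
/**
 * Hypothetical helper: build a warning message from one entry of
 * getInvokedComplexClassStaticMethods() or getInvokedComplexFunctions().
 * The patterns are inferred from the two sample strings only.
 */
function describeComplexCall($entry, $fileName)
{
    // Static method entries start with "classname::"; function entries do not.
    $class = preg_match('/^(\w+)::/', $entry, $m) ? $m[1] : null;

    // The variable holding the name, e.g. T_VARIABLE ("$a").
    $variable = preg_match('/T_VARIABLE \("(\$\w+)"\)/', $entry, $m) ? $m[1] : '(unknown)';

    // The first {line:N} marker gives the line of the call.
    $line = preg_match('/\{line:(\d+)\}/', $entry, $m) ? (int) $m[1] : 0;

    $target = ($class === null) ? $variable . '()' : $class . '::' . $variable . '()';
    return sprintf('Warning: cannot resolve call to %s in %s on line %d', $target, $fileName, $line);
}

$entries = array(
    'bar::Astclass_function_call (no token) [id:134] {line:13}AstTerminal_T_VARIABLE ("$a") [id:130] {line:13}',
    'Astfunction_call (no token) [id:114] {line:11}AstTerminal_T_VARIABLE ("$a") [id:110] {line:11}',
);
foreach ($entries as $entry) {
    echo describeComplexCall($entry, 'example.php'), "\n";
}
```

Entries that do not match the assumed format fall back to '(unknown)' and line 0 rather than failing, so a change in the string format degrades the message instead of breaking the scan.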