Discovering Valid Java Main Methods

From AtlasWiki
Revision as of 13:27, 6 February 2015 by BenHolland (Talk | contribs) (Analysis Step 2) Select methods named "main")


The Toolbox Commons project defines an Analyzer interface that encapsulates the logic for traversing a program graph to extract an "envelope" (a subgraph that is either empty if a property is satisfied or non-empty containing the necessary information to locate the violation of the property). Analyzers encapsulate their descriptions, assumptions, analysis context, and analysis logic. Of course you can define your own "Analyzer" simply by writing a program with your analysis logic, but we find this abstraction helps keep code organized when contributing to a toolbox project.

Development Process

Let's start off with a simple analysis goal. Write an analyzer that discovers all valid Java main methods in a program. We might want to discover main methods to locate developer test code or alternate entry points into an application.

Step 1) Understanding the problem

The first step is always to ask yourself whether you really understand the problem. Now is the time to do some background research. Can main methods be located in inner classes? Can main methods be final? Can main methods return anything other than void? This blog enumerates several variations on main methods.

Step 2) Developing test cases

A little upfront work to create a decent test set will likely save you a lot of time in the development of your analyzer, since you will be able to quickly identify the cases you are not handling correctly. For this tutorial we've already created an application with several test cases for you! Just check out the https://github.com/EnSoftCorp/LearningAtlas repository and import the MainMethodTestCases Eclipse project into your workspace. Since you didn't have to go through the work of developing the test cases yourself, it's probably a good idea to go through the test cases now. We have created several Java classes, each with a main method. The classes with valid main methods are in the com.example.valid package, whereas the classes with invalid main methods are in the com.example.invalid package.

Step 3) Create Analyzer

From the exercise of going through steps 1 and 2, we can now create a new Analyzer and document our assumptions (at least the assumptions we've made so far).

In the Starter Toolbox create a new class in the toolbox.analysis.analyzers package. Name the class DiscoverMainMethods and extend the com.ensoftcorp.open.toolbox.commons.analysis.Analyzer base class.

Let's take this opportunity to fill out the information that we know so far. We learned that Java main methods must be public, static, void methods named "main" (case-sensitive). Main methods take a single parameter of a one-dimensional String array. The main method may optionally be marked as final, synchronized, or strictfp. After documenting this information, your analyzer should look something like the following.

package toolbox.analysis.analyzers;

import com.ensoftcorp.atlas.core.query.Q;
import com.ensoftcorp.open.toolbox.commons.analysis.Analyzer;

public class DiscoverMainMethods extends Analyzer {
	
	@Override 
	public String getName(){
		return "Discover Main Methods";
	}
	
	@Override
	public String getDescription() {
		return "Locates valid Java main methods.";
	}

	@Override
	public String[] getAssumptions() {
		return new String[]{"Main methods are methods.",
				    "Main methods are case-sensitively named \"main\"",
				    "Main methods are public.", 
				    "Main methods are static.", 
				    "Main methods return void.", 
				    "Main methods take a single String array parameter", 
				    "Main methods may be final.", 
				    "Main methods may have restricted floating point calculations.", 
				    "Main methods may be synchronized."};
	}

	@Override
	protected Q evaluateEnvelope() {
		// TODO: Implement
		return null;
	}
	
}
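
The assumptions documented in getAssumptions can also be written as a plain Java reflection check, which may be handy for sanity-checking test classes outside of Atlas. The following is only a minimal sketch and is not part of Atlas or Toolbox Commons; the class ReflectiveMainCheck and its nested sample classes are hypothetical names introduced here for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical helper for illustration; not part of Atlas or Toolbox Commons. */
final class ReflectiveMainCheck {

    private ReflectiveMainCheck() {}

    /** Returns true if the method satisfies every documented assumption. */
    static boolean isValidMain(Method m) {
        int mods = m.getModifiers();
        Class<?>[] params = m.getParameterTypes();
        return m.getName().equals("main")        // case-sensitive name
                && Modifier.isPublic(mods)
                && Modifier.isStatic(mods)
                && m.getReturnType() == void.class
                && params.length == 1            // exactly one parameter
                && params[0] == String[].class;  // one-dimensional String array
    }

    /** Returns true if the class itself declares at least one valid main method. */
    static boolean declaresValidMain(Class<?> type) {
        for (Method m : type.getDeclaredMethods()) {
            if (isValidMain(m)) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical sample inputs (unrelated to the MainMethodTestCases project).
    static class Valid { public static final synchronized void main(String[] args) {} }
    static class WrongCase { public static void Main(String[] args) {} }
    static class NotStatic { public void main(String[] args) {} }
    static class ReturnsInt { public static int main(String[] args) { return 0; } }
    static class TwoDimensional { public static void main(String[][] args) {} }
}
```

Note that final and synchronized are simply ignored by the check, which matches the "may be" assumptions. A varargs declaration (String... args) compiles to a String[] parameter, so it passes this check as well.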

Step 4) Develop and Debug Analyzer Logic

The evaluateEnvelope method is where we put our analysis logic. The Analyzer base class defines a method getEnvelope that lazily evaluates the result of your analysis defined in the evaluateEnvelope method and caches the result for later. Future calls to getEnvelope return the cached result.
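
The lazy evaluate-then-cache behavior described above can be sketched in a few lines of plain Java. This is an assumption about the general pattern, not the actual Toolbox Commons source; LazyEnvelope and its members are hypothetical names, and the sketch is not thread-safe.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the lazy caching pattern; not the Toolbox Commons source. */
final class LazyEnvelope<T> {
    private final Supplier<T> evaluator; // plays the role of evaluateEnvelope
    private T cached;                    // result of the first evaluation
    private boolean evaluated = false;   // separate flag so a null result is cached too
    private int evaluations = 0;         // counts how often the evaluator actually ran

    LazyEnvelope(Supplier<T> evaluator) {
        this.evaluator = evaluator;
    }

    /** Plays the role of getEnvelope: evaluates on the first call, then reuses the result. */
    T get() {
        if (!evaluated) {
            cached = evaluator.get();
            evaluated = true;
            evaluations++;
        }
        return cached;
    }

    int evaluationCount() {
        return evaluations;
    }
}
```

In this sketch, calling get() several times runs the supplied evaluator only once, which is the reason an expensive query does not have to be recomputed every time the envelope is requested.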

We can run our analyzer on the Atlas shell by instantiating a new DiscoverMainMethods object and calling the getEnvelope method.

var analyzer = new DiscoverMainMethods()
var envelope = analyzer.getEnvelope()
show(envelope)

The show method should fail here because evaluateEnvelope currently returns null, so getEnvelope has nothing to display. We've broken the implementation of our analysis logic into five steps, but keep in mind there is more than one way to implement this analyzer! If you are ambitious, you could try stopping here and implementing your own analyzer, then comparing your solution with ours.

Analysis Step 1) Select public static methods

Let's start off our implementation by making a set of all the public static methods we can find in the program graph. These three properties are all Atlas tags, so we can query for them directly. Note that we use nodesTaggedWithAll here instead of nodesTaggedWithAny because we want nodes that are public and static and methods.

protected Q evaluateEnvelope() {
	// Step 1) select nodes from the index that are marked as public, static, methods
	Q mainMethods = Common.universe().nodesTaggedWithAll(Node.IS_PUBLIC, Node.IS_STATIC, Node.METHOD);
	return mainMethods;
}
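
The difference between the "all" and "any" tag queries used in this step can be illustrated without Atlas. The sketch below models each node as a name with a set of tag strings; TagFilterSketch, the node names, and the tag strings are hypothetical and only mimic the semantics described above, not the Atlas implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of tag queries; strings stand in for graph nodes and tags. */
final class TagFilterSketch {

    private TagFilterSketch() {}

    /** Keeps nodes that carry every one of the requested tags. */
    static Set<String> taggedWithAll(Map<String, Set<String>> tagsByNode, String... tags) {
        List<String> wanted = Arrays.asList(tags);
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : tagsByNode.entrySet()) {
            if (entry.getValue().containsAll(wanted)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** Keeps nodes that carry at least one of the requested tags. */
    static Set<String> taggedWithAny(Map<String, Set<String>> tagsByNode, String... tags) {
        List<String> wanted = Arrays.asList(tags);
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : tagsByNode.entrySet()) {
            if (!Collections.disjoint(entry.getValue(), wanted)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With the "any" variant, a private static field would be selected merely because it is static, which is why the analyzer needs the "all" variant.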

If we run our analyzer now, we should be returning all public static methods in the universe. Let's test it out. If you haven't already, save your DiscoverMainMethods.java file. Now reload the Atlas Shell. It's important to reload the Atlas Shell at this point because if you don't, you will be running the version of DiscoverMainMethods that was compiled the last time you reloaded the shell. After reloading the shell, run the following query.

show(new DiscoverMainMethods().getEnvelope())

After running the query, you will notice an Eclipse job progress monitor in the bottom right corner of the Eclipse window as shown below.

SlowJob1.png

If you included Jar Indexing in your Atlas preferences, then this job will be very slow. This is because you are trying to display a very large graph! Displaying a large graph should be avoided because it will be too large for a human to understand anyway. Let's cancel this job. Click on the Eclipse job progress icon in the bottom right-hand corner of the Eclipse window or open the Eclipse Progress view. Now click on the red cancel job button to the right of the task name to cancel the job as shown in the screenshot below.

SlowJob2.png

Let's find out just how big that graph was. Run the following query in the Atlas Shell to count the number of nodes in the graph.

CommonQueries.nodeSize(new DiscoverMainMethods().getEnvelope())

With Jar Indexing, this graph is about 10,000 nodes (no edges). As a rule of thumb, don't try to show graphs with more than about 100 nodes. You will end up spending too much time trying to understand a graph visually if it is too large. Instead, focus on refining your queries to produce a graph with only the information you need to understand the result.

In our case, we probably don't care about main methods in the Java runtime libraries, so let's refine our queries to exclude those results. The Analyzer base class defines two default analysis contexts for us: context (defaults to Common.universe()) and appContext (defaults to SetDefinitions.app()). An analyzer "context" defines a subgraph of the universe that the results should be calculated in. Given the Analyzer context, the appContext is automatically calculated as the intersection between SetDefinitions.app() and the given context. These analysis contexts can be changed after the Analyzer is instantiated with the setContext method; see the Analyzer Javadocs for Toolbox Commons for more details.
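
The relationship between the two contexts can be pictured as a plain set intersection. The sketch below uses strings in place of graph nodes; ContextSketch and the sample node names are hypothetical, and this is an illustration of the idea rather than the Toolbox Commons implementation.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of deriving an application context by intersection. */
final class ContextSketch {

    private ContextSketch() {}

    /** Returns the nodes that are both application code and inside the chosen context. */
    static Set<String> appContext(Set<String> app, Set<String> context) {
        Set<String> result = new TreeSet<>(app); // copy so the inputs stay untouched
        result.retainAll(context);               // set intersection
        return result;
    }
}
```

When the context is the whole universe, the intersection is just the application set; narrowing the context narrows the application context along with it.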

Let's use the appContext to refine our query.

protected Q evaluateEnvelope() {
	// Step 1) select nodes from the index that are marked as public, static, methods
	Q mainMethods = appContext.nodesTaggedWithAll(Node.IS_PUBLIC, Node.IS_STATIC, Node.METHOD);
	return mainMethods;
}

Reload your Atlas Shell, and run the following.

var analyzer = new DiscoverMainMethods()
var envelope = analyzer.getEnvelope()
CommonQueries.nodeSize(envelope)

Since the graph is small, let's show it.

show(envelope)

You should see the following graph.

Step1.png

Analysis Step 2) Select methods named "main"

Currently we select all public static methods, including methods that are not named "main". Notice that the class Main17 contains a public static method named Main that is included in our results. Let's examine the attributes of the selected nodes and filter based on name.

We are going to open a new Eclipse view to help inspect the attributes and tags of a GraphElement. Navigate to Eclipse or Window > Show View > Other... > Atlas > Element Detail View. Click on the "Main" method in the Main17 class in the graph we displayed in Step 1 (if you closed the graph already, you will need to display it again). Notice that as you click on elements in the graph, the Element Detail View updates to show you the different attributes and tags applied to the selected GraphElement. Notice also that the value of the node name is different for the public static method in the Main17 class. We will use this knowledge to filter out methods with the wrong method name.

ElementDetailView.png

Note: We could also less conveniently access this information on the Atlas Shell using the selected variable.

var graphElement = selected.eval().nodes().getFirst()

Now we can use the selectNode query to select nodes that match the given attribute key and value. Add the following code to your evaluateEnvelope method.

// Step 2) select nodes from the public static methods that are named "main"
mainMethods = mainMethods.selectNode(Node.NAME, "main");

Test your updated analyzer by running the following Atlas Shell commands and noting that the method in Main17 is no longer included in the result. Don't forget to save your analyzer code and reload the Atlas Shell first! This is the last time we will remind you to reload the shell after editing your analyzer.

show(new DiscoverMainMethods().getEnvelope())

Analysis Step 3) Select methods that return void

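TODO: detailed walkthrough. In the Final Implementation below, this step intersects the candidate methods with the methods that have a RETURNS edge to the void type.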

Analysis Step 4) Select methods that only take one parameter

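TODO: detailed walkthrough. In the Final Implementation below, this step removes two groups from the candidates: methods with no PARAM edge (no parameters) and methods that have a parameter with PARAMETER_INDEX equal to 1 (two or more parameters, since index 0 is the first parameter).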

Analysis Step 5) Select methods that take a String array

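TODO: detailed walkthrough. In the Final Implementation below, this step selects the one-dimensional array type whose element type is java.lang.String, keeps only the first parameters whose TYPEOF edge leads to that type, and then steps back along the PARAM edges to the methods that own those parameters.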

Final Implementation

public class DiscoverMainMethods extends Analyzer {
	
	@Override 
	public String getName(){
		return "Discover Main Methods";
	}
	
	@Override
	public String getDescription() {
		return "Locates valid Java main methods.";
	}

	@Override
	public String[] getAssumptions() {
		return new String[]{"Main methods are methods.",
				    "Main methods are case-sensitively named \"main\"",
				    "Main methods are public.", 
				    "Main methods are static.", 
				    "Main methods return void.", 
				    "Main methods take a single String array parameter", 
				    "Main methods may be final.", 
				    "Main methods may have restricted floating point calculations.", 
				    "Main methods may be synchronized."};
	}

	@Override
	protected Q evaluateEnvelope() {
		// Step 1) select nodes from the index that are marked as public, static, methods
		Q mainMethods = appContext.nodesTaggedWithAll(Node.IS_PUBLIC, Node.IS_STATIC, Node.METHOD);
		
		// Step 2) select nodes from the public static methods that are named "main"
		mainMethods = mainMethods.selectNode(Node.NAME, "main");

		// Step 3) filter out methods that are not void return types
		mainMethods = mainMethods.intersection(Common.stepFrom(Common.edges(Edge.RETURNS), Common.types("void")));
		
		// Step 4) filter out methods that do not take exactly one parameter
		Q paramEdgesInContext = appContext.edgesTaggedWithAny(Edge.PARAM).retainEdges();
		// methods with no parameters will not have a PARAM edge
		Q methodsWithNoParams = mainMethods.difference(Common.stepFrom(paramEdgesInContext, Common.stepTo(paramEdgesInContext, mainMethods)));
		// methods with 2 or more params will have at least one edge with PARAMETER_INDEX == 1 (index 0 is the first parameter)
		Q methodsWithTwoOrMoreParams = Common.stepFrom(paramEdgesInContext, Common.stepTo(paramEdgesInContext, mainMethods).selectNode(Node.PARAMETER_INDEX, 1));
		mainMethods = mainMethods.difference(methodsWithNoParams, methodsWithTwoOrMoreParams);
		
		// Step 5) filter out methods that do not take a String array
		// get the 1-dimensional String array type
		Q stringArrays = Common.stepFrom(Common.edges(Edge.ELEMENTTYPE), Common.typeSelect("java.lang","String"));
		Q oneDimensionStringArray = stringArrays.selectNode(Node.DIMENSION, 1);
		Q mainMethodParams = CommonQueries.methodParameter(mainMethods, 0);
		Q validMethodParams = mainMethodParams.intersection(Common.stepFrom(Common.edges(Edge.TYPEOF), oneDimensionStringArray));
		mainMethods = Common.stepFrom(paramEdgesInContext, validMethodParams);

		return mainMethods;
	}
	
}
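
The parameter-count filter in Step 4 relies on two set differences rather than on counting. The idea can be sketched without Atlas by modeling each method as a name mapped to the indices of its parameters; ParamCountSketch and the sample method names are hypothetical and only mirror the reasoning in the comments above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the "exactly one parameter" filter using set differences. */
final class ParamCountSketch {

    private ParamCountSketch() {}

    /** Returns the methods that have exactly one parameter. */
    static Set<String> withExactlyOneParam(Map<String, List<Integer>> paramIndicesByMethod) {
        Set<String> noParams = new TreeSet<>();
        Set<String> twoOrMoreParams = new TreeSet<>();
        for (Map.Entry<String, List<Integer>> entry : paramIndicesByMethod.entrySet()) {
            List<Integer> indices = entry.getValue();
            if (indices.isEmpty()) {
                noParams.add(entry.getKey());        // no parameter at all
            }
            if (indices.contains(1)) {
                twoOrMoreParams.add(entry.getKey()); // index 1 means a second parameter exists
            }
        }
        // Everything that is in neither excluded group has exactly one parameter.
        Set<String> result = new TreeSet<>(paramIndicesByMethod.keySet());
        result.removeAll(noParams);
        result.removeAll(twoOrMoreParams);
        return result;
    }
}
```

Checking only for index 1 is enough to catch every method with two or more parameters, because parameter indices start at 0 and are consecutive.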

Alternative Implementation

TODO

