Commit 5ea488b6 authored by Carlos Galindo, committed by GitHub

Update README to describe the CLI and library

Removed WIP and development guidelines.
parent e47b42ca

README.md

0 → 100644
+75 −0
# Java SDG Slicer

A program slicer for Java, based on the system dependence graph (SDG). *Program slicing* is a software analysis technique to extract the subset of statements that are relevant to the value of a variable in a specific statement (the *slicing criterion*). The subset of statements is called a *slice*, and it can be used for debugging, parallelization, clone detection, etc. This repository contains two modules:

* `sdg-core`, a library that obtains slices from Java source code via the SDG, a data structure that represents statements as nodes and their dependencies as arcs.
* `sdg-cli`, a command line client for `sdg-core`, which takes as input a Java program and the slicing criterion, and outputs the corresponding slice.
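
As a rough illustration of the idea (not the algorithm implemented by `sdg-core`, which also has to deal with method calls), a slice can be seen as the set of nodes from which the slicing criterion is reachable by following dependency arcs backwards. The sketch below hand-encodes the dependencies of a tiny program; the class name, the method name and the dependency table are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class BackwardSliceSketch {
    // Hand-written dependencies (an assumption, not computed by any tool):
    // each statement maps to the statements it depends on.
    //   1: int sum = 0;
    //   2: int prod = 1;
    //   3: for (int i = 0; i < 10; i++) {
    //   4:     sum += i;
    //   5:     prod *= 2; }
    //   6: System.out.println(sum);
    //   7: System.out.println(prod);
    static final Map<Integer, List<Integer>> DEPENDS_ON = Map.of(
            1, List.of(),
            2, List.of(),
            3, List.of(3),
            4, List.of(1, 3, 4),
            5, List.of(2, 3, 5),
            6, List.of(1, 4),
            7, List.of(2, 5));

    // Collects every statement reachable from the criterion by walking arcs backwards.
    static Set<Integer> slice(int criterion) {
        Set<Integer> visited = new TreeSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(criterion);
        while (!pending.isEmpty()) {
            int node = pending.pop();
            if (visited.add(node)) {
                for (int dependency : DEPENDS_ON.getOrDefault(node, List.of())) {
                    pending.push(dependency);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(slice(6)); // prints [1, 3, 4, 6]
    }
}
```

Statements 2, 5 and 7 only affect `prod`, so they are left out of the slice for statement 6.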

Warning: all method calls must resolve to a method declaration. If your Java program requires additional libraries, their source code must be available and included in the analysis with the `-i` option. Any method call that cannot be resolved will result in a runtime error.

## Quick start

### Build the project

```
cd sdg-core
mvn install
cd ../sdg-cli
mvn package
cd ..
```

A fat jar containing all the project's dependencies can then be found at `./sdg-cli/target/sdg-cli-{version}-jar-with-dependencies.jar`.

### Slice a Java program

The slicing criterion is specified with the flag `-c {file}#{line}:{var}[!{occurrence}]`, which names the file, the line and the variable. If the variable appears multiple times in the given line, an occurrence can be appended to select one of them (append `!2` to select the second occurrence).
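
To make the format concrete, here is a minimal sketch of how such a criterion string could be split into its parts. It is not the parser used by `sdg-cli`: the class name is hypothetical, and it assumes that the optional occurrence uses the `!` separator shown in the flag pattern and that a missing occurrence can be represented as `0`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CriterionSketch {
    // {file}#{line}:{var}, optionally followed by !{occurrence}
    private static final Pattern FORMAT =
            Pattern.compile("^(.+)#(\\d+):([A-Za-z_$][A-Za-z0-9_$]*)(?:!(\\d+))?$");

    final String file;
    final int line;
    final String variable;
    final int occurrence; // 0 when no occurrence was given (an assumption of this sketch)

    CriterionSketch(String file, int line, String variable, int occurrence) {
        this.file = file;
        this.line = line;
        this.variable = variable;
        this.occurrence = occurrence;
    }

    static CriterionSketch parse(String text) {
        Matcher m = FORMAT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "Expected {file}#{line}:{var}[!{occurrence}], got: " + text);
        }
        int occurrence = m.group(4) == null ? 0 : Integer.parseInt(m.group(4));
        return new CriterionSketch(m.group(1), Integer.parseInt(m.group(2)), m.group(3), occurrence);
    }

    public static void main(String[] args) {
        CriterionSketch c = parse("Example.java#11:sum");
        System.out.println(c.file + " line " + c.line + " variable " + c.variable);
    }
}
```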

To slice the following program with respect to variable `sum` in line 11:

```java=
public class Example {
    public static void main(String[] args) {
        int sum = 0;
        int prod = 0;
        int i;
        int n = 10;
        for (i = 0; i < 10; i++) {
            sum += 1;
            prod += n;
        }
        System.out.println(sum);
        System.out.println(prod);
    }
}
```
The program can be saved to `Example.java`, and the slicer run with:

```
java -jar sdg-cli.jar -c Example.java#11:sum
```

A more detailed description of the available options can be seen with:

```
java -jar sdg-cli.jar --help
```

## Library usage

A good usage example of `sdg-core` to obtain a slice from source code is available at [Slicer.java#slice()](/sdg-cli/src/main/java/tfm/cli/Slicer.java#L204), where the following steps are performed:

1. JavaParser is configured to (a) resolve calls in the JRE and the user-defined libraries, and to (b) ignore comments.
2. The user-defined Java files are parsed to build a list of `CompilationUnit`s.
3. The SDG is created based on that list. The kind of SDG created depends on a flag.
4. A `SlicingCriterion` is created from the input arguments, and the slice is obtained.
5. The slice is converted to a list of `CompilationUnit`s (each representing a file).
6. The contents of each `CompilationUnit` are dumped to their corresponding file.

If the graph is of interest, it can be exported in `dot` or PDF format via `SDGLog#generateImages()`, as can be seen in [PHPSlice.java#124](/sdg-cli/src/main/java/tfm/cli/PHPSlice.java#L124) (this class presents a frontend for an unreleased web Java slicer).
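
For readers unfamiliar with the `dot` format, the sketch below prints a small dependency graph as Graphviz DOT text. It does not reproduce what `SDGLog#generateImages()` does; the class name, the method name and the sample arcs are hypothetical, and node labels are assumed not to contain double quotes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DotSketch {
    // Renders "source -> target" arcs as a Graphviz digraph.
    // Labels are assumed to contain no double quotes (no escaping is done).
    static String toDot(Map<String, List<String>> arcs) {
        StringBuilder sb = new StringBuilder("digraph sdg {\n");
        for (Map.Entry<String, List<String>> entry : arcs.entrySet()) {
            for (String target : entry.getValue()) {
                sb.append("  \"").append(entry.getKey())
                  .append("\" -> \"").append(target).append("\";\n");
            }
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> arcs = new LinkedHashMap<>();
        arcs.put("int sum = 0", List.of("sum += 1", "println(sum)"));
        arcs.put("sum += 1", List.of("println(sum)"));
        System.out.print(toDot(arcs));
    }
}
```

Text in this format can be rendered with Graphviz, for example `dot -Tpdf graph.dot -o graph.pdf`.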

## Missing Java features

* Object-oriented features: abstract classes, interfaces, class, method and field inheritance, anonymous classes, lambdas.
* Parallel features: threads, shared memory, synchronized methods, etc.
* Exception handling: `finally`, try with resources.

readme.md

deleted 100644 → 0
+0 −169
# TFM

- [TFM](#tfm)
  - [Introduction](#introduction)
  - [Quick start](#quick-start)
    - [Build a graph](#build-a-graph)
    - [Slice a program](#slice-a-program)
  - [Structure](#structure)
    - [Summary](#summary)
  - [Current state](#current-state)
    - [Graphs](#graphs)
    - [Statements covered](#statements-covered)
  - [To do list](#to-do-list)
    - [SDG](#sdg)
    - [General](#general)
  - [Code samples](#code-samples)
    - [Build a CFG from a program](#build-a-cfg-from-a-program)
    - [Get a slice of the PDG of a program](#get-a-slice-of-the-pdg-of-a-program)
  - [Workflow](#workflow)

## Introduction

The main goal of this work is to develop a Java slicer. This is done by building a System Dependence Graph of the program being sliced.

## Quick start

### Build a graph

Find the `Main` class (`tfm/exec`), modify the static fields of the class (the program being analyzed, the graph to build, etc.) and execute it. You will find the output in `tfm/out` as a PNG image.

### Slice a program

Find the `Slice` class (`tfm/slicing`), set the program path and execute it. The sliced program will be in `tfm/out`.

## Structure

Graphs are built using a library called `JGraphT`.

The main class is the `Graph` class, which extends from `JGraphT`'s `DefaultDirectedGraph` class. This class includes some general interest methods (like `toString`, etc.)

Every graph has a set of nodes and arrows. `GraphNode` and `Arc` classes are used to represent them respectively.

A set of visitors is implemented for tasks such as graph building, data dependence building, etc. (available in `tfm/visitors`).

A bunch of programs are written in `tfm/programs`; you can write more there.

Some naive testing is implemented in the `tfm/validation` folder. Currently, a PDG can be compared with a program to check their equality.

Some util methods are available in `tfm/utils` (such as AST utils, logger, etc.)

Forget about the `tfm/scopes` folder: it was an idea I had to discard, and it has to be deleted.

### Summary

- Graphs (`tfm/graphs`)
  - CFGGraph
  - PDGGraph
  - SDGGraph
  
- Nodes (`tfm/nodes`)
  - ~~CFGNode, PDGNode, SDGNode~~ (_Deprecated_)
  - GraphNode
  - MethodCallNode (_idk if this is necessary, maybe it can be deleted_)

- Arcs (`tfm/arcs`)
  - ControlFlowArc
  - DataDependencyArc
  - ControlDependencyArc

- Visitors (`tfm/visitors`)
  - CFGBuilder
  - ~~PDGVisitor~~ (_Deprecated, it was an intent to build a PDG with no CFG needed_)
  - PDGBuilder
  - ControlDependencyBuilder
  - DataDependencyBuilder
  - SDGBuilder (_Probably deprecated_)
  - NewSDGBuilder -**Work in progress**-
  - MethodCallReplacerVisitor (_Replaces method call nodes with in and out variable nodes_) -**Work in progress**-

## Current state

### Graphs

- CFG: Done!
- PDG: Done!
- SDG: PDGs are built for each method

### Statements covered

- Expressions (ExpressionStmt)
- If (IfStmt)
- While, DoWhile (WhileStmt, DoStmt)
- For, Foreach (ForStmt, ForeachStmt)
- Switch (SwitchStmt, SwitchEntryStmt)
- Break (BreakStmt)
- Continue (ContinueStmt)

## To do list

### SDG

- Replace method call nodes with in and out variables nodes and build arrows for them
- Build summary arrows

### General

- Switch to a (much) better graph library like [JGraphT](https://jgrapht.org/). It also supports graph visualization (done).
- Performance review
- Make a test suite (test graph building, slicing, etc.)
- Add support to more Java language features (lambdas, etc.)

## Code samples

### Build a CFG from a program

```java
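import com.github.javaparser.JavaParser;
import com.github.javaparser.ast.Node;
import com.github.javaparser.ast.body.MethodDeclaration;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Optional;

// CFG is provided by this project (see `tfm/graphs`)
public class Example {
    public CFG buildCFG(File programFile) throws FileNotFoundException {
        // Always disable attribution of comments, just in case
        JavaParser.getStaticConfiguration().setAttributeComments(false);
        
        Node astRoot = JavaParser.parse(programFile);
        Optional<MethodDeclaration> optMethod = astRoot.findFirst(MethodDeclaration.class);
        if (!optMethod.isPresent())
            throw new RuntimeException("No method could be found");
        
        // Creates a new graph representing the first method found in the program
        CFG cfg = new CFG();
        cfg.build(optMethod.get());
        return cfg;
    }
}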
```

### Get a slice of the PDG of a program

```java
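import com.github.javaparser.JavaParser;
import com.github.javaparser.ast.Node;
import com.github.javaparser.ast.body.MethodDeclaration;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Optional;

// PDG, PDGGraph and SlicingCriterion are provided by this project
public class Example {
    public PDGGraph getSlice(File programFile, SlicingCriterion slicingCriterion) throws FileNotFoundException {
        // Always disable attribution of comments, just in case
        JavaParser.getStaticConfiguration().setAttributeComments(false);
        
        Node astRoot = JavaParser.parse(programFile);
        Optional<MethodDeclaration> optMethod = astRoot.findFirst(MethodDeclaration.class);
        if (!optMethod.isPresent())
            throw new RuntimeException("No method could be found");
        
        // Creates a new graph representing the first method found in the program
        PDG pdg = new PDG();
        pdg.build(optMethod.get());
        // Slice PDG
        return pdg.slice(slicingCriterion);
    }
}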
```

## Workflow

- Branches:
  - `master` (only for stable versions)
  - `develop` (main branch)
  - `<issue number>-name`

1. Discover a new feature/fix
2. Open an issue describing it and assign it
3. Create a new branch from `develop` with the same name as the issue number (e.g. for issue #12 the new branch is called `12`)
4. Write the solution to the issue
5. Once resolved, open a pull request from the issue branch to the `develop` branch
6. Finally, when the pull request is merged, remove the branch