Author: Toshihiro Kamiya Created: 2008/June/2 Last-Modified: 2008/July/18 Contact: info@ccfinder.net Copyright: 2008 ( C ) Tosihiro Kamiya. All rights reserved.
Tool Vaci is a kind of reverse engineering tool. The tool statically analyzes source code and classifies identifiers by the contexts in which they appear. It enables the user to investigate inadequate names and to investigate name changes between versions of a software product.
Install the required tools and applications.
Install Vacicmd
Copy the executable files (*.exe in case of Windows) of Vaci distribution to /bin of the CCFinderX install directory.
Install Vaci plug-in
First, unzip vaciplugin.zip and copy the files ( net.ccfinder.plugin.vaci.*.jar ) to /plugin of Eclipse. Then, start Eclipse. In the dialog opened by the menu [Window]-[Show View]-[Other...], select [Vaci]-[Vaci View] and click [OK].
Tool Vaci has two functions: 1) applied to the source code of a product, it investigates variation of names in the product, and 2) applied to the source code of several versions of a product, it investigates changes of names between versions.
The command "c" of vacicmd detects sets of identifiers (hereafter, "translation classes"), where the identifiers in a set have distinct names and appear in similar contexts. When a translation class includes identifiers that are not regularly named, such name variations may expose inadequate naming of identifiers.
Vacicmd extracts translation classes and classifies them into nine types for convenience in analysis. The user can browse these translation classes and their classifications with Vaci plug-in, in order to investigate each translation class by referring to the corresponding source code.
We assume the target source files are stored in a directory c:\targetsrc . (If you would like to browse the result in Eclipse with Vaci plug-in, select the root directory of an Eclipse project, that is, a directory including a file .project .) Also, we assume that the source code is written in the Java programming language. (If the target is C/C++ source code, replace "java" with "cpp" in the following explanation.)
Type the following command lines in order to run vacicmd:
c:\>pushd c:\targetsrc
c:\targetsrc>vacicmd c java .
When vacicmd has finished successfully, a file _transclasses.txt is generated. This file is a text file, so you can check the content with the more command or a text editor:
c:\targetsrc>more _transclasses.txt
This step explains how to browse translation classes within the Eclipse IDE.
Import c:\targetsrc with the import menu of Eclipse in advance.
Push the [Vaci search] button on the Eclipse toolbar. A dialog will appear.
In this dialog, you can specify the options for showing translation classes. The three types that are selected (checked) by default are highly likely to be naming variations.
In Vaci view, the translation classes are shown in trees. The nodes are marked with icons (the icon images are not reproduced here):
[icon] expresses a root directory of an Eclipse project.
[icons] show the type of the node. For details of each type, refer to Types of translation class.
[icons] express a location where the identifier appears. By double-clicking a node at the deepest level, you can make the Eclipse text editor show the code fragment where the identifier appears in a source file.
This step explains how to read translation classes with a text editor. (This step will be helpful in developing a tool to process the output of Vacicmd.)
A file of translation classes contains a series of sections. Each section starts with a line "type: ..." and ends with an empty line. A section expresses a translation class. The following is an example of a section:
type: minority
id|cf.endPos
	h:\kamiya\prog\smith2008\GemX\model\layeredgroup\CodeFragment.java:49
id|cf.beginPos
	h:\kamiya\prog\smith2008\GemX\model\layeredgroup\CodeFragment.java:46
--
id|right.end
	h:\kamiya\prog\smith2008\GemX\model\CodeFragment.java:29
id|right.begin
	h:\kamiya\prog\smith2008\GemX\model\CodeFragment.java:25
id|right.leftEnd
	h:\kamiya\prog\smith2008\GemX\model\ClonePair.java:54
id|right.rightEnd
	h:\kamiya\prog\smith2008\GemX\model\ClonePair.java:62
id|right.leftBegin
	h:\kamiya\prog\smith2008\GemX\model\ClonePair.java:50
id|right.rightFile
	h:\kamiya\prog\smith2008\GemX\model\ClonePair.java:46
id|right.rightBegin
	h:\kamiya\prog\smith2008\GemX\model\ClonePair.java:58
The first line shows the type of the translation class. (The type is determined by whether some rule exists among the names, which rule it is, etc. Refer to Types of translation class.) Among the remaining lines, a line "id|..." shows an identifier. A line starting with a tab shows a location where the identifier appears in source code, by means of file name and line number.
Moreover, when the type of the translation class is "minority", the section is divided into two parts by a line "--". The first part contains the identifiers that are regarded as minorities. The second part contains the identifiers that are regarded as majorities.
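For readers who want to process this file with their own tool, the following Python sketch reads the sections described above. It is not part of Vaci; the names Identifier, TranslationClass and parse_translation_classes are hypothetical, and the sketch assumes that the text after "id|" is the identifier name and that a location line has the form "file name:line number".

```python
from dataclasses import dataclass, field


@dataclass
class Identifier:
    name: str
    # (file name, line number) pairs where the identifier appears
    locations: list = field(default_factory=list)


@dataclass
class TranslationClass:
    type: str
    # All identifiers of the section; for type "minority", the part before "--"
    identifiers: list = field(default_factory=list)
    # Only used for type "minority": the part after "--"
    majorities: list = field(default_factory=list)


def parse_translation_classes(text):
    """Parse the text of a translation class file into a list of sections."""
    classes = []
    current = None   # section being read, or None between sections
    target = None    # list that receives the next identifier
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # an empty line ends the section
            continue
        if line.startswith("type:"):
            current = TranslationClass(type=line[len("type:"):].strip())
            target = current.identifiers
            classes.append(current)
        elif current is None:
            continue  # text outside any section is ignored in this sketch
        elif line == "--":
            target = current.majorities
        elif line.startswith("id|"):
            target.append(Identifier(name=line[len("id|"):]))
        elif target:
            # Location line; split at the last colon because a Windows
            # path such as h:\... contains a colon of its own.
            path, _, lineno = line.rpartition(":")
            target[-1].locations.append((path, int(lineno)))
    return classes
```

For example, parse_translation_classes(open("_transclasses.txt").read()) would return one TranslationClass object per section, assuming the file follows the layout described above.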
The "m" command of vacicmd extracts the identifiers that share a common context between two versions but have distinct names between the versions. That is, it extracts a set of identifiers in the older version (O) and a set of identifiers in the newer version (N), where an identifier in O shares a context with an identifier in N and the two identifiers have distinct names. Such a pair of sets O, N is called a "translation map".
With translation maps, the user can investigate the changes in names between versions.
We assume that the source code of the older version and the newer version is stored in the directories c:\oldver and c:\newver, respectively. The directory where the detection result will be stored is c:\analysis. The target source code is written in the Java programming language.
Type the following command line:
c:\>pushd c:\analysis
c:\analysis>vacicmd m java c:\oldver c:\newver
When vacicmd has finished successfully, a file "_transmaps.txt" is generated:
c:\analysis>more _transmaps.txt
This step explains how to generate HTML files from a translation map file and browse them.
Run visualizetransmap at the command line. It creates a subdirectory named browse and stores the generated HTML files in that directory:
c:\analysis>visualizetransmap _transmaps.txt -p browse
Opening the generated file index.html with a web browser shows a page like the following.
The left pane contains a summary and the captions Type 1to1, Type 1toN, Type Nto1, and Type MtoN. The summary gives the number of detected translation maps of each type (in the initial state, the content of the summary is shown in the right pane). Below each type caption, the serial numbers (#number) of the translation maps of that type are shown. Clicking a serial number shows the content of that translation map in the right pane.
The right pane contains the content of the summary, or the content of the translation map selected in the left pane. The following figure shows a page with the content of a translation map in the right pane.
For a translation map, the right pane contains the serial number of the translation map, a graph of the translation map, and the edges of the graph.
The graph is generated from the translation map (this graph will not be visible in a browser without SVG support). Each node (box) in the graph expresses an identifier. The nodes at the left side are the identifiers in the older version. The nodes at the right side are the identifiers in the newer version. Each solid edge expresses a relation between identifiers, that is, the two identifiers at its ends share a context. Each dashed edge connects identifiers with the same name, that is, the identical name appears in both the older version and the newer version.
Below the caption "Edges", for each edge in the graph, the shared context that establishes the relation between the identifiers is expressed as locations of code fragments in source code.
This step explains how to read translation maps with a text editor. (This step will be helpful in developing a tool to process the output of Vacicmd.)
A file of translation maps ( _transmaps.txt ) contains a series of sections. Each section starts with a line "type: ..." and ends with an empty line. A section expresses a translation map. The following is an example of a section:
type: MtoN
id|addPreprocessorItem	id|addPreprocessScript
	c:\analysis\10.1.9\GemX\gemx\MainWindow.java:890	c:\analysis\10.2.3.5\GemX\gemx\MainWindow.java:577
id|add	id|addPreprocessorItem
	c:\analysis\10.1.9\GemX\gemx\MainWindow.java:470	c:\analysis\10.2.3.5\GemX\gemx\MainWindow.java:1040
id|add	id|addPreprocessScript
	c:\analysis\10.1.9\GemX\gemx\MainWindow.java:470	c:\analysis\10.2.3.5\GemX\gemx\MainWindow.java:577
The first line shows the type of the translation map (refer to Types of translation map). Among the remaining lines, a line "id|..." shows a pair of identifiers from the older version and the newer version; the identifier before the tab is the older one, and the identifier after the tab is the newer one. A line starting with a tab shows the locations of the older and newer identifiers, by means of file name and line number; the file name and line number before the tab are the location of the older identifier, and those after the tab are the location of the newer one.
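The layout described above can be read with a short Python sketch like the following. It is not part of Vaci; the function names and the dictionary keys ("type", "edges", "old", "new", "contexts") are hypothetical, and the sketch assumes that the two fields of each line are separated by exactly one tab, as described above.

```python
def split_location(field):
    """Split "file name:line number" at the last colon (the drive letter
    of a Windows path contains a colon of its own)."""
    path, _, lineno = field.strip().rpartition(":")
    return path, int(lineno)


def strip_id(field):
    """Remove the leading "id|" marker from an identifier field."""
    field = field.strip()
    return field[len("id|"):] if field.startswith("id|") else field


def parse_translation_maps(text):
    """Parse the text of a translation map file into a list of dicts."""
    maps = []
    current = None  # section being read, or None between sections
    for raw in text.splitlines():
        if not raw.strip():
            current = None  # an empty line ends the section
            continue
        if raw.startswith("type:"):
            current = {"type": raw[len("type:"):].strip(), "edges": []}
            maps.append(current)
            continue
        if current is None:
            continue  # text outside any section is ignored in this sketch
        fields = raw.strip().split("\t")
        if len(fields) != 2:
            continue  # not an older/newer pair; ignored in this sketch
        if fields[0].startswith("id|"):
            # Pair of identifiers: older one first, newer one second
            current["edges"].append(
                {"old": strip_id(fields[0]), "new": strip_id(fields[1]), "contexts": []}
            )
        elif current["edges"]:
            # Pair of locations belonging to the most recent identifier pair
            current["edges"][-1]["contexts"].append(
                (split_location(fields[0]), split_location(fields[1]))
            )
    return maps
```

Each returned dict corresponds to one section; its "edges" list holds one entry per "id|..." line, together with the location pairs that follow that line.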
A translation class, which is extracted by tool Vaci from source code of a product, is classified into one of the following types by characteristics of the names included in the translation class.
| Icon/Name | Notation in file | Description |
| Identical except for name space | nearly-identical | All identifiers have the same name except for name space (namespace in C++ or package in Java). |
| Including distinct numbers | distinct-numbers | All identifiers have the same name except for numbers in the names. |
| Short name | short-name | Some identifier in the translation class has a short name, when name spaces have been removed from the names. |
| Same prefix | same-prefix | All identifiers have the same prefix. |
| Same postfix | same-postfix | All identifiers have the same postfix. |
| Distinct case | distinct-case | All identifiers have the same name, when the names have been converted to lower case and the tail "s"s (which may mean plural form) have been removed. |
| Abbreviated | abbreviated | It is possible to generate all of the names of the identifiers by removing some characters from the name of one identifier. |
| Including minority | minority | Identifiers are divided into minorities and majorities by patterns of their names. |
| Others | others | The identifiers do not fit any of the above cases. |
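To make some of these descriptions concrete, the following Python sketch checks three of the rules (distinct-numbers, distinct-case and abbreviated) on a list of names. It is only one possible reading of the descriptions in the table, not the implementation used by Vaci; all function names are hypothetical, and details such as treating any run of digits as a "number" are assumptions.

```python
import re


def is_distinct_numbers(names):
    """True if the names differ only in the numbers they contain
    (assumption: a "number" is any run of decimal digits)."""
    without_digits = {re.sub(r"\d+", "", name) for name in names}
    return len(set(names)) > 1 and len(without_digits) == 1


def is_distinct_case(names):
    """True if the names become identical after converting them to
    lower case and removing the trailing "s" characters."""
    normalized = {name.lower().rstrip("s") for name in names}
    return len(set(names)) > 1 and len(normalized) == 1


def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by removing characters."""
    remaining = iter(long)
    return all(ch in remaining for ch in short)


def is_abbreviated(names):
    """True if every name can be generated by removing characters from
    one name of the set (assumption: that name is the longest one)."""
    longest = max(names, key=len)
    return len(set(names)) > 1 and all(is_subsequence(n, longest) for n in names)
```

For instance, under these assumptions the names "buf" and "buffer" satisfy is_abbreviated, and "item1" and "item2" satisfy is_distinct_numbers.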
A translation map, which is extracted by tool Vaci from source code of two versions of a product, is classified into one of the following types by the number of identifiers in the older version and the number of identifiers in the newer version.
| Type 1to1 | The translation map includes one identifier in the older version and one identifier in the newer version. This possibly means that the identifier has been renamed between versions. |
| Type 1toN | The translation map includes one identifier in the older version and multiple identifiers in the newer version. This possibly means that the concept described by the name in the older version has been split into multiple concepts in the newer version. |
| Type Nto1 | The translation map includes multiple identifiers in the older version and one identifier in the newer version. This possibly means that multiple names were used to point to one concept in the older version and the names have been integrated into one name in the newer version. |
| Type MtoN | The translation map includes multiple identifiers in the older version and multiple identifiers in the newer version. |
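The classification above depends only on how many distinct identifiers appear on each side, so it can be sketched in a few lines of Python. The function name classify_translation_map and the representation of a map as a list of (older name, newer name) pairs are assumptions made for this illustration, not part of Vaci.

```python
def classify_translation_map(edges):
    """Return "1to1", "1toN", "Nto1" or "MtoN" for a translation map
    given as a list of (older identifier, newer identifier) pairs."""
    if not edges:
        raise ValueError("a translation map needs at least one pair")
    older = {old for old, _ in edges}   # distinct identifiers in the older version
    newer = {new for _, new in edges}   # distinct identifiers in the newer version
    if len(older) == 1 and len(newer) == 1:
        return "1to1"
    if len(older) == 1:
        return "1toN"
    if len(newer) == 1:
        return "Nto1"
    return "MtoN"
```

Applied to the three identifier pairs of the example section shown earlier (two distinct older names and two distinct newer names), this sketch returns "MtoN", which agrees with the type line of that section.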