Adjust the way of introducing JRE APIs and Introduce Cut-Shortcut into Tai-e #215
for-just-we wants to merge 3 commits into pascal-lab:master
- 1. `Map.entrySet().iterator().next()` should be treated as [Transfer] instead of [Exit]. However, this is not easy to recognize in `ContainerAccessHandler.CutReturnEdge`, which leaves the points-to set of the `Map.entrySet().iterator().next()` call receiver empty. Currently we use a crude workaround: collect the variables of type `Map$Entry` and use them to recognize which `Iterator.next()` call sites return a `Map$Entry`, then filter those out.
- 2. The PFG edge from the return value of `Map.keySet()`/`values()` to the call-site receiver is marked with `FlowKind.LOCAL_ASSIGN` instead of `FlowKind.RETURN`, so ptsH of `Map.keySet()`/`values()` is not optimized.
- 3. `Array-Initializer` methods such as `Collections` and `Arrays$ArrayList <init>` make the objects pointed to by array parameters cross-reference each other, which reduces the precision of ptsH for `Collection` variables.
All contributors have signed the CLA ✍️ ✅

I have read the CLA Document and I hereby sign the CLA
Hi @for-just-we! Welcome to contribute to Tai-e! It seems like this PR can be decoupled into several separate PRs (e.g., changing how to supply specific versions of Java libraries, changing collection APIs, integrating Cut-Shortcut). Could you please separate them (e.g., by submitting smaller, individual, more readable PRs) for conciseness? As for Cut-Shortcut, as far as I am aware, there is already a separate artifact repo maintained by the author of Cut-Shortcut, @YangShengYuan, here. It is based on a more recent version of Tai-e, with more features, improved algorithms, better efficiency and precision, and (more importantly) better engineering design compared to the implementation in the PLDI artifact. I'm not sure the version from the artifact is still the best choice to integrate into Tai-e.
Arguments example

`java -jar Tai-e-all.jar -cp <path/to/program> -m Main -lj <path/to/JREs> -java 8 -a "pta=cs:ci;solver:csc;dump-yaml:true;only-dump-app:true;"`

Where:
- `-lj <path/to/JREs>` specifies that the JRE libraries are loaded from `<path/to/JREs>` instead of the hard-coded `java-benchmarks/JREs`;
- `solver:csc` in the analysis options selects Cut-Shortcut instead of the default solver;
- `only-dump-app:true` means that, when dumping the YAML analysis result, only the results for application code are dumped.

1. Adjust the way of introducing JRE APIs
In `AbstractWorldBuilder.getClassPath`, Tai-e loads the JRE libraries from the hard-coded path `java-benchmarks/JREs`, which is hard to use when Tai-e is packaged into an executable JAR. Here I add a new command option `libJREPath` that lets the user specify a custom path; it defaults to `java-benchmarks/JREs`.

2. Dump only the application-code points-to analysis results (may not be necessary)
Currently, when `dump-ci:true`/`dump-yaml:true` is set, Tai-e dumps results for all variables. However, application code is usually only a small part of the whole program. Hence I add an option `only-dump-app` in `AnalysisOption` (`Tai-e-analysis.yml`), which defaults to `false`; when set to `true`, Tai-e dumps only the results for application code.

3. Introducing Cut-Shortcut
I downloaded the code from the Cut-Shortcut artifact and integrated it into the latest Tai-e commit. It passes the 4 test cases provided in the zip file below. This implementation supports context-sensitive analysis.

Local flow and field access analysis follow the implementation in the paper artifact, while container access differs. To keep the analysis result sound, only the PFG edges from the return value of a container/iterator [Exit] method with a modeled type to its call site are cut. Container types such as `ArrayList` and `HashMap` are modeled; customized container types such as an anonymous `AbstractList` are not considered. The full API list is provided in container-config.

Testcases
CutShortcutTestcases.zip
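The archive contents are not reproduced here. As a rough illustration of the kind of program a container-access test case might exercise, here is a minimal sketch; the class and method names are hypothetical and are not taken from the zip file. With a context-insensitive analysis and no container modeling, `x` and `y` would typically both be reported as pointing to the `A` and the `B` object, because the backing array inside `ArrayList` is shared by both lists; the container-access pattern aims to keep the two lists apart.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical test program; NOT taken from CutShortcutTestcases.zip.
class ContainerAccessDemo {
    static class A {}
    static class B {}

    // Returns {x, y}: the elements read back from two separate lists.
    static Object[] run() {
        List<Object> l1 = new ArrayList<>(); // ArrayList is a modeled container type
        List<Object> l2 = new ArrayList<>();
        l1.add(new A());      // element enters l1
        l2.add(new B());      // element enters l2
        Object x = l1.get(0); // ideally pts(x) = { A }
        Object y = l2.get(0); // ideally pts(y) = { B }
        return new Object[] { x, y };
    }

    public static void main(String[] args) {
        Object[] r = run();
        System.out.println(r[0].getClass().getSimpleName());
        System.out.println(r[1].getClass().getSimpleName());
    }
}
```

At run time `x` is always an `A` and `y` is always a `B`, so any analysis result that reports more than that for either variable is an over-approximation worth comparing between solvers.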
Implementations that may be refined later

Here Cut-Shortcut supports the following [Transfer] methods. We follow the original artifact and match the method name at the call site against keywords such as `keySet` and `entrySet`. Maybe handling this in `onNewCallEdge` would be better? But it is hard to collect all the relevant callee method signatures.

`Map.entrySet().iterator().next()`

Although the type of `Map.entrySet()` is `Set`, and `Map.entrySet().iterator().next()` calls `Set.iterator().next()`, its element type is `Map$Entry`, so it is a [Transfer] rather than an [Exit].

Hence in `ContainerAccessHandler.CutReturnEdge`, when cutting the return value of `Iterator.next()`, we need to make sure the element type is not `Map$Entry`. Here I collect all potential `Map$Entry`-typed variables by analyzing cast statements. For example, if `$r12` is a `Map$Entry`, then `$r12 = invokeinterface $r11.<java.util.Iterator: java.lang.Object next()>();` is a [Transfer] call.

I guess there may be a better [Transfer] analysis method than analyzing cast statements?
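The cast-based filtering described above can be sketched roughly as follows. This is a simplified model, not the code in this PR: `Stmt`, `collectEntryVars`, and `classify` are hypothetical names, and the tiny statement class stands in for Tai-e's real IR, which is not used here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TransferClassifier {
    /** Hypothetical, simplified IR statement; not a Tai-e class. */
    static final class Stmt {
        final String kind;   // "cast" or "invoke"
        final String lhs;    // variable defined by the statement
        final String rhs;    // cast operand or invoke base variable
        final String detail; // cast target type, or callee name for "invoke"

        Stmt(String kind, String lhs, String rhs, String detail) {
            this.kind = kind;
            this.lhs = lhs;
            this.rhs = rhs;
            this.detail = detail;
        }
    }

    /** Pass 1: collect both sides of every cast to Map$Entry. */
    static Set<String> collectEntryVars(List<Stmt> stmts) {
        Set<String> vars = new HashSet<>();
        for (Stmt s : stmts) {
            if (s.kind.equals("cast") && s.detail.equals("java.util.Map$Entry")) {
                vars.add(s.lhs);
                vars.add(s.rhs);
            }
        }
        return vars;
    }

    /** Pass 2: Iterator.next() whose result is a known entry variable is a [Transfer]. */
    static String classify(Stmt call, Set<String> entryVars) {
        if (!call.kind.equals("invoke") || !call.detail.equals("java.util.Iterator.next")) {
            return "OTHER";
        }
        return entryVars.contains(call.lhs) ? "TRANSFER" : "EXIT";
    }

    /** Small made-up method body: one entry iteration, one plain iteration. */
    static List<Stmt> sample() {
        return Arrays.asList(
            new Stmt("invoke", "$r12", "$r11", "java.util.Iterator.next"),
            new Stmt("cast", "entry", "$r12", "java.util.Map$Entry"),
            new Stmt("invoke", "$r20", "$r19", "java.util.Iterator.next"),
            new Stmt("cast", "s", "$r20", "java.lang.String"));
    }

    public static void main(String[] args) {
        List<Stmt> stmts = sample();
        Set<String> entryVars = collectEntryVars(stmts);
        System.out.println(classify(stmts.get(0), entryVars));
        System.out.println(classify(stmts.get(2), entryVars));
    }
}
```

The sketch also shows the weakness raised in the question above: a `next()` result that is never cast to `Map$Entry` (for example, one that is only passed on as `Object`) would be classified as [Exit].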
Issues

- 1. The PFG edge from the return value of `Map.keySet()`/`values()` to the call-site receiver is marked with `FlowKind.LOCAL_ASSIGN` instead of `FlowKind.RETURN`. Hence `CutShortcutSolver.needPropagateHost` marks them as able to propagate `item1` and `item2`.
- 2. `Array-Initializer` methods such as `Collections` and `Arrays$ArrayList <init>` make the objects pointed to by array parameters cross-reference each other, which reduces the precision of ptsH for `Collection` variables. For example, `Collections.addAll` is implemented as a loop that calls `add` on the collection for each element of the array. However, such methods are not marked as ignored, so the side effects of the method body are included. Hence, for the code `Collections.addAll(list, array);`, the analysis may add more, irrelevant objects to `list` than `array` actually provides. One way to solve this is to mark `Array-Initializer` methods as ignorable, but then the points-to sets inside the container become unsound.
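To make the second issue concrete, here is a small sketch. The local `addAll` below has the same loop shape as `java.util.Collections.addAll` and is copied here only so it can be read in place; the surrounding class and the `A`/`B` types are hypothetical. The comments describe what a context-insensitive analysis would be expected to conclude, which is my reading of the issue rather than measured output.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class AddAllDemo {
    static class A {}
    static class B {}

    // Same loop shape as java.util.Collections.addAll.
    @SafeVarargs
    static <T> boolean addAll(Collection<? super T> c, T... elements) {
        boolean result = false;
        for (T element : elements) {
            // Without contexts, `c` merges listA and listB, and `element`
            // merges the contents of arrA and arrB, so this one call site
            // appears to add every element to every list.
            result |= c.add(element);
        }
        return result;
    }

    // Returns {listA, listB} after filling each from its own array.
    static List<List<Object>> run() {
        List<Object> listA = new ArrayList<>();
        List<Object> listB = new ArrayList<>();
        A[] arrA = { new A() };
        B[] arrB = { new B() };
        addAll(listA, arrA); // concretely: listA holds only the A object
        addAll(listB, arrB); // concretely: listB holds only the B object
        List<List<Object>> out = new ArrayList<>();
        out.add(listA);
        out.add(listB);
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> r = run();
        System.out.println(r.get(0).get(0).getClass().getSimpleName());
        System.out.println(r.get(1).get(0).getClass().getSimpleName());
    }
}
```

Marking `addAll` as ignorable would stop the body from being analyzed, which removes the cross-reference but also drops the real flow from `arrA` into `listA` unless that flow is modeled some other way.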