Soot (software)


In static program analysis, Soot is a bytecode manipulation and optimization framework for Java, built around a set of intermediate languages. It was originally developed by the Sable Research Group at McGill University and is currently maintained by the Secure Software Engineering Group at Paderborn University.{{cite web |title=Soot - A Java optimization framework |url=https://github.com/soot-oss/soot |website=github.com |access-date=16 January 2024}}

Soot provides four intermediate representations for use through its API for other analysis programs to access and build upon:{{cite web |url=http://www.sable.mcgill.ca/soot/ |title=A framework for analyzing and transforming Java and Android Applications |website=Sable.mcgill.ca |date= |access-date=2016-08-10 |archive-date=2008-12-28 |archive-url=https://web.archive.org/web/20081228032725/http://www.sable.mcgill.ca/soot/ |url-status=dead }}

  • Baf: a near bytecode representation.
  • Jimple: a simplified version of Java source code that has a maximum of three components per statement.
  • Shimple: an SSA variation of Jimple (similar to GIMPLE).
  • Grimp: an aggregated version of Jimple suitable for decompilation and code inspection.

The current Soot release also contains detailed program analyses that can be used out of the box, such as context-sensitive, flow-insensitive points-to analysis,{{cite web|author= |url=http://www.sable.mcgill.ca/soot/tutorial/analysis/index.html |title=Tutorials · Sable/soot Wiki · GitHub |website=Sable.mcgill.ca |date=2016-01-12 |access-date=2016-08-10}} call graph analysis and dominance analysis (answering the question "must event a follow event b?"). Soot also includes a decompiler called Dava.

Soot is free software available under the GNU Lesser General Public License (LGPL).

In 2010, two research papers on Soot ({{harvnb|Vallée-Rai|Co|Gagnon|Hendren|1999}} and {{harvnb|Pominville|Qian|Vallée-Rai|Hendren|2000}}) were selected as IBM CASCON First Decade High Impact Papers among 12 other papers from the 425 entries.{{cite web|url=http://dl.acm.org/citation.cfm?id=1925805 |title=CASCON First Decade High Impact Papers |website=Dl.acm.org |date= |access-date=2016-08-10}}

Jimple

Jimple is an intermediate representation of a Java program designed to be easier to optimize than Java bytecode. It is typed, has a concrete syntax and is based on three-address code.

Jimple includes only 15 different operations, which simplifies flow analysis. By contrast, Java bytecode includes over 200 different operations.{{cite web |url=http://www.sable.mcgill.ca/publications/techreports/#report1 |title=The Jimple Framework |last1=Vallee-Rai |first1=Raja|website=Sable.mcgill.ca |year=1998 }}{{cite web |url=http://www.sable.mcgill.ca/publications/techreports/#report4 |title=Jimple: Simplifying Java Bytecode for Analyses and Transformations |last1=Vallee-Rai |first1=Raja |last2=Hendren |first2=Laurie J.|website=Sable.mcgill.ca |year=1998 }}

Unlike Java bytecode, Jimple has typed local and stack variables and is inherently type-safe.

Converting to Jimple, or "Jimplifying" (after "simplifying"), is the conversion of bytecode to three-address code. The idea behind the conversion, first investigated by Clark Verbrugge, is to associate a variable with each position in the stack. Stack operations then become assignments involving these stack variables.

=Example=

Consider the following bytecode:{{Sfn|Vallee-Rai|1998}}

iload 1   // load variable x1, and push it on the stack
iload 2   // load variable x2, and push it on the stack
iadd      // pop two values, and push their sum on the stack
istore 1  // pop a value from the stack, and store it in variable x1

The above translates to the following three-address code:

stack1 = x1               // iload 1
stack2 = x2               // iload 2
stack1 = stack1 + stack2  // iadd
x1 = stack1               // istore 1

In general, the resulting code is not in static single assignment form; in the example above, stack1 is assigned twice.
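The translation in the example can be sketched in a few lines of Java. This is only an illustrative sketch, not Soot's actual implementation: the class name JimplifySketch, the method translate, and the textual instruction format are assumptions made for this example, and only the three instructions used above (iload, iadd, istore) are handled, for straight-line code. The sketch tracks the current stack height and names the variable for each stack position stackN.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Soot's API.
public class JimplifySketch {

    // Translates straight-line integer bytecode into three-address code by
    // associating one variable (stack1, stack2, ...) with each stack position.
    public static List<String> translate(List<String> bytecode) {
        List<String> out = new ArrayList<>();
        int height = 0; // current operand-stack height
        for (String insn : bytecode) {
            String[] parts = insn.trim().split("\\s+");
            switch (parts[0]) {
                case "iload": // push local variable x<n>
                    height++;
                    out.add("stack" + height + " = x" + parts[1]);
                    break;
                case "iadd": // pop two values, push their sum
                    if (height < 2) {
                        throw new IllegalStateException("stack underflow at: " + insn);
                    }
                    out.add("stack" + (height - 1) + " = stack" + (height - 1)
                            + " + stack" + height);
                    height--;
                    break;
                case "istore": // pop the top value into local variable x<n>
                    if (height < 1) {
                        throw new IllegalStateException("stack underflow at: " + insn);
                    }
                    out.add("x" + parts[1] + " = stack" + height);
                    height--;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = List.of("iload 1", "iload 2", "iadd", "istore 1");
        translate(code).forEach(System.out::println);
    }
}
```

Run on the four instructions from the example, the sketch emits the same four assignments shown above. Because a stack position is reused whenever the stack shrinks and grows again, a variable such as stack1 can be assigned more than once, which is why the output is generally not in static single assignment form.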

SootUp

Soot has been succeeded by the SootUp framework, developed by the Secure Software Engineering Group at Paderborn University.{{cite web |title=A new version of Soot with a completely overhauled architecture |url=https://github.com/soot-oss/SootUp |website=github.com |access-date=16 January 2024}} SootUp is a complete reimplementation of Soot with a new design that focuses on static program analysis rather than bytecode optimization.

References

{{Reflist}}

Further reading

  • {{cite conference|year=1999|chapter=Soot: A Java bytecode optimization framework|first1=Raja|last1=Vallée-Rai|first2=Phong|last2=Co|first3=Etienne|last3=Gagnon|first4=Laurie|last4=Hendren|first5=Patrick|last5=Lam|first6=Vijay|last6=Sundaresan|title=Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research |conference=CASCON '99|chapter-url=http://dl.acm.org/citation.cfm?id=782008}} Republished in {{cite conference|title=CASCON First Decade High Impact Papers|conference=CASCON '10|url=http://dl.acm.org/citation.cfm?id=1925818 |pages=214–224 |doi=10.1145/1925805.1925818|url-access=subscription}}
  • {{cite conference|year=2000|title=A framework for optimizing Java using attributes|first1=Patrice|last1=Pominville|first2=Feng|last2=Qian|first3=Raja|last3=Vallée-Rai|first4=Laurie|last4=Hendren|first5=Clark|last5=Verbrugge}} Republished in {{cite conference|title=CASCON First Decade High Impact Papers|conference=CASCON '10|url=http://dl.acm.org/citation.cfm?id=1925819 |pages=225–241 |doi=10.1145/1925805.1925819|url-access=subscription}}
  • {{cite journal|year=2011|first1=Patrick|last1=Lam|first2=Eric|last2=Bodden|first3=Ondřej|last3=Lhoták|first4=Laurie|last4=Hendren|title=The Soot framework for Java program analysis: a retrospective|journal=Cetus Users and Compiler Infrastructure Workshop|url=http://plg.uwaterloo.ca/~olhotak/pubs/cetus.pdf}}