add example code, update README and add docs
t81lal committed Apr 17, 2024
1 parent a5f7adc commit 5abd0d6
Showing 11 changed files with 282 additions and 10 deletions.
94 changes: 93 additions & 1 deletion README.md
@@ -28,6 +28,30 @@ analysis.
- Integration and unit tests - partially implemented
- Language server


### Project Structure
- example/ - Demo/example code for understanding the API
- src/solidity_parser
  - filesys.py - Solidity project environment management
    - VirtualFileSystem: file system environment manager in solp (and solc), resolves imports and loads files
    - LoadedSource: loaded source unit (file)
  - errors.py - Parsing/Compiler errors
  - util/version_util.py - Compiler/parser version helper
  - grammar/ - Solidity ANTLR grammars and generated lexers/parsers
  - ast/parsers/ - ANTLR to AST1 parsers
  - ast - Main SOLP analysis/parsing package
    - symtab.py: AST1 symbol table builder
    - nodebase.py: AST1 and AST2 shared base code
    - types.py: AST1 and AST2 shared type system
    - solnodes.py: AST1 only tree nodes
    - solnodes2.py: AST2 only tree nodes
    - ast2builder.py: translates an AST1 parse tree into AST2
- test/solidity_parser - SOLP unit + integration tests
- testcases - Solidity data used in above tests
- vendor - external dependencies
  - antlr-4.11.1-complete.jar: ANTLR for generating lexers/parsers
- setup.py - real setup script for the module
- setup.sh - legacy helper script for generating ANTLR developer bindings

### Requirements

- Python 3.11+ is required
@@ -37,12 +61,80 @@ analysis.

`setup.py` generates the antlr Python grammar stubs, builds and installs solp as `solidity-parser` with `pip install .`

For a development setup, install with `pip install -e .`


### Usage

Solp is not a standalone application, so it has no entry point to speak of. Example snippets and currently run
configurations for development are in `main.py` but these aren't application usecases themselves.
configurations for development are in `example/`, but these aren't application use cases themselves.

The example code in `example/quickstart.py` shows how to load a Solidity project and generate AST2 parse trees.
These can then be used for analysis.


### How it works

The goal is to get AST2 parse trees. ANTLR and AST1 parse trees don't contain enough information in their nodes to be
useful on their own (e.g. imports, using statements, function calls, and more don't get resolved). To build AST2 parse
trees, you take AST1 parse trees, generate symbol information, and then pass both into the AST2 builder. This gives you
"linked" AST2 nodes, i.e. information about the relationships in the program is embedded into the nodes themselves.

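The linking step described above can be illustrated with a small, self-contained toy. This is only a sketch of the idea, not solp's implementation: the classes below are simplified, hypothetical stand-ins that borrow some node names from the README, and the "symbol table" is assumed to be a plain dict that says whether a name is a state or local variable.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for unlinked AST1-style nodes (not solp's real classes)
@dataclass
class Ident:
    text: str


@dataclass
class Assign:
    left: Ident
    right: Ident


# Hypothetical stand-ins for linked AST2-style nodes
@dataclass
class StateVarLoad:
    name: str


@dataclass
class LocalVarLoad:
    name: str


@dataclass
class StateVarStore:
    name: str
    value: object


def link_ident(ident: Ident, symbols: dict[str, str]):
    """Turn an unbound identifier into a load node using symbol information."""
    kind = symbols[ident.text]  # a KeyError here means the name is unresolved
    return StateVarLoad(ident.text) if kind == 'state' else LocalVarLoad(ident.text)


def link_assign(stmt: Assign, symbols: dict[str, str]) -> StateVarStore:
    """Translate an AST1-style assignment into an AST2-style state variable store."""
    if symbols[stmt.left.text] != 'state':
        raise NotImplementedError('only state variable stores are sketched here')
    return StateVarStore(stmt.left.text, link_ident(stmt.right, symbols))


# Toy symbol table: which names are state variables and which are locals
symbols = {'myVariable': 'state', 'value': 'local'}
ast1 = Assign(left=Ident('myVariable'), right=Ident('value'))
ast2 = link_assign(ast1, symbols)
print(ast2)
# StateVarStore(name='myVariable', value=LocalVarLoad(name='value'))
```

The point of the sketch is that the second tree can only be built once the symbol information exists, which is why the symbol table is built before the AST2 builder runs.
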
For example, with the ANTLR grammar for 0.8.22, a library call such as `myVariable = adder.add(myVariable, value);` (line
11 in the `example/librarycall/TestContract.sol` file) would have the following parse tree:

<img src="docs/imgs/parseTree.png">

In AST1 this would be parsed as:
```
ExprStmt(
    expr=BinaryOp(
        left=Ident(text='myVariable'),
        right=CallFunction(
            callee=GetMember(
                obj_base=Ident(text='adder'),
                name=Ident(text='add')
            ),
            modifiers=[],
            args=[Ident(text='myVariable'), Ident(text='value')]
        ),
        op=<BinaryOpCode.ASSIGN: '='>
    )
)
```

There is much left to be desired, but here are some of the most obvious problems:
1. The store operation is a `BinaryOp` instead of a state variable store.
2. The `callee` for the library call is a `GetMember` consisting of `Ident`s only. Without the import
   information in the current scope, we can't resolve this call.
3. Similarly, the arguments are `Ident`s that represent a state variable lookup and a local variable lookup
   respectively. We can't pull any information from this parse tree on its own because these `Ident`s aren't bound to
   anything.

Here is the AST2 output of the same code:

```
StateVarStore(
    base=SelfObject(),
    name=Ident(text='myVariable'),
    value=FunctionCall(
        named_args=[],
        args=[
            StateVarLoad(base=SelfObject(), name=Ident(text='myVariable')),
            LocalVarLoad(
                var=Var(
                    name=Ident(text='value'),
                    ttype=IntType(is_signed=False, size=256),
                    location=None
                )
            )
        ],
        base=StateVarLoad(base=SelfObject(), name=Ident(text='adder')),
        name=Ident(text='add')
    )
)
```

The tree is much clearer and more explicit about the operations performed. Additional functionality is also available
due to the linking. For example, calling `FunctionCall.resolve_call()` gives us the corresponding `FunctionDefinition`
in the `AdderLib.Adder` library, and `base.type_of()` gives us a `ResolvedUserType(MyContract)`, which can then be
explored as in the `quickstart.py` example.
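
To make the benefit of linked nodes concrete, here is a toy model of why a call on a linked node can be resolved with a simple lookup. It is a sketch only: `LibraryType`, the `base_type` field and the `descriptor()` format are hypothetical and chosen for illustration, and the `resolve_call()` below merely mimics the idea of the real method rather than reproducing solp's code.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionDefinition:
    owner: str
    name: str
    param_types: list[str]

    def descriptor(self) -> str:
        # Illustrative format only, not solp's descriptor format
        return f"{self.owner}::{self.name}({', '.join(self.param_types)})"


@dataclass
class LibraryType:
    # Hypothetical resolved type holding the functions it defines
    name: str
    functions: dict[str, FunctionDefinition] = field(default_factory=dict)


@dataclass
class FunctionCall:
    base_type: LibraryType  # assumed to be filled in by the linking step
    name: str

    def resolve_call(self) -> FunctionDefinition:
        # The lookup is trivial because the base has already been resolved
        return self.base_type.functions[self.name]


add_def = FunctionDefinition('Adder', 'add', ['uint256', 'uint256'])
adder = LibraryType('Adder', {'add': add_def})
call = FunctionCall(base_type=adder, name='add')
print(call.resolve_call().descriptor())
# Adder::add(uint256, uint256)
```

With unlinked `Ident` nodes the same lookup would first need import and scope information, which is exactly what the AST1 tree lacks.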
Binary file added docs/imgs/parseTree.png
8 changes: 8 additions & 0 deletions example/librarycall/AdderLib.sol
@@ -0,0 +1,8 @@
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

library Adder {
    function add(uint256 a, uint256 b) public pure returns (uint256) {
        return a + b;
    }
}
13 changes: 13 additions & 0 deletions example/librarycall/TestContract.sol
@@ -0,0 +1,13 @@
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

import "AdderLib.sol";

contract MyContract {
    Adder private adder;
    uint256 public myVariable;

    function addToVariable(uint256 value) public {
        myVariable = adder.add(myVariable, value);
    }
}
21 changes: 21 additions & 0 deletions example/project/contracts/TestContract.sol
@@ -0,0 +1,21 @@
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

import "@openzeppelin/access/Ownable.sol";

contract MyContract is Ownable {
    uint256 public myVariable;

    // Function to set the value of myVariable
    function setMyVariable(uint256 newValue) public onlyOwner {
        myVariable = newValue;
    }

    function getMyVariable() public view returns (uint256) {
        return myVariable;
    }

    function addToVariable(uint256 value) public onlyOwner {
        myVariable += value;
    }
}
23 changes: 23 additions & 0 deletions example/project/lib/oz/access/Ownable.sol
@@ -0,0 +1,23 @@
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

contract Ownable {
    address public owner;

    event OwnershipTransferred(address indexed previousOwner, address indexed newOwner);

    constructor() {
        owner = msg.sender;
    }

    modifier onlyOwner() {
        require(msg.sender == owner);
        _;
    }

    function transferOwnership(address newOwner) public onlyOwner {
        require(newOwner != address(0));
        emit OwnershipTransferred(owner, newOwner);
        owner = newOwner;
    }
}
1 change: 1 addition & 0 deletions example/project/remappings.txt
@@ -0,0 +1 @@
@openzeppelin/=lib/oz/
68 changes: 68 additions & 0 deletions example/quickstart.py
@@ -0,0 +1,68 @@
# makes this file easily runnable in PyCharm
if __name__ == '__main__':
    pass


from pathlib import Path

from solidity_parser import filesys
from solidity_parser.ast import symtab, ast2builder, solnodes2


project_dir = Path('./project')
source_dir = project_dir / 'contracts'
library_dir = project_dir / 'lib'
remappings_file = project_dir / 'remappings.txt'


# set up our VFS with all of the source directories and remappings if required; this is where AST1 parse trees come from

vfs = filesys.VirtualFileSystem(
    # base_path is the project path
    project_dir,
    # don't pass in CWD, VFS will get it
    None,
    # pass in source and library directories as "include_paths", i.e. source paths
    [source_dir, library_dir],
    # no forced compiler version
    None
)

if remappings_file.exists():
    vfs.parse_import_remappings(remappings_file)

# symbol table builder is required to get symbol info from AST1 for AST2
sym_builder = symtab.Builder2(vfs)

file_to_analyse = Path('TestContract.sol')
# searches for the file to analyse and gets us back a FileScope (the input to the AST2 builder)
# either the Path or its str representation can be passed in
file_sym_info: symtab.FileScope = sym_builder.process_or_find_from_base_dir(file_to_analyse)

# setup the AST2 builder
ast2_builder = ast2builder.Builder()
ast2_builder.enqueue_files([file_sym_info])

# run the builder; this will create AST2 parse trees for file_to_analyse and any files that are referenced from
# there and need to be analysed in the process (all lazily)
ast2_builder.process_all()

# AST2 creates "top level units" for contracts, interfaces, enums, libraries and also "synthetic top level units" for
# functions, errors, events and constants that are "free", i.e. not defined in a top level node, in our example code
# we only have a contract defined at the top level, so get that
all_units: list[solnodes2.TopLevelUnit] = ast2_builder.get_top_level_units()

# u.name is a solnodes2.Ident, str(u.name) makes it comparable to strings
# hint for ContractDefinition since we know the type
my_contract: solnodes2.ContractDefinition = [u for u in all_units if str(u.name) == 'MyContract'][0]

for p in my_contract.parts:
    if isinstance(p, solnodes2.FunctionDefinition):
        print(f'Found a function: {p.descriptor()}')

# Should print:
# Found a function: TestContract.sol.MyContract::myVariable() returns (uint256)
# Found a function: TestContract.sol.MyContract::setMyVariable(uint256) returns ()
# Found a function: TestContract.sol.MyContract::getMyVariable() returns (uint256)
# Found a function: TestContract.sol.MyContract::addToVariable(uint256) returns ()

11 changes: 9 additions & 2 deletions src/solidity_parser/ast/symtab.py
@@ -6,7 +6,7 @@
from solidity_parser.filesys import LoadedSource, VirtualFileSystem

from solidity_parser.ast.mro_helper import c3_linearise
import logging
import logging, pathlib

import solidity_parser.util.version_util as version_util

@@ -1010,6 +1010,10 @@ def __init__(self, file_scope, unit_scope):
        self.unit_scope = unit_scope

    def __init__(self, vfs: VirtualFileSystem, parser_version: version_util.Version = None):
        if parser_version is None:
            # may still be None
            parser_version = vfs.compiler_version

        self.root_scope = RootScope(parser_version)
        self.vfs = vfs
        self.parser_version = parser_version
@@ -1025,7 +1029,10 @@ def process_or_find(self, loaded_source: LoadedSource):

        return found_fs

    def process_or_find_from_base_dir(self, relative_source_unit_name: str):
    def process_or_find_from_base_dir(self, relative_source_unit_name: str | pathlib.Path):
        if isinstance(relative_source_unit_name, pathlib.Path):
            relative_source_unit_name = str(relative_source_unit_name)

        # sanitise inputs if windows paths are given
        relative_source_unit_name = relative_source_unit_name.replace('\\', '/')

37 changes: 35 additions & 2 deletions src/solidity_parser/filesys.py
@@ -14,28 +14,43 @@

@dataclass
class Source:
    """ Structure of a source unit defined in the standard JSON input """

    # keccak256: Optional[str]
    urls: Optional[List[str]]
    content: str


@dataclass
class StandardJsonInput:
    """
    Solidity standard JSON input see:
    https://docs.soliditylang.org/en/v0.8.25/using-the-compiler.html#compiler-api
    """

    # language: str # 'Solidity'
    sources: Dict[str, Source] # source unit name -> source
    # settings, not required for now


@dataclass
class LoadedSource:
    """
    Source unit loaded inside the virtual filesystem
    """

    source_unit_name: str
    """ The computed source unit name, see the solidity docs for how this is computed """
    contents: str
    """ Source code """
    origin: Optional[Path]
    """ Path to the source unit on disk, if it was loaded from disk """
    ast_creator_callback: Optional[Callable[[str], List[solnodes.SourceUnit]]]
    """ Optional function for changing the AST creation method, e.g. for testing and forcing the parser version """

    @property
    def ast(self) -> List[solnodes.SourceUnit]:
        # Mechanism for creating the AST on demand and caching it
        """ Property for getting the AST from the source code lazily """
        if not hasattr(self, '_ast'):
            logging.getLogger('VFS').debug(f'Parsing {self.source_unit_name}')
            creator = self.ast_creator_callback
@@ -44,10 +59,28 @@ def ast(self) -> List[solnodes.SourceUnit]:


ImportMapping = namedtuple('ImportMapping', ['context', 'prefix', 'target'])
""" An import remapping for changing the source unit name before the import is resolved """


class VirtualFileSystem:
    def __init__(self, base_path: str, cwd: str = None, include_paths: List[str] = None, compiler_version: Version = None):
    """
    This is the "virtual file system" defined in the Solidity docs and implemented in solc. The idea is to abstract
    away the specifics of how the sources are stored, such as on disk or in memory, and the paths used in the source
    files to resolve imports. The code is not ideal but it emulates the behaviour of the C++ code of solc.
    https://docs.soliditylang.org/en/v0.8.17/path-resolution.html
    """

    def __init__(self, base_path: str | Path,
                 cwd: str | Path = None,
                 include_paths: List[str | Path] = None,
                 compiler_version: Version = None):
        """
        :param base_path: Project base path (e.g. the directory containing all project files)
        :param cwd: Current working directory (e.g. invocation directory of solc)
        :param include_paths: List of paths of libraries and source folders
        :param compiler_version: Version of the parser to use; if not specified, the version in source files will be used
        """
        if cwd is None:
            cwd = os.getcwd()
        self.cwd = cwd