The Smart Contract Repository

The Smart Contract Repository is a collection of Solidity smart contracts that are publicly available on GitHub. The repository contains multiple versions of the same contract to support analysis of the effects of incremental changes in source code. This user interface is backed by a REST API that reads from a MongoDB database. The database holds the contracts and their versions, including the source code, and is split into two collections. The first collection holds plain Solidity files that were retrieved from GitHub; these are called Smart Contracts in the following. The second, smaller collection holds flattened Solidity smart contracts that are meant to be standalone and compilable without any dependencies; these are called Flat Contracts in the following. The Flat Contracts were generated using a flattener tool, and their correctness is not guaranteed.

# About

Blockchain technologies are a growing field of research and public interest. Second-generation blockchains, like Ethereum, allow users to execute smart contracts, which are distributed applications executing user-defined logic. This not only expands the utility of blockchains, but also creates a need for tools that address the same issues that arise with conventional programs. Such tools include code optimization, detection of code smells and vulnerabilities, and automated code generation models, among others. Developing them requires a dataset of similar code that can be annotated and analyzed according to the desired end application. Existing sources of smart contract code include block explorers, which provide limited search and retrieval capabilities and cover only the single version of a contract that is deployed on the blockchain. Hence, to aid the progress of research, this tool introduces the Smart Contract Repository: a repository of publicly available Solidity smart contracts, complete with multiple versions of the same contract to help the analysis of the effects of incremental changes in source code.

The Smart Contracts in this repository are retrieved from GitHub using the GitHub File Scraper, a tool that was also developed by the author of this repository. The scraper is a Python script that uses the GitHub API to retrieve a list of repositories that contain a specific file. It then retrieves the files from each repository and stores them. The scraper is currently configured to retrieve all Solidity smart contracts from GitHub that are licensed, and it is run periodically to collect new contracts for this repository.
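
The two steps described above (find repositories that contain a matching file, then collect the files per repository) can be illustrated with a minimal sketch. This is not the GitHub File Scraper itself. It assumes GitHub's REST code search endpoint, which requires authentication and at least one plain search term next to qualifiers. The search term `pragma` and the helper names `build_search_url` and `files_by_repository` are hypothetical choices made for this example.

```python
import urllib.parse

# GitHub REST endpoint for code search (requests must be authenticated).
SEARCH_ENDPOINT = "https://api.github.com/search/code"


def build_search_url(term="pragma", extension="sol", page=1, per_page=100):
    """Build a code search URL for files with the given extension.

    `term` is an assumed keyword: code search needs at least one plain
    search term in addition to qualifiers such as `extension:`.
    """
    query = urllib.parse.urlencode(
        {"q": f"{term} extension:{extension}", "page": page, "per_page": per_page}
    )
    return f"{SEARCH_ENDPOINT}?{query}"


def files_by_repository(search_response):
    """Group the file paths of a decoded search response by repository."""
    grouped = {}
    for item in search_response.get("items", []):
        repo = item["repository"]["full_name"]
        grouped.setdefault(repo, []).append(item["path"])
    return grouped


if __name__ == "__main__":
    # Hand-written sample in the shape of a code search response.
    sample = {
        "items": [
            {"path": "contracts/Token.sol", "repository": {"full_name": "alice/token"}},
            {"path": "contracts/Sale.sol", "repository": {"full_name": "alice/token"}},
            {"path": "Vault.sol", "repository": {"full_name": "bob/vault"}},
        ]
    }
    print(build_search_url())
    print(files_by_repository(sample))
```

Grouping by repository first keeps the later download step simple: each repository can be processed once, and all of its matching files are known up front.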


# API

The contents of this repository are also available through an API. Smart Contracts are served at https://scr.ide.tuhh.de/api/contracts/ and Flat Contracts at https://scr.ide.tuhh.de/api/flatcontracts/. The API is documented at https://scr.ide.tuhh.de/api/docs. Note that a rate limit of 1000 requests per minute is applied to the API to prevent abuse.
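
A client that issues many requests may want to pace itself to stay under the stated rate limit. The following is a minimal sketch using only the Python standard library. The two endpoint URLs and the limit of 1000 requests per minute come from the text above. The names `Throttle`, `build_url` and `fetch_json` are hypothetical helpers, and any query parameters passed to `build_url` are assumptions; the API documentation lists the parameters that are actually supported.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://scr.ide.tuhh.de/api"
MAX_REQUESTS_PER_MINUTE = 1000  # rate limit stated for the API


class Throttle:
    """Space out calls so that at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_url(collection, **params):
    """Return the endpoint URL for "contracts" or "flatcontracts".

    Keyword arguments are appended as query parameters; which parameters
    the API accepts is not specified here (assumption).
    """
    if collection not in ("contracts", "flatcontracts"):
        raise ValueError(f"unknown collection: {collection!r}")
    url = f"{BASE_URL}/{collection}/"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def fetch_json(url, throttle):
    """Fetch `url` after waiting for the throttle and decode the JSON body."""
    throttle.wait()
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    throttle = Throttle(MAX_REQUESTS_PER_MINUTE)
    data = fetch_json(build_url("contracts"), throttle)
    print(type(data))
```

Spacing requests evenly (one every 60 ms at most) is a conservative choice: it never produces bursts, at the cost of being slightly slower than a sliding-window counter would allow.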



# Credits

This is a project from the Christian Doppler Labor at the Technical University Hamburg. Thanks to the following people for their contributions:

  • Stefan Schulte
  • Avik Banerjee
  • Michael Schröder
  • Jürgen Cito
  • Carl Egge

# License

This project is licensed under the MIT License.