Here we provide a few informal suggestions to help you fill in your Artifact Appendix for submission while avoiding common pitfalls. Keep in mind that this document is meant to be public, so avoid including confidential information in the final document.
Briefly and informally describe your artifact, including minimal hardware and software requirements, how it supports your paper, how it can be validated, and what the expected result is. This description will be used to select appropriate reviewers. It will also help readers understand what was evaluated and how.
Together with the artifact abstract, this checklist will help us make sure that reviewers have the appropriate competency and access to the technology required to evaluate your artifact. It can also be used as meta information to find your artifacts in Digital Libraries.
Fill in whatever is applicable with some informal keywords and remove unrelated items (please treat the questions below as informal hints about what reviewers are usually concerned with):
- Algorithm: Are you presenting a new algorithm?
- Program: Which benchmarks do you use (PARSEC, ARM real workloads, NAS, EEMBC, SPLASH, Rodinia, LINPACK, HPCG, MiBench, SPEC, cTuning, etc)? Are they included or should they be downloaded? Which version? Are they public or private? If they are private, is there a public analog to evaluate your artifact? What is the approximate size?
- Compilation: Do you require a specific compiler? Public/private? Is it included? Which version?
- Transformations: Do you require a program transformation tool (source-to-source, binary-to-binary, compiler pass, etc)? Public/private? Is it included? Which version?
- Binary: Are binaries included? OS-specific? Which version?
- Model: Do you use specific models (ImageNet, AlexNet, MobileNets)? Are they included? If not, how to download and install? What is their approximate size?
- Data set: Do you use specific data sets? Are they included? If not, how to download and install them? What is their approximate size?
- Run-time environment: Is your artifact OS-specific (Linux, Windows, macOS, Android, etc)? Which version? What are the main software dependencies (JIT, libs, run-time adaptation frameworks, etc)? Do you need root access?
- Hardware: Do you need specific hardware (supercomputer, architecture simulator, CPU, GPU, neural network accelerator, FPGA) or specific features (hardware counters to measure power consumption, sudo access to CPU/GPU frequency, etc)? Are they publicly available?
- Run-time state: Is your artifact sensitive to run-time state (cold/hot cache, network/cache contention, etc.)?
- Execution: Should any specific conditions be met during experiments (sole user, process pinning, profiling, adaptation, etc)? Approximately how long will they run?
- Security, privacy, and ethical concerns: Are there any specific security, privacy, or ethical concerns with running the experiments (malware sample sandboxing, ethical network scanning, etc.)?
- Metrics: Which metrics are reported (execution time, inferences per second, Top1 accuracy, static and dynamic energy consumption, vulnerabilities mitigated, etc)?
- Output: What is your output (console, file, table, graph) and what is your result (exact output, numerical results, measured characteristics, etc)? Is the expected result included?
- Experiments: How to prepare experiments and replicate/reproduce results (OS scripts, manual steps by user, IPython/Jupyter notebook, automated workflows, etc)? Do not forget to mention the maximum allowable variation of empirical results!
- How much disk space is required (approximately)?: This can help evaluators and end-users to find appropriate resources.
- How much time is needed to prepare workflow (approximately)?: This can help evaluators and end-users to estimate resources needed to evaluate your artifact.
- How much time is needed to complete experiments (approximately)?: This can help evaluators and end-users to estimate resources needed to evaluate your artifact.
- Publicly available?: Will your artifact be publicly available?
- Code licenses (if publicly available)?: If your workflows and artifacts will be publicly available, please provide information about licenses. This will help the community to reuse your components.
- Data licenses (if publicly available)?: If your data sets will be publicly available, please provide information about licenses. This will help the community to reuse your components.
- Workflow frameworks used?: Did you use any workflow framework that can automate and customize experiments?
- Archived (provide DOI or stable reference)?: Are your artifacts archived with a DOI or stable reference (relevant only when applying for an “Artifacts Available” badge)? We encourage authors to use Zenodo, which is a publicly funded long-term storage platform that also assigns a DOI for your artifact. Other valid hosting options include institutional repositories and third-party digital repositories (e.g., FigShare, Dryad, Software Heritage, GitHub, or GitLab, but not personal web pages). For repositories that can evolve over time (e.g., GitHub), a stable reference to the evaluated version (e.g., a commit hash) is required.
How to access
When applying for the Artifacts Available badge, describe how readers can access your artifact, for instance:
- Clone the repository from GitHub, GitLab, Bitbucket, or any similar service. Since those are evolving repositories, please provide a stable reference for readers to access (e.g., a commit hash).
- Download a self-contained package from a website (please specify any credentials that might be needed).
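As a rough illustration of what a stable reference can look like in practice, the Python sketch below checks that a reference is a full commit hash rather than a branch name, and builds the commands a reader could run to obtain exactly that version. The repository URL, the target directory name, and the function names are hypothetical placeholders, not part of any required tooling.

```python
import re
from typing import List

# A full Git object name is 40 hex digits (SHA-1) or 64 hex digits (SHA-256 repositories).
_FULL_HASH = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")


def is_full_commit_hash(ref: str) -> bool:
    """Return True if ref looks like a full commit hash, not a branch or tag name."""
    return _FULL_HASH.fullmatch(ref.lower()) is not None


def pinned_clone_commands(repo_url: str, commit: str, target_dir: str = "artifact") -> List[str]:
    """Build shell commands that fetch the repository and check out one fixed commit."""
    if not is_full_commit_hash(commit):
        raise ValueError("expected a full commit hash, got: " + repr(commit))
    return [
        f"git clone {repo_url} {target_dir}",
        f"git -C {target_dir} checkout {commit}",
    ]
```

For example, `pinned_clone_commands("https://example.org/artifact.git", "<full commit hash>")` returns two command strings that could be pasted into the appendix, while passing a branch name such as `main` raises an error.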
Describe any specific hardware features required to evaluate your artifact (vendor, CPU/GPU/FPGA, number of processors/cores, microarchitecture, interconnect, memory, hardware counters, etc).
Please describe the approximate disk space required after unpacking your artifact (to avoid surprises when the artifact requires 100GB of free space). We do not have a strict limit, but we strongly suggest limiting the space to a few GBs and avoiding unnecessary software packages in your VM images.
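One possible way to obtain the approximate figure is to sum the file sizes of the unpacked artifact. The sketch below is only an illustration: the function names are hypothetical, symbolic links are skipped, and the result reflects file sizes rather than exact on-disk block usage.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below root (symbolic links are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def format_gb(num_bytes: int) -> str:
    """Format a byte count as decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{num_bytes / 1e9:.2f} GB"
```

Calling `format_gb(directory_size_bytes("path/to/unpacked/artifact"))` on the unpacked directory gives a number that can be reported directly in the appendix.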
Describe any specific OS and software packages required to evaluate your artifact. This is particularly important if you share your source code and it must be compiled or if you rely on some proprietary software that you cannot include in your package. In such a case, you must describe how to obtain and install all third-party software, data sets, and models. If obtaining them for the reviewers is infeasible (e.g., due to licensing issues such as in the case of SPEC benchmarks) and there are no public alternatives for evaluation, please provide reviewers with remote access to a target evaluation machine instead.
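For artifacts with Python dependencies, a small script can help record the exact OS and package versions used during evaluation. The following is a minimal sketch under that assumption; the function name and the choice of reported fields are illustrative only, and artifacts in other ecosystems would need an equivalent for their own package manager.

```python
import platform
import sys
from importlib import metadata


def collect_environment(package_names):
    """Return the OS, machine type, Python version, and installed package versions."""
    packages = {}
    for name in package_names:
        try:
            packages[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            packages[name] = None  # package is not installed in this environment
    return {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
        "packages": packages,
    }
```

Printing the returned dictionary on the machine where the results were produced gives reviewers a concrete list to compare against their own setup.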
If third-party data sets are not included in your packages (for example, they are very large or proprietary), please provide details about how to download and install them (or remote access to a target machine, if infeasible).
If third-party models are not included in your packages (for example, they are very large or proprietary), please provide details about how to download and install them (or remote access to a target machine, if infeasible).
Security, Privacy, and Ethical Concerns
If there are any special security, privacy, or ethical concerns with running the experiments, please elaborate on them here.
Describe the setup procedures for your artifact targeting novice users (even if you use a VM image or access to a remote machine).
Describe the experimental workflow and how it is implemented, invoked, and customized (if needed), e.g., through OS scripts, an IPython/Jupyter notebook, a portable CK workflow, etc. See past reproduced papers with a similar Artifact Appendix.
Evaluation and Expected Result
Start by listing the main claims in your paper. Next, list your key results and detail how they each support the main claims. Finally, detail all the steps to reproduce each of the key results in your paper by running the artifacts. Describe the expected results and the maximum variation of empirical results (particularly important for performance numbers). See the SIGARCH empirical checklist, the NeurIPS reproducibility checklist and the AE FAQ.
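To make the maximum variation concrete, expected results can be stated as a value plus an allowed relative deviation, and reproduced numbers can be checked against them. The sketch below assumes a relative tolerance is the appropriate measure; the function names, result names, and numbers are made up for illustration.

```python
def relative_deviation(measured: float, expected: float) -> float:
    """Return |measured - expected| / |expected|."""
    if expected == 0:
        raise ValueError("expected value must be non-zero for a relative comparison")
    return abs(measured - expected) / abs(expected)


def find_out_of_tolerance(measured: dict, expected: dict) -> list:
    """Return the names of results that are missing or deviate more than allowed.

    expected maps a result name to a (value, max_relative_variation) pair.
    """
    failures = []
    for name, (value, max_variation) in expected.items():
        if name not in measured or relative_deviation(measured[name], value) > max_variation:
            failures.append(name)
    return failures


# Hypothetical example: runtime may vary by 5%, accuracy by 1%.
expected_results = {"runtime_s": (10.0, 0.05), "accuracy": (0.90, 0.01)}
measured_results = {"runtime_s": 10.4, "accuracy": 0.80}
print(find_out_of_tolerance(measured_results, expected_results))
```

In this made-up example the runtime deviates by 4% and passes, while the accuracy deviates by roughly 11% and is reported as out of tolerance.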
This section is currently optional, since customization is not always trivial. If possible, describe how to customize your workflow, e.g., whether it is possible to use different data sets, benchmarks, real applications, predictive models, software environments (compilers, libraries, run-time systems), hardware, etc. Also describe whether it is possible to parameterize your workflow (whatever is applicable, such as changing the number of threads, applying different optimizations, CPU/GPU frequency, autotuning scenario, model topology, etc).
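One common way to expose such parameters is through command-line options on the experiment script. The following is a minimal sketch using Python's standard `argparse` module; the option names (`--threads`, `--dataset`, `--repetitions`) and their defaults are hypothetical and would need to match whatever your workflow actually supports.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create a parser exposing a few illustrative experiment parameters."""
    parser = argparse.ArgumentParser(description="Run the artifact's experiments.")
    parser.add_argument("--threads", type=int, default=1,
                        help="number of worker threads to use")
    parser.add_argument("--dataset", default="small",
                        help="name of the data set to run on")
    parser.add_argument("--repetitions", type=int, default=3,
                        help="how many times to repeat each measurement")
    return parser
```

A reviewer could then run, for instance, `python run_experiments.py --threads 4 --dataset large`, where `run_experiments.py` is a placeholder script name, and the appendix can document each option alongside its default.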
You can add informal notes to draw reviewers' attention to specific requirements for evaluating your artifact.
This document was prepared by Grigori Fursin with contributions from Bruce Childers, Michael Heroux, Michela Taufer and other colleagues for ctuning.org, and with contributions from Cristiano Giuffrida and Clémentine Maurice for its adaptation to USENIX Security.