diff --git a/tools/sctrace/README.md b/tools/sctrace/README.md
index a993542ea..27bc0c440 100644
--- a/tools/sctrace/README.md
+++ b/tools/sctrace/README.md
@@ -1,189 +1,268 @@
 # Syscall Compatibility Tracer
-Syscall Compatibility Tracer (`sctrace`) is a powerful system call compatibility
-verification tool that analyzes and validates system call against user-defined
-patterns. Written in
-[SCML (System Call Matching Language)](https://asterinas.github.io/book/kernel/linux-compatibility/syscall-flag-coverage/system-call-matching-language.html),
-these patterns describe supported functionality of system calls.
-`sctrace` supports both real-time monitoring of running programs and post-analysis of
-existing trace logs, providing comprehensive insights into system call compatibility
-with intuitive pattern matching and visual feedback.
+Syscall Compatibility Tracer (`sctrace`) is an oracle
+to answer the following questions
+for developers and users of a Linux ABI-compatible OS
+(such as [Asterinas](https://github.com/asterinas/asterinas)):
+**Is a target Linux application supported by the OS?
+If not, where are the gaps?**
-## Features
+## Motivation
-- **Pattern-based filtering**: Define system call patterns using SCML syntax
-- **Dual mode operation**:
-  - Online mode: Real-time tracing of running programs
-  - Offline mode: Analysis of existing strace log files
-- **Multi-threaded support**: Automatic handling of multi-threaded program traces with syscall reconstruction.
-When tracing multi-threaded programs, strace may split system calls across multiple lines due to thread interleaving.
-`sctrace` automatically handles this reconstruction.
-- **Multiple SCML files support**: Specify multiple `.scml` files as arguments to load all of them.
-Each file maintains its own scope for bitflags and struct definitions, preventing cross-file pollution.
+There are tons of Linux ABI-compatible OSes out there:
+some of them have been deployed at scale
+(e.g., [HongMeng Kernel](https://www.usenix.org/conference/osdi24/presentation/chen-haibo) and [gVisor](https://gvisor.dev/)),
+some are in rapid development
+(e.g., [Asterinas](https://github.com/asterinas/asterinas)),
+some target niche markets
+(e.g., [Occlum](https://github.com/occlum/occlum)),
+some are research prototypes
+(e.g., [Graphene](https://grapheneproject.io/)),
+and some are just hobby projects
+(e.g., [Maestro](https://github.com/maestro-os/maestro)).
+They all provide a subset of Linux system calls and features
+so that at least some Linux applications can run on them _unmodified_.
-## How to build and install
+But here is a pain point for the developers and early adopters
+of such a Linux ABI-compatible OS:
+**when a Linux application is to be ported to this (imperfectly)
+Linux ABI-compatible OS,
+how can we know beforehand if the target application
+is supposed to be supported or not?
+And if not, where are the gaps?**
-### Prerequisites
-
-- [**strace**](https://strace.io/) version 5.15 or higher (for online mode)
-  - Debian/Ubuntu: `sudo apt install strace`
-  - Fedora/RHEL: `sudo dnf install strace`
-- Rust toolchain
-
-### Build instructions
-
-Make sure you have Rust installed, then build the project:
+A common practice is to run the target Linux application
+with the classic [strace](https://strace.io/) tool,
+which traces and prints all system calls invoked by an application.
+For example, running a Hello World program with `strace`:
 ```bash
-cargo build --release
+strace ./hello_world
 ```
-The binary will be available at `target/release/sctrace`.
+would generate output as shown below:
-### Installation instructions
-
-To install the binary (for example, to `/usr/local/bin`),
-you can use:
-
-```bash
-sudo cp target/release/sctrace /usr/local/bin/
+```
+execve("./hello_world", ["./hello_world"], 0xffffffd3f710 /* 4 vars */) = 0
+brk(NULL) = 0xaaaabdc1b000
+mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff890f4000
+openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
+read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\360\206\2\0\0\0\0\0"..., 832) = 832
+fstat(3, {st_mode=S_IFREG|0755, st_size=1722920, ...}) = 0
+...
+write(1, "Hello, World!\n", 14) = 14
+exit_group(0) = ?
 ```
-Or you can install from `crates.io` directly (Recommended):
+As `strace` captures all interactions between the application
+and the OS kernel,
+its log provides sufficient information to assess if there are any
+compatibility issues.
+But `strace`-ing a complex application might output millions of lines of log.
+It would be too tedious for a human to review.
+Writing an ad-hoc log processing script or tool would greatly reduce the human labor.
+But this approach is error-prone and its results would be inaccurate
+as it lacks the ground truth about all supported (or unsupported)
+Linux system calls of the target OS.
+
+This is where `sctrace` can be a life saver.
+
+## Introduction
+
+We introduce Syscall Compatibility Tracer (`sctrace`),
+a tool that checks whether all system calls invoked
+by a target Linux application are supported
+by a target Linux ABI-compatible OS or not.
+To achieve this goal,
+it combines the classic `strace` tool
+with a mini domain-specific language called
+[System Call Matching Language (SCML)](https://asterinas.github.io/book/kernel/linux-compatibility/syscall-flag-coverage/system-call-matching-language.html).
+SCML adopts a `strace`-inspired syntax,
+with which one can specify all supported system call patterns
+in a concise, accurate, and human-readable way.
+
+The `sctrace` tool originates from the [Asterinas](https://github.com/asterinas/asterinas) project
+and is released so that it may be useful to the wider OS community.
+
+## Getting Started
+
+### Installation
+
+The `sctrace` tool has two prerequisites:
+
+* [**strace**](https://strace.io/) version 5.15 or higher
+  * Install on Debian/Ubuntu: `sudo apt install strace`
+  * Install on Fedora/RHEL: `sudo dnf install strace`
+* The Rust toolchain (the version `nightly-2025-12-06` is tested; other versions may be supported as well)
+  * Install via [Rustup](https://rustup.rs/): `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
+
+To install the `sctrace` tool, execute the following command:
 ```bash
 cargo install sctrace
 ```
-This will automatically download, build, and install the latest version of `sctrace`.
-
-## Usage
-
-### Basic Syntax
+### Basic syntax
 ```bash
-sctrace [SCML_FILE2 ...] [OPTIONS] -- [program] [args...]
+sctrace ... -- ...
 ```
-### Options
+* `...` gives one or more SCML files
+that specify the supported system call patterns of a target OS.
+* `` gives the name or path of the target Linux program.
+* `...` provides zero, one, or more arguments for ``.
-- `--input `: Specify input file for offline mode
-- `--quiet`: Enable quiet mode (only show unsupported calls)
+### A Hello World Example
-### Online Mode (Real-time tracing)
+As `sctrace` requires SCML files as its input,
+let's write a simple one called `open.scml`:
-Trace a program in real-time:
+```c
+// Some (not all) valid flag patterns for the `open` and `openat` syscalls
+open_flags = O_RDONLY | O_WRONLY | O_RDWR | O_CLOEXEC;
+
+// Some (not all) valid patterns for the `open` syscall
+open(path, flags = );
+// Some (not all) valid patterns for the `openat` syscall
+openat(dirfd, path, flags = );
+```
+
+This file describes some (not all) valid patterns
+for Linux's [`open` and `openat`](https://man7.org/linux/man-pages/man2/open.2.html) system calls.
+SCML's syntax and semantics resemble those of `strace` and the C language.
+For more explanation about SCML syntax,
+see its [documentation](https://asterinas.github.io/book/kernel/linux-compatibility/syscall-flag-coverage/system-call-matching-language.html).
+
+To see `sctrace` in action,
+we now use it to track the execution of a simple command
+that prints out the content of `open.scml`:
 ```bash
-sctrace pattern1.scml pattern2.scml -- ls -la
-sctrace file_ops.scml network.scml --quiet -- ./my_program arg1 arg2
+sctrace open.scml -- cat open.scml
 ```
-### Offline Mode (Log file analysis)
+The output would look like the following:
-Analyze an existing strace log file:
+```
+1045884 execve("/usr/bin/cat", ["cat", "open.scml"], 0x5d08ad413588 /* 25 vars */) = 0 (unsupported)
+1045884 brk(NULL) = 0x5a3f59f86000 (unsupported)
+1045884 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7d321cf1f000 (unsupported)
+1045884 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) (unsupported)
+1045884 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 5
+1045884 fstat(5, {st_mode=S_IFREG|0644, st_size=39847, ...}) = 0 (unsupported)
+1045884 mmap(NULL, 39847, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7d321cf15000 (unsupported)
+1045884 close(5) = 0 (unsupported)
+1045884 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 5
+1045884 read(5, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\243\2\0\0\0\0\0"..., 832) = 832 (unsupported)
+```
+
+The `(unsupported)` tag is appended to almost every system call entry
+(except `open` and `openat`)
+as `sctrace` only recognizes the valid patterns specified by `open.scml`.
+Expanding the pattern rules in the input SCML files would allow `sctrace`
+to recognize more system calls, as we will show later.
+
+## User Guide
+
+### Command-Line Interface (CLI)
+
+The `sctrace` tool supports two modes:
+the *online* and *offline* modes.
+
+#### Online Mode (Real-time Tracking)
+
+In the online mode, `sctrace` tracks a running command in real time.
+This is the mode we described in the Hello World example.
+Its complete CLI syntax is shown below:
 ```bash
-sctrace pattern1.scml pattern2.scml --input trace.log
-sctrace file_ops.scml network.scml --input trace.log --quiet
+sctrace ... [--quiet] -- ...
 ```
-**Note**: When generating strace logs for offline analysis, use `-yy` and `-f` flags:
+If the `--quiet` option is given,
+then only the **unsupported** system calls are shown in the output,
+making it easier to spot compatibility gaps.
+
+#### Offline Mode (Log Analysis)
+
+The offline mode does not run a user-given command;
+instead, it analyzes a user-given `strace` log of a command.
+The CLI syntax is shown below:
 ```bash
-strace -yy -f -o trace.log ls -la
+sctrace ... [--quiet] --input
 ```
-- `-yy`: Print paths associated with file descriptor arguments
-- `-f`: Trace child processes created by fork/vfork/clone
-
-## Examples
-
-### Example 1: Basic File Operations
-
-Create `file_ops.scml`:
-```scml
-openat(dirfd, flags = O_RDONLY | O_WRONLY | O_RDWR, mode);
-read(fd, buf, count = );
-write(fd, buf, count = );
-close(fd);
-```
-
-Run:
-```bash
-sctrace file_ops.scml -- cat /etc/passwd
-```
-
-### Example 2: Network Operations
-
-Create `network.scml`:
-```scml
-socket(domain = AF_INET | AF_INET6, type = SOCK_STREAM | SOCK_DGRAM, protocol);
-connect(sockfd, addr, addrlen);
-send(sockfd, buf, len, flags);
-recv(sockfd, buf, len, flags);
-```
-
-Run:
-```bash
-sctrace network.scml -- curl http://example.com
-```
-
-### Example 3: Using Asterinas Compatibility Patterns
-
-Use the provided directory [syscall-flag-coverage](../../book/src/kernel/linux-compatibility/syscall-flag-coverage) (work in progress) and
-test with various commands:
+The input file `strace_log` is expected to be generated
+using the following specific form of `strace`:
 ```bash
-# Monitor file system operations
-sctrace $(find . -name "*.scml") -- tree .
-
-# Monitor process information calls
-sctrace $(find . -name "*.scml") -- top
-
-# Monitor network operations
-sctrace $(find . -name "*.scml") -- ping 127.0.0.1
+strace -yy -f -o ...
 ```
-### Example 4: Offline Analysis
+The meaning of the `--quiet` option is the same as that in the online mode.
+
+### Using `sctrace` for Asterinas
+
+The syscall coverage of Asterinas has been formally [documented](https://asterinas.github.io/book/kernel/linux-compatibility/syscall-flag-coverage/system-call-matching-language.html) in SCML.
+The entire set of SCML files can be found
+in the [`syscall-flag-coverage/`](../../book/src/kernel/linux-compatibility/syscall-flag-coverage/) directory of the Asterinas book.
+
+To fetch these SCML files, run the following command:
 ```bash
-# Generate trace log
-strace -yy -f -o trace.log ls -la
-
-# Analyze with sctrace
-sctrace patterns.scml --input trace.log
+git clone --depth 1 https://github.com/asterinas/asterinas
+cd asterinas/book/src/kernel/linux-compatibility/syscall-flag-coverage/
 ```
-## Output
+You can now leverage these files to
+check if a program can be ported to Asterinas:
-`sctrace` provides colored output to distinguish between supported and unsupported system calls:
-
-- **Supported calls**: Normal output (or hidden in quiet mode)
-- **Unsupported calls**: Highlighted in red with "unsupported" message
-
-### Example Output
-
-```
-openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
-read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1234
-close(3) = 0
-chmod("/tmp/test", 0755) (unsupported)
+```bash
+sctrace $(find . -name "*.scml") -- ...
 ```
-## Dependencies
+In the Asterinas development Docker image,
+we have pre-installed the `sctrace` tool.
+For convenience,
+the Docker image sets an environment variable called `ASTER_SCML`,
+which is the list of all Asterinas SCML files.
+This helps simplify using `sctrace` for Asterinas.
-- `clap`: Command-line argument parsing
-- `regex`: Regular expression support
-- `nom`: Parser combinator library
-- `nix`: Unix system interface for process management
+```bash
+sctrace $ASTER_SCML [--quiet] -- ...
+sctrace $ASTER_SCML [--quiet] --input
+```
-## Troubleshooting
+### Troubleshooting
-### Permission Issues
-
-For online tracing, you may need elevated privileges:
+For online tracing, you may need elevated privileges
+to attach to the target process using `ptrace`:
 ```bash
 sudo sctrace patterns.scml -- target_program
 ```
+
+## Developer Guide
+
+The source code of `sctrace` resides within the Asterinas project.
+So the first step is to download the Asterinas codebase:
+
+```bash
+git clone https://github.com/asterinas/asterinas
+```
+
+The `sctrace` tool is located in `tools/sctrace/` inside the cloned repository:
+
+```bash
+cd asterinas/tools/sctrace
+```
+
+The tool is written in Rust.
+So you will need to use Cargo to build and test it.
+
+```bash
+cargo build
+cargo test
+```