Overview
We welcome contributions to BOOM! This guide covers our development conventions and best practices for error handling, instrumentation, and logging.
Errors, instrumentation, and logging
Here are some conventions we try to adhere to. It’s important to emphasize that none of this is set in stone. We evolve as our needs change and as we discover better ways of doing things.
Error handling
Panicking via Result::unwrap, Result::expect, etc. is almost always discouraged.
All other errors are recoverable errors and should be expressed using custom error types (usually enums) that implement std::error::Error. These errors should be propagated to the caller using the ? operator.
- Using thiserror to help define custom errors is recommended, but not required
- Whether to reuse an existing error type in BOOM versus create a new one isn’t always clear, but can be discussed during review
- We may at some point shift to using the “accumulator” pattern with types like anyhow::Error or eyre::Report, reducing the need for as many custom error types
- Each error is propagated until it’s handled, i.e., when a caller intercepts the error value to either use it for control flow or log it
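As a rough illustration of these conventions, here is a minimal sketch that uses only the standard library. The names CandidateIdError, parse_candidate_id, and describe_candidate are hypothetical and do not come from the BOOM codebase. The hand-written Display, Error, and From impls are the kind of boilerplate thiserror can derive.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type (not from BOOM) for parsing a candidate ID.
#[derive(Debug)]
enum CandidateIdError {
    /// The input was empty or only whitespace.
    Empty,
    /// The input was not a valid integer; keeps the underlying cause.
    Invalid(ParseIntError),
}

impl fmt::Display for CandidateIdError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Empty => write!(f, "candidate ID is empty"),
            Self::Invalid(_) => write!(f, "candidate ID is not a valid integer"),
        }
    }
}

impl Error for CandidateIdError {
    // Expose the underlying cause so it shows up in the cause chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::Empty => None,
            Self::Invalid(e) => Some(e),
        }
    }
}

// Lets `?` convert a ParseIntError into CandidateIdError.
impl From<ParseIntError> for CandidateIdError {
    fn from(e: ParseIntError) -> Self {
        Self::Invalid(e)
    }
}

/// Propagates failures with `?` instead of panicking via `unwrap`.
fn parse_candidate_id(raw: &str) -> Result<u64, CandidateIdError> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        return Err(CandidateIdError::Empty);
    }
    let id = trimmed.parse::<u64>()?;
    Ok(id)
}

/// The caller that finally handles the error by inspecting the value.
fn describe_candidate(raw: &str) -> String {
    match parse_candidate_id(raw) {
        Ok(id) => format!("candidate {id}"),
        Err(e) => format!("skipped: {e}"),
    }
}
```

In this sketch parse_candidate_id never logs or panics. It only returns the error, and describe_candidate is the single place where the error is handled.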
Instrumentation
We use tracing::Spans as the foundation for instrumenting BOOM. Briefly, a span represents a duration, e.g., the execution of a function. Each span has a name and optionally a set of values recorded as fields. Spans can be nested.
When a tracing::Event is created, e.g., using tracing::info! or tracing::error!, it comes with the list of its active parent spans and their fields. This provides structured context that helps make events easier to understand.
Using the instrument attribute
The tracing::instrument attribute is the preferred way to define a span for a function or method. This is not the only way to define a span, and spans are not required to be tied to functions, but using instrument is very convenient.
instrument automatically records each function argument as a field, so we need to be mindful about what gets collected, skipping anything that has a large repr or that doesn’t provide useful context. A notable example is self in methods, which isn’t useful to track, so we skip it by instrumenting methods with #[instrument(skip(self))].
What to instrument
As for deciding what to instrument, we should consider instrumenting:
- Anything that helps provide context for events (particularly ERROR events)
- Anything that we’re interested in profiling for performance
  - Subscribers can be configured to consume “span close” events, which include execution time details (see BOOM_SPAN_EVENTS in the README)
  - The tracing_flame crate uses spans to generate flamegraphs
  - Having instrumented functions also unlocks further observability capabilities that we could add in the future, such as metrics
Logging conventions
Logging is done by issuing tracing::Events, as with tracing::info!, tracing::error!, and other similar macros. These events are picked up by a subscriber, which can do various things with them, such as print them to the console or write them to a file. Currently we just log to the console.
Log levels
Strive to log things at the appropriate level:
INFO level
Reserved for reporting major stages in the app lifecycle.
WARN level
For minor issues or unexpected things that aren’t quite errors but are more severe than INFO.
ERROR level
For actual errors. In general, we should only log errors where they’re handled. A function doesn’t need to log an error if it propagates the error back to the caller via ?.
DEBUG and TRACE levels
For when we need to rerun BOOM, usually locally, to get more information.
Logging errors with context
When we log an error we want as much context as possible, including the error itself, the cause chain, and any additional context that can be provided.
- The log_error and as_error macros help standardize how we report errors as ERROR events. See the o11y module for details
- Some errors occur more than once in a given function. log_error/as_error can be used to give each occurrence a unique event callsite and therefore a clear origin when logged
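To make “the cause chain” concrete, here is a minimal std-only sketch that walks an error’s causes through std::error::Error::source and renders them as one string. It is not how log_error or as_error are implemented (see the o11y module for that). The types ConnectionRefused and QueryFailed and the function format_error_chain are hypothetical names used only for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error, used only to build a chain for this sketch.
#[derive(Debug)]
struct ConnectionRefused;

impl fmt::Display for ConnectionRefused {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "connection refused")
    }
}

impl Error for ConnectionRefused {}

/// Hypothetical higher-level error that wraps the low-level cause.
#[derive(Debug)]
struct QueryFailed {
    cause: ConnectionRefused,
}

impl fmt::Display for QueryFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "query failed")
    }
}

impl Error for QueryFailed {
    // Link to the wrapped error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Renders an error followed by each of its causes, outermost first.
fn format_error_chain(err: &dyn Error) -> String {
    let mut rendered = err.to_string();
    let mut cause = err.source();
    while let Some(inner) = cause {
        rendered.push_str(": ");
        rendered.push_str(&inner.to_string());
        cause = inner.source();
    }
    rendered
}
```

For a QueryFailed wrapping a ConnectionRefused, format_error_chain returns "query failed: connection refused". An error type that does not implement source contributes only its own message, which is why preserving the cause in custom error types matters.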
Generating flame graphs
BOOM supports generating a flame graph to visualize performance bottlenecks:
Run BOOM with flame graph generation
Run BOOM with the BOOM_FLAME_FILE environment variable set. This instructs BOOM to generate a flame graph and save the output at the given path. Terminate BOOM when you’re done profiling.