Roadmap


feat: improve observability

General cleanup/improvements of the observability stack. For more context about those changes please refer to this [document](https://gist.github.com/jeluard/d4034da7290e3702f0166e6bfec539f3).

Switch to the new uplc_turbo to avoid memory leaks

Bring in the fixes from https://github.com/pragma-org/uplc/pull/36

add new track_peers stage

This stage will replace pull+receive_header+validate_header stages as well as incorporate the upstream peer tracking currently performed in the HeadersTree that is employed by select_chain. This is a sketch of the envisioned consensus stages, whereof this PR is the first step. ![EE60C7DC-EBBF-440D-8027-88D2387BF137_1_102_a](https://github.com/user-attachments/assets/9e5bf2c3-7f5e-43c6-a71e-ecb5f61d694c) This is also the first step of #658.

add EraName enum and era_name property to era Summary

This will be used in a subsequent PR to validate blocks and headers received from the network.
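A minimal sketch of how an era name attached to an era summary could support that validation. The variants are the publicly known Cardano eras; the `Summary` fields other than `era_name`, the slot bounds, and the helper functions are assumptions made for illustration and are not taken from the amaru codebase.

```rust
/// Era names, following the publicly known Cardano eras.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EraName {
    Byron,
    Shelley,
    Allegra,
    Mary,
    Alonzo,
    Babbage,
    Conway,
}

/// Hypothetical, trimmed-down era summary (the real type carries more data).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Summary {
    /// First slot of the era (inclusive).
    pub start_slot: u64,
    /// First slot after the era (exclusive); `None` while the era is open-ended.
    pub end_slot: Option<u64>,
    pub era_name: EraName,
}

/// Return the name of the era that contains `slot`, if any.
pub fn era_at_slot(summaries: &[Summary], slot: u64) -> Option<EraName> {
    summaries
        .iter()
        .find(|s| slot >= s.start_slot && s.end_slot.map_or(true, |end| slot < end))
        .map(|s| s.era_name)
}

/// Check that a header or block claiming era `claimed` at `slot` agrees with the summaries.
pub fn era_matches(summaries: &[Summary], slot: u64, claimed: EraName) -> bool {
    era_at_slot(summaries, slot) == Some(claimed)
}
```

With two summaries covering slots 0 to 99 (Babbage) and 100 onwards (Conway), `era_matches(&summaries, 150, EraName::Babbage)` returns `false`, which is the kind of mismatch a validation step could reject.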

Document the uplc memory leak

We found ourselves in an architecturally awkward situation with Bump in uplc. Let's document some of the decisions and implications.

test: add a test showing that an initiator can eventually connect when the responder is started

The PR makes sure that we can connect to an upstream node if:

feat: finish the implementation of the block fetch protocol

This PR completes (on the responder side) and tests the retrieval of blocks with the blockfetch protocol:

Add ledger checks to the simulation


Add the txsubmission protocol to the simulation


Integrate the new network stack to the simulation


move upstream peer tracking and block validation out of `select_chain` stage


Integrate the new network stack downstream


test: add a proto-ledger simulating roll forwards and backwards

*WIP!!!*

RocksDB error handling, health monitoring, best practices


tracing improvements

While establishing why syncing in the e2e tests is slow, I found the need to improve some observability features:

chore: tidy kernel modules

This is a big and unpleasant commit; but vastly necessary. Multiple things are going on, but before I explain further: why are we doing this?

fix: allow the ledger to rollback a state that has just been rolled forward on the volatile state

This issue was found while testing the reconnection behavior of the node: we roll forward some blocks, and then try to roll back to the last block of the volatile sequence.
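One plausible reading of the scenario, sketched under assumptions: the volatile state is modelled as a plain sequence of points, and rolling back to the point that is already the tip is treated as a valid no-op rather than an error. `Point`, `VolatileState`, and `RollbackError` are hypothetical names for this sketch, not the ledger's actual types.

```rust
/// Hypothetical chain point, identified by its slot only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Point {
    pub slot: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RollbackError {
    /// The requested point is not part of the volatile sequence.
    UnknownPoint(Point),
}

/// Hypothetical volatile state: the most recent points, oldest first.
#[derive(Default)]
pub struct VolatileState {
    points: Vec<Point>,
}

impl VolatileState {
    pub fn roll_forward(&mut self, point: Point) {
        self.points.push(point);
    }

    pub fn tip(&self) -> Option<Point> {
        self.points.last().copied()
    }

    /// Roll back so that `to` becomes the tip, returning how many points were dropped.
    /// Rolling back to the current tip succeeds and drops nothing.
    pub fn rollback(&mut self, to: Point) -> Result<usize, RollbackError> {
        match self.points.iter().rposition(|p| *p == to) {
            Some(ix) => {
                let dropped = self.points.len() - (ix + 1);
                self.points.truncate(ix + 1);
                Ok(dropped)
            }
            None => Err(RollbackError::UnknownPoint(to)),
        }
    }
}
```

After rolling forward slots 1, 2 and 3, `rollback(Point { slot: 3 })` returns `Ok(0)` and leaves the tip unchanged, while a point that was never seen is still rejected.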

initiator mode on incoming connections


initial step of dynamic peer selection


initial set of netsim tests


fix: set the proper max buffer size for the blockfetch and txsubmission protocols

According to [the network specification](https://ouroboros-network.cardano.intersectmbo.org/pdfs/network-spec/network-spec.pdf).

pedantic conway txbody


feat: allow to customize AMARU_TRACE when running e2e tests


Research and draft a (Linear) Leios-ready mempool design


fix scripts/demo JSON handling

`read line` (without `-r`) strips backslashes, which corrupts the otherwise valid JSON emitted by our tracing subscriber

Error on preview

Preview with current `main` doesn't sync the whole chain:

Amaru does not crash after running out of device space; it gets stuck.


Better observability


Too many similar logs when amaru fails to connect to otel collector

When one starts amaru with `--with-open-telemetry` and the required collector is not available, we get a lot of error messages in the logs:

Generate more recent snapshots & document (manual) bootstrap configuration & steps


downstream connections and integration tests

This PR:

add chainsync impl based on pure-stage

no individual test yet, will test when new stages are complete

fix: rework cli for consistency and UX

Mainly:

perf: tiny memory cleanups

Three very small and unrelated optimizations, each of which reduces the number of short-lived allocations we make without a noticeable difference to performance or total memory usage. It should not be too hard to collect quantitative results if requested.

feat: revise CONTRIBUTING guidelines, README, and issue templates.

This commit also proposes the new maintainers committee members as EDR-018.

Publish on brew

Add bench capacities

Add a new `bench` subcommand that reports metrics about the current host. Those should be relevant to `amaru` e.g.

Allow to browse static JSON

`amaru` can output JSON traces. Allow to open them and display details.

Allow to browse metrics

OpenTelemetry offers metrics primitives. Create a new screen showing at a glance the most important ones exposed by amaru. It should leverage ratatui primitives to display metrics.

Visualise current network connections state

Showing the state of incoming/outgoing connections (or initiator/responder, as the mini-protocols' terminology has it) would be useful to troubleshoot a node's state. In particular, we'll need to be able to trigger disconnections of "faulty" nodes; we'll obviously trace that, but visualising the links from/to a node would also be nice.

Publish on crates.io

Will allow use of https://github.com/cargo-bins/cargo-binstall

Add static querying capacities

`amaru` should offer static querying capacities e.g. access db details

Allow to search DReps by ids

Add VIM keybindings for navigation

Make sure VIM like keybindings are added on top of regular keybindings for navigation between various panels.
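A minimal sketch of layering VIM-style keys on top of the regular ones: both kinds of key resolve to the same navigation action, so the arrow keys keep working. The `Key` and `Nav` enums are hypothetical stand-ins for whatever key-event and action types the application uses.

```rust
/// Hypothetical key event, reduced to what navigation needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Left,
    Right,
    Up,
    Down,
    Char(char),
}

/// Hypothetical navigation action between panels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Nav {
    FocusLeft,
    FocusRight,
    FocusUp,
    FocusDown,
}

/// Map a key to a navigation action; `h`/`j`/`k`/`l` mirror the arrow keys.
pub fn navigation_for(key: Key) -> Option<Nav> {
    match key {
        Key::Left | Key::Char('h') => Some(Nav::FocusLeft),
        Key::Down | Key::Char('j') => Some(Nav::FocusDown),
        Key::Up | Key::Char('k') => Some(Nav::FocusUp),
        Key::Right | Key::Char('l') => Some(Nav::FocusRight),
        _ => None,
    }
}
```

Keeping both bindings in a single `match` arm makes it hard for the two schemes to drift apart when a new panel direction is added.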

Add a main screen showing off instances details

Could take inspiration from https://github.com/blinklabs-io/nview

Publish on crates

This is currently blocked by amaru not being published on crates.io

Introduce indexing

Searching entities by most keys requires a full scan. Introduce a local index mechanism to speed things up.
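To illustrate the trade-off, here is a minimal sketch contrasting a full scan with a lookup through an in-memory index built once up front. `Entity`, its `owner` key, and `OwnerIndex` are hypothetical; the real entities, keys, and storage layer will differ.

```rust
use std::collections::HashMap;

/// Hypothetical entity with a secondary key (`owner`) we want to search by.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entity {
    pub id: u32,
    pub owner: String,
}

/// Baseline: every query walks all entities.
pub fn scan_by_owner<'a>(entities: &'a [Entity], owner: &str) -> Vec<&'a Entity> {
    entities.iter().filter(|e| e.owner == owner).collect()
}

/// Local index: maps each owner to the positions of its entities.
pub struct OwnerIndex {
    positions: HashMap<String, Vec<usize>>,
}

impl OwnerIndex {
    /// Build the index with a single pass over the entities.
    pub fn build(entities: &[Entity]) -> Self {
        let mut positions: HashMap<String, Vec<usize>> = HashMap::new();
        for (ix, entity) in entities.iter().enumerate() {
            positions.entry(entity.owner.clone()).or_default().push(ix);
        }
        Self { positions }
    }

    /// Answer a query by visiting only the matching positions.
    pub fn lookup<'a>(&self, entities: &'a [Entity], owner: &str) -> Vec<&'a Entity> {
        self.positions
            .get(owner)
            .map(|ixs| ixs.iter().map(|&ix| &entities[ix]).collect())
            .unwrap_or_default()
    }
}
```

The index costs one extra pass and some memory, and it has to be rebuilt or updated whenever the underlying entities change; in exchange, each query touches only the matching entries.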

Allow to be used as a library

Make sure `amaru-doctor` can be used as a library in `amaru` directly.

Allow to browse resource deltas between two epochs

Allow fine-grained time analysis of turbo benches.

## Standard bench

Add Value type, 9 new builtins, case on constants

This passes the latest conformance test suite freshly downloaded from the [plutus](https://github.com/IntersectMBO/plutus) repository. Much of this is needed for PlutusV4 support.

Nightly benchmarks

one more place alloc_integer is needed

CI in https://github.com/pragma-org/amaru/pull/669 pointed out that we missed one. It seems this code path is not hit in the tests in this repo.

Make sure integers get dropped by a list to be freed

The integer type stores its data in a separate allocation on the heap, relying on its drop implementation to free that backing data. We put that integer in a `bumpalo::Bump`, which does not run drop on its contents. The result is a pretty dramatic memory leak.
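The mechanism can be reproduced with the standard library alone. The sketch below is an assumption-laden illustration, not the uplc code and not the bumpalo API: `Arena` is a hypothetical stand-in that, like a bump allocator, never runs `Drop` on its contents, and `BigInt` is a stand-in integer that owns a separate heap buffer. Registering values in a drop list that the arena walks when it is dropped restores the missing `Drop` calls.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

/// Stand-in for an integer whose digits live in a separate heap allocation.
pub struct BigInt {
    #[allow(dead_code)]
    digits: Vec<u64>,
    /// Shared counter, bumped each time a `BigInt` is actually dropped.
    freed: Rc<Cell<usize>>,
}

impl Drop for BigInt {
    fn drop(&mut self) {
        self.freed.set(self.freed.get() + 1);
    }
}

/// Hypothetical arena that never runs `Drop` on its values by itself.
pub struct Arena {
    values: Vec<ManuallyDrop<BigInt>>,
    /// Indices of values that must be dropped when the arena goes away.
    drop_list: Vec<usize>,
}

impl Arena {
    pub fn new() -> Self {
        Self { values: Vec::new(), drop_list: Vec::new() }
    }

    /// Plain allocation: the value's `Drop` never runs, so its buffer leaks.
    pub fn alloc(&mut self, value: BigInt) -> usize {
        self.values.push(ManuallyDrop::new(value));
        self.values.len() - 1
    }

    /// Allocation that also registers the value in the drop list.
    pub fn alloc_with_drop(&mut self, value: BigInt) -> usize {
        let ix = self.alloc(value);
        self.drop_list.push(ix);
        ix
    }
}

impl Drop for Arena {
    fn drop(&mut self) {
        for ix in std::mem::take(&mut self.drop_list) {
            // SAFETY: each index is registered at most once and the value is
            // never accessed again after the arena is dropped.
            unsafe { ManuallyDrop::drop(&mut self.values[ix]) };
        }
    }
}

/// Allocate three integers, drop the arena, and report how many were freed.
pub fn freed_after_arena_drop(use_drop_list: bool) -> usize {
    let freed = Rc::new(Cell::new(0));
    {
        let mut arena = Arena::new();
        for n in 0..3u64 {
            let value = BigInt { digits: vec![n; 4], freed: Rc::clone(&freed) };
            if use_drop_list {
                arena.alloc_with_drop(value);
            } else {
                arena.alloc(value);
            }
        }
    }
    freed.get()
}
```

Calling `freed_after_arena_drop(false)` reports that none of the three integers were freed, whereas `freed_after_arena_drop(true)` reports all three.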

fix budget when traversing value structures

Community engagement

We build in public. Want to request something or upvote a feature? Join the conversation on: