Roadmap
feat: improve observability
General cleanup/improvements of the observability stack. For more context about those changes please refer to this [document](https://gist.github.com/jeluard/d4034da7290e3702f0166e6bfec539f3).
Switch to the new uplc_turbo to avoid memory leaks
Bring in the fixes from https://github.com/pragma-org/uplc/pull/36
add new track_peers stage
This stage will replace pull+receive_header+validate_header stages as well as incorporate the upstream peer tracking currently performed in the HeadersTree that is employed by select_chain. This is a sketch of the envisioned consensus stages, whereof this PR is the first step. This is also the first step of #658.
add EraName enum and era_name property to era Summary
This will be used in a subsequent PR to validate blocks and headers received from the network.
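As a rough illustration of what an era name on a summary could enable, here is a minimal sketch. The `EraSummary` fields, `era_at` and `header_era_is_valid` are hypothetical names chosen for this example and are not taken from the amaru codebase; only the list of Cardano eras is factual.

```rust
/// Cardano eras in chronological order (the derived `Ord` relies on this order).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EraName {
    Byron,
    Shelley,
    Allegra,
    Mary,
    Alonzo,
    Babbage,
    Conway,
}

/// Hypothetical era summary: slot bounds plus an `era_name` property.
#[derive(Debug, Clone)]
pub struct EraSummary {
    pub era_name: EraName,
    pub start_slot: u64,
    /// `None` means the era is still open-ended.
    pub end_slot: Option<u64>,
}

/// Return the era containing `slot`, if any summary covers it.
pub fn era_at(summaries: &[EraSummary], slot: u64) -> Option<EraName> {
    summaries
        .iter()
        .find(|s| slot >= s.start_slot && s.end_slot.map_or(true, |end| slot < end))
        .map(|s| s.era_name)
}

/// Check that the era claimed by a header matches the era expected at its slot.
pub fn header_era_is_valid(summaries: &[EraSummary], slot: u64, claimed: EraName) -> bool {
    era_at(summaries, slot) == Some(claimed)
}
```

With two summaries covering slots `0..100` (Babbage) and `100..` (Conway), both made-up bounds, `header_era_is_valid(&summaries, 150, EraName::Conway)` holds while the same call with `EraName::Babbage` does not.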
Document the uplc memory leak
We found ourselves in an architecturally awkward situation with Bump in uplc. Let's document some of the decisions and implications.
test: add a test showing that an initiator can eventually connect when the responder is started
The PR makes sure that we can connect to an upstream node if:
feat: finish the implementation of the block fetch protocol
This PR completes (on the responder side) and tests the retrieval of blocks with the blockfetch protocol:
Add ledger checks to the simulation
Add the txsubmission protocol to the simulation
Integrate the new network stack to the simulation
move upstream peer tracking and block validation out of `select_chain` stage
Integrate the new network stack downstream
test: add a proto-ledger simulating roll forwards and backwards
*WIP!!!*
RocksDB error handling, health monitoring, best practices
tracing improvements
While establishing why syncing in the e2e tests is slow, I found the need to improve some observability features:
chore: tidy kernel modules
This is a big and unpleasant commit, but a very necessary one. Multiple things are going on, but before I explain further: why are we doing this?
fix: allow the ledger to roll back a state that has just been rolled forward on the volatile state
This issue was found while testing the reconnection behavior of the node, where we roll forward some blocks and then try to roll back to the last block of the volatile sequence.
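The scenario can be sketched with a toy volatile sequence. This is only an illustration under assumptions: `VolatileSeq`, `RollbackError` and the use of bare slot numbers as points are hypothetical and do not reflect the actual ledger types or the actual fix. The point it shows is that rolling back to the current tip should succeed as a no-op.

```rust
use std::collections::VecDeque;

/// Hypothetical error for a rollback target that is not in the volatile sequence.
#[derive(Debug, PartialEq, Eq)]
pub enum RollbackError {
    UnknownPoint(u64),
}

/// Toy volatile sequence: the most recent points, oldest first.
#[derive(Default)]
pub struct VolatileSeq {
    slots: VecDeque<u64>,
}

impl VolatileSeq {
    /// Append a freshly applied point.
    pub fn roll_forward(&mut self, slot: u64) {
        self.slots.push_back(slot);
    }

    /// The last point of the sequence, if any.
    pub fn tip(&self) -> Option<u64> {
        self.slots.back().copied()
    }

    /// Roll back to `slot`, keeping it as the new tip.
    /// Rolling back to the current tip keeps everything and succeeds.
    pub fn rollback_to(&mut self, slot: u64) -> Result<(), RollbackError> {
        match self.slots.iter().position(|&s| s == slot) {
            Some(idx) => {
                self.slots.truncate(idx + 1);
                Ok(())
            }
            None => Err(RollbackError::UnknownPoint(slot)),
        }
    }
}
```

After rolling forward points 10, 20 and 30, `rollback_to(30)` returns `Ok(())` and leaves the tip at 30, whereas a point that is no longer in the sequence yields `UnknownPoint`.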
initiator mode on incoming connections
initial step of dynamic peer selection
initial set of netsim tests
fix: set the proper max buffer size for the blockfetch and txsubmission protocols
According to [the network specification](https://ouroboros-network.cardano.intersectmbo.org/pdfs/network-spec/network-spec.pdf).
pedantic conway txbody
feat: allow to customize AMARU_TRACE when running e2e tests
Research and draft a (Linear) Leios-ready mempool design
fix scripts/demo JSON handling
`read line` removes backslashes, which corrupts the otherwise valid JSON emitted by our tracing subscriber
Error on preview
Preview with current `main` doesn't sync the whole chain:
Amaru does *not* crash after running out of device space; it gets stuck.
Better observability
Too many similar logs when amaru fails to connect to otel collector
When one starts amaru with `--with-open-telemetry` and the required collector is not available, we get a lot of error messages in the logs:
Generate more recent snapshots & document (manual) bootstrap configuration & steps
downstream connections and integration tests
This PR:
add chainsync impl based on pure-stage
no individual test yet, will test when new stages are complete
fix: rework cli for consistency and UX
Mainly:
perf: tiny memory cleanups
Three very small and unrelated optimizations, each of which reduces the number of short-lived allocations we make without a noticeable effect on performance or total memory usage. It should not be too hard to collect quantitative results if requested.
feat: revise CONTRIBUTING guidelines, README, and issue templates.
This commit also proposes the new maintainers committee members as EDR-018.
Publish on brew
Add bench capabilities
Add a new `bench` subcommand that provides metrics about the current host. Those should be relevant to `amaru` e.g.
Allow to browse static JSON
`amaru` can output JSON traces. Allow opening them and displaying their details.
Allow to browse metrics
OpenTelemetry offers metrics primitives. Create a new screen that shows at a glance the most important ones exposed by amaru. It should leverage ratatui primitives to display metrics.
Visualise current network connections state
Showing the state of incoming/outgoing (or initiator/responder as mini protocols' terminology has it) would be useful to troubleshoot a node's state. In particular, we'll need to be able to trigger disconnections of "faulty" nodes so while we'll obviously trace that, visualising the links from/to a node would be nice.
Publish on crates.io
Will allow use of https://github.com/cargo-bins/cargo-binstall
Add static querying capabilities
`amaru` should offer static querying capabilities, e.g. access to db details
Allow to search DReps by ids
Add VIM keybindings for navigation
Make sure Vim-like keybindings are added on top of the regular keybindings for navigating between the various panels.
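One possible shape for this, as a minimal sketch: the `Key` and `Nav` enums below are hypothetical stand-ins for whatever key and navigation types the terminal UI actually uses. The only thing taken as given is the conventional Vim mapping of `h`, `j`, `k`, `l` to left, down, up, right.

```rust
/// Hypothetical key event: either a printable character or an arrow key.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Char(char),
    Left,
    Right,
    Up,
    Down,
}

/// Hypothetical navigation action between panels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Nav {
    Left,
    Down,
    Up,
    Right,
}

/// Map a key to a navigation action.
/// Arrow keys keep working; `h`/`j`/`k`/`l` are accepted on top of them.
pub fn nav_for(key: Key) -> Option<Nav> {
    match key {
        Key::Left | Key::Char('h') => Some(Nav::Left),
        Key::Down | Key::Char('j') => Some(Nav::Down),
        Key::Up | Key::Char('k') => Some(Nav::Up),
        Key::Right | Key::Char('l') => Some(Nav::Right),
        _ => None,
    }
}
```

Keeping both bindings in a single `match` arm per direction makes it hard for the Vim-style keys and the regular keys to drift apart.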
Add a main screen showing off instances details
Could take inspiration from https://github.com/blinklabs-io/nview
Publish on crates
This is currently blocked by amaru not being published on crates
Introduce indexing
Searching entities by most keys requires a full scan. Introduce a local indexing mechanism to speed things up.
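To make the trade-off concrete, here is a minimal in-memory sketch comparing a full scan with a secondary index. `Entity`, its `owner` field and `OwnerIndex` are hypothetical names for illustration; the real entities and storage layer are not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical entity with a secondary key (`owner`) we want to search by.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entity {
    pub id: u32,
    pub owner: String,
}

/// Baseline: a full scan, linear in the number of entities for every query.
pub fn scan_by_owner<'a>(entities: &'a [Entity], owner: &str) -> Vec<&'a Entity> {
    entities.iter().filter(|e| e.owner == owner).collect()
}

/// Local index from secondary key to positions in the entity list.
pub struct OwnerIndex {
    positions: HashMap<String, Vec<usize>>,
}

impl OwnerIndex {
    /// Build the index with one pass over all entities.
    pub fn build(entities: &[Entity]) -> Self {
        let mut positions: HashMap<String, Vec<usize>> = HashMap::new();
        for (i, e) in entities.iter().enumerate() {
            positions.entry(e.owner.clone()).or_default().push(i);
        }
        OwnerIndex { positions }
    }

    /// Look up matches through the index instead of scanning everything.
    pub fn lookup<'a>(&self, entities: &'a [Entity], owner: &str) -> Vec<&'a Entity> {
        self.positions
            .get(owner)
            .map(|idxs| idxs.iter().map(|&i| &entities[i]).collect::<Vec<_>>())
            .unwrap_or_default()
    }
}
```

The index costs one pass and some memory up front, and it has to be rebuilt or updated whenever the underlying entities change.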
Allow to be used as a library
Make sure `amaru-doctor` can be used as a library in `amaru` directly.
Allow to browse resource deltas between two epochs
Allow fine-grained time analysis of turbo benches.
## Standard bench
Add Value type, 9 new builtins, case on constants
This passes the latest conformance test suite freshly downloaded from the [plutus](https://github.com/IntersectMBO/plutus) repository. Much of this is needed for PlutusV4 support.
Nightly benchmarks
one more place alloc_integer is needed
CI in https://github.com/pragma-org/amaru/pull/669 pointed out that we missed one. It seems this code path is not hit in the tests in this repo.
Make sure integers get dropped by a list to be freed
The integer type stores its data in a separate allocation on the heap, relying on its `Drop` implementation to free that backing data. We put that integer in a `bumpalo::Bump`, which does not run `drop` on its contents. The result is a pretty dramatic memory leak.
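The mechanism can be reproduced with the standard library alone. The sketch below uses a toy arena built on `ManuallyDrop` as a stand-in for a bump allocator that never runs destructors; `BigInt`, `ToyArena` and `count_drops` are hypothetical names, and the tracked list is only one way to get the values dropped, not necessarily how uplc does it.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

/// Stand-in for an integer whose digits live in a separate heap allocation.
#[allow(dead_code)]
struct BigInt {
    digits: Vec<u64>,       // freed only if `Drop` runs
    drops: Rc<Cell<usize>>, // counts destructor runs, for illustration only
}

impl Drop for BigInt {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Toy arena that, like a bump allocator, never runs destructors on its own.
struct ToyArena {
    slots: Vec<ManuallyDrop<BigInt>>,
    /// Slots whose destructor must run when the arena goes away.
    needs_drop: Vec<usize>,
}

impl ToyArena {
    fn new() -> Self {
        ToyArena { slots: Vec::new(), needs_drop: Vec::new() }
    }

    /// Leaky allocation: the value's destructor is never called.
    fn alloc(&mut self, value: BigInt) -> usize {
        self.slots.push(ManuallyDrop::new(value));
        self.slots.len() - 1
    }

    /// Tracked allocation: remember the slot so it is dropped with the arena.
    fn alloc_tracked(&mut self, value: BigInt) -> usize {
        let idx = self.alloc(value);
        self.needs_drop.push(idx);
        idx
    }
}

impl Drop for ToyArena {
    fn drop(&mut self) {
        for idx in self.needs_drop.drain(..) {
            // SAFETY: each index is recorded exactly once, so each value is dropped once.
            unsafe { ManuallyDrop::drop(&mut self.slots[idx]) };
        }
    }
}

/// Allocate `n` integers in a fresh arena, drop the arena, and report how many destructors ran.
fn count_drops(tracked: bool, n: usize) -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let mut arena = ToyArena::new();
        for i in 0..n {
            let value = BigInt { digits: vec![i as u64; 4], drops: Rc::clone(&drops) };
            if tracked {
                arena.alloc_tracked(value);
            } else {
                arena.alloc(value);
            }
        }
    }
    drops.get()
}
```

Calling `count_drops(false, 3)` reports that no destructor ran, so every `digits` buffer is leaked, while `count_drops(true, 3)` reports three destructor runs.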