Contributing to OpenNANDLab
Thank you for your interest in contributing. This guide covers everything you need to get from zero to a merged pull request.
1. Quick start
# Fork the repo on GitHub, then:
git clone https://github.com/YOUR_USERNAME/OpenNANDLab.git
cd OpenNANDLab
# Python 3.10+ required
python -m venv .venv
source .venv/bin/activate # Linux / macOS
# .venv\Scripts\activate.bat # Windows CMD
# .venv\Scripts\Activate.ps1 # Windows PowerShell
pip install -e ".[dev]" # installs all dev tools
pre-commit install # installs pre-commit hooks
# Verify setup
pytest --tb=short -q # all tests should pass
mypy src/ # should report 0 errors
If any of these fail on a clean clone, open an issue — that’s a bug.
2. Project structure
OpenNANDLab/
├── src/opennandlab/ # All library code lives here
│ ├── nand/ # Physical device model
│ ├── ftl/ # Flash Translation Layer + GC
│ ├── ecc/ # BCH, LDPC, ECCHandler
│ ├── defect/ # Bad block manager, wear leveling
│ ├── optimization/ # Compression, caching
│ ├── workloads/ # Workload generators, trace replay
│ ├── analytics/ # Metrics, report generation
│ ├── visualization/ # Plotly charts, Streamlit dashboard
│ ├── firmware/ # Spec generation, validation
│ ├── simulator.py # Top-level Simulator class
│ ├── config.py # Pydantic config models
│ └── cli.py # Click CLI entry point
├── tests/
│ ├── unit/ # Module-level unit tests
│ ├── integration/ # End-to-end pipeline tests
│ └── property/ # Hypothesis property-based tests
├── examples/ # Runnable example scripts
├── docs/ # Sphinx source + design docs
├── scripts/ # Benchmark + characterization scripts
├── resources/ # Config templates, images
├── pyproject.toml # Build system + tool config
├── tox.ini # Test environment matrix
└── ARCHITECTURE.md # Internal design reference
3. Development workflow
Branch naming
| Type | Pattern | Example |
|---|---|---|
| Bug fix | | |
| Feature | | |
| Documentation | | |
| Refactor | | |
| Test | | |
Commit messages
Follow Conventional Commits:
feat(ftl): add greedy garbage collector

- Select victim block by max invalid-page count
- Copy valid pages to fresh block before erasing
- Update WAF counter on every GC page move

Closes #42
Types: feat, fix, docs, test, refactor, perf, chore, ci
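A commit header in this format can be checked mechanically. The sketch below is one possible check, not a tool shipped with the project: `HEADER_RE` and `is_valid_header` are hypothetical names, and the set of characters allowed in the scope is an assumption.

```python
import re

# Commit types copied from the list above.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

# Header shape: type, optional "(scope)", then ": " and a non-empty summary.
# The allowed scope characters are an assumption made for this sketch.
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"
    r": (?P<summary>\S.*)$"
)


def is_valid_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return HEADER_RE.match(header) is not None
```

For example, `is_valid_header("feat(ftl): add greedy garbage collector")` is true, while a header such as `"added stuff"` is rejected.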
Running checks locally
# Run all tests
pytest
# Run with coverage report
pytest --cov=src/opennandlab --cov-report=html tests/
# Type checking
mypy src/
# Linting + formatting
ruff check src/ tests/
ruff format src/ tests/
# Full tox matrix (all Python versions)
tox
# Just the property-based tests
pytest tests/property/ -v
# Run a specific benchmark
python scripts/performance_test.py --workload random_write --iterations 10000
4. Code standards
Type annotations
All public functions and methods must be fully annotated:
# Good
def write_page(self, lbn: int, data: bytes) -> None: ...
def read_page(self, lbn: int) -> bytes: ...
# Bad — no annotations
def write_page(self, lbn, data): ...
mypy --strict must pass on all files in src/.
Docstrings (NumPy style)
def rber_model(pe_count: int, cfg: NANDConfig) -> float:
    """
    Compute the raw bit error rate as a function of erase cycle count.

    Uses a Weibull-inspired model where RBER rises from rber_floor
    toward rber_ceil with characteristic lifetime rber_lambda.

    Parameters
    ----------
    pe_count : int
        Number of program/erase cycles the block has undergone.
    cfg : NANDConfig
        NAND configuration containing rber_floor, rber_ceil, rber_lambda.

    Returns
    -------
    float
        Estimated RBER in the range [rber_floor, rber_ceil).

    Examples
    --------
    >>> cfg = NANDConfig()
    >>> rber_model(0, cfg)  # should be close to rber_floor
    1e-08
    """
Data structures
| Use case | Required structure | Complexity |
|---|---|---|
| LRU cache | | O(1) get/put/evict |
| Wear leveling | | O(log N) insert, O(1) peek min |
| L2P mapping | | O(1) random access |
| Free-block pool | | O(1) popleft/append |
Do not use a plain list for wear tracking (linear scan) or a dict for the L2P table (excessive memory overhead vs flat array).
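The required-structure entries are not shown in the table above, so the sketch below uses standard-library containers chosen only to match the listed complexities. Treat these choices as assumptions, not as the project's mandated types. `LRUCache`, `UNMAPPED`, and the sample values are hypothetical.

```python
import heapq
from array import array
from collections import OrderedDict, deque
from typing import Optional


class LRUCache:
    """LRU cache with O(1) get/put/evict, backed by an OrderedDict."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[int, bytes]" = OrderedDict()

    def get(self, key: int) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: int, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


# Wear leveling: min-heap of (erase_count, block_id) tuples.
wear_heap = [(5, 0), (2, 1), (9, 2)]
heapq.heapify(wear_heap)
heapq.heappush(wear_heap, (1, 3))   # O(log N) insert
least_worn = wear_heap[0]           # O(1) peek at the minimum

# L2P mapping: flat array indexed by logical block number.
UNMAPPED = 0xFFFFFFFF               # sentinel value, an assumption of this sketch
l2p = array("L", [UNMAPPED]) * 1024
l2p[42] = 7                         # map LBN 42 to physical page 7

# Free-block pool: FIFO queue of block ids.
free_blocks = deque([10, 11, 12])
next_block = free_blocks.popleft()  # O(1)
free_blocks.append(next_block)      # O(1), e.g. once the block has been erased
```

The flat array stores one machine integer per entry, which is where the memory saving over a dict of Python int objects comes from.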
Error handling
Define domain exceptions in src/opennandlab/exceptions.py:
class OpenNANDLabError(Exception): ...
class UncorrectableECCError(OpenNANDLabError): ...
class BadBlockError(OpenNANDLabError): ...
class UnmappedLBNError(OpenNANDLabError): ...
class NANDReadError(OpenNANDLabError): ...
class GCFailedError(OpenNANDLabError): ...
Never use a bare `except:` or a blanket `except Exception`. Always catch the most specific exception.
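A minimal sketch of catching only the narrowest exception follows. The exception classes are local stand-ins that mirror the hierarchy listed above, and `read_or_none` is a hypothetical helper, not a library function.

```python
class OpenNANDLabError(Exception): ...
class UncorrectableECCError(OpenNANDLabError): ...
class BadBlockError(OpenNANDLabError): ...


def read_or_none(read_fn, lbn: int):
    """Call read_fn(lbn); treat only an uncorrectable ECC failure as lost data."""
    try:
        return read_fn(lbn)
    except UncorrectableECCError:
        # The one failure this caller knows how to handle: report missing data.
        return None
    # BadBlockError and any unexpected exception propagate to the caller.
```

Because only `UncorrectableECCError` is caught, a `BadBlockError` or a genuine programming error still surfaces instead of being swallowed.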
No stubs in critical paths
This is the single most important rule for this project:
# ILLEGAL: this was the v1.1.0 bug
def write_page(self, ...):
    # ... compression ...
    # Perform error correction coding
    # ... (comment only, no implementation)
If you cannot implement something yet, raise NotImplementedError with a descriptive message and link to the tracking issue. Never leave a comment where code should be.
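One way to follow this rule is sketched below. `PageWriterSketch` and its methods are illustrative only, and `TRACKING_ISSUE` is a placeholder to be replaced with the real issue URL.

```python
# Hypothetical placeholder; substitute the real tracking-issue URL.
TRACKING_ISSUE = "https://github.com/<owner>/OpenNANDLab/issues/<number>"


class PageWriterSketch:
    """Illustrative class, not part of the library."""

    def write_page(self, lbn: int, data: bytes) -> None:
        payload = self._compress(data)
        self._encode_ecc(payload)  # fails loudly instead of silently skipping ECC

    def _compress(self, data: bytes) -> bytes:
        return data  # identity stand-in for this sketch

    def _encode_ecc(self, payload: bytes) -> bytes:
        raise NotImplementedError(
            f"ECC encoding is not implemented yet; see {TRACKING_ISSUE}"
        )
```

Any caller that reaches the unfinished step gets an immediate, descriptive failure rather than data written without ECC.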
5. Writing tests
Unit tests
Every module in src/ must have a corresponding tests/unit/test_<module>.py. Tests should be fast (< 100 ms each) and have no filesystem or network I/O.
# tests/unit/test_bch.py
import pytest

from opennandlab.ecc.bch import BCHCodec
from opennandlab.exceptions import UncorrectableECCError


class TestBCHCodec:
    def test_encode_decode_no_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = b"hello NAND world" * 16  # 256 bytes
        codeword = codec.encode(data)
        assert codec.decode(codeword) == data

    def test_corrects_exactly_t_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = bytes(range(256))
        codeword = bytearray(codec.encode(data))
        # Flip exactly t bits, each in a different byte
        for i in range(4):
            codeword[i * 10] ^= 0x01
        assert codec.decode(bytes(codeword)) == data

    def test_raises_on_t_plus_1_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = bytes(256)
        codeword = bytearray(codec.encode(data))
        for i in range(5):  # t + 1 errors
            codeword[i * 10] ^= 0x01
        with pytest.raises(UncorrectableECCError):
            codec.decode(bytes(codeword))
Property-based tests
Use hypothesis for invariant testing. Add all property tests to tests/property/:
# tests/property/test_ecc_properties.py
import random

from hypothesis import given, settings, strategies as st

from opennandlab.ecc.bch import BCHCodec
from opennandlab.exceptions import UncorrectableECCError

# The three import paths below are assumed from the project layout; adjust as needed.
from opennandlab.config import NANDConfig, SimulatorConfig
from opennandlab.nand import rber_model
from opennandlab.simulator import Simulator


@given(
    data=st.binary(min_size=128, max_size=4096),
    num_errors=st.integers(min_value=0, max_value=4),
)
@settings(max_examples=200)
def test_bch_corrects_up_to_t_errors(data: bytes, num_errors: int):
    codec = BCHCodec(m=8, t=4)
    codeword = bytearray(codec.encode(data))
    # Inject num_errors random bit flips, each in a distinct byte
    for pos in random.sample(range(len(codeword)), num_errors):
        codeword[pos] ^= (1 << random.randint(0, 7))
    assert codec.decode(bytes(codeword)) == data


@given(st.integers(min_value=0, max_value=10_000))
def test_rber_monotonically_increases(pe_count: int):
    """RBER must never decrease as P/E count increases."""
    cfg = NANDConfig()
    r1 = rber_model(pe_count, cfg)
    r2 = rber_model(pe_count + 1, cfg)
    assert r2 >= r1


@given(st.integers(min_value=1, max_value=1000))
def test_waf_always_gte_1(num_host_writes: int):
    """Write amplification factor must always be ≥ 1.0."""
    sim = Simulator(SimulatorConfig())
    sim.initialize()
    for i in range(num_host_writes):
        sim.write(lbn=i % 1000, data=bytes(4096))
    assert sim.metrics.waf >= 1.0
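The monotonicity property checked by test_rber_monotonically_increases holds for any curve of the shape the rber_model docstring describes. The following is a minimal sketch of one such curve, not the library's implementation. `NANDConfigSketch`, `rber_sketch`, the `rber_shape` parameter, and every default value except `rber_floor` (taken from the doctest) are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NANDConfigSketch:
    """Stand-in for NANDConfig; defaults other than rber_floor are made up."""

    rber_floor: float = 1e-8      # matches the doctest value in the docstring
    rber_ceil: float = 1e-2       # hypothetical
    rber_lambda: float = 3000.0   # hypothetical characteristic lifetime (P/E cycles)
    rber_shape: float = 2.0       # hypothetical Weibull shape parameter


def rber_sketch(pe_count: int, cfg: NANDConfigSketch) -> float:
    """Weibull-CDF-shaped rise from rber_floor toward rber_ceil."""
    if pe_count < 0:
        raise ValueError("pe_count must be non-negative")
    # Fraction of the floor-to-ceiling gap consumed by wear, in [0, 1).
    wear = 1.0 - math.exp(-((pe_count / cfg.rber_lambda) ** cfg.rber_shape))
    return cfg.rber_floor + (cfg.rber_ceil - cfg.rber_floor) * wear
```

Since the exponent grows with `pe_count`, the wear fraction never decreases, which is exactly what the property test asserts.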
Coverage requirement
New code must not decrease overall coverage below 80%. Check before opening a PR:
pytest --cov=src/opennandlab --cov-fail-under=80 tests/
6. Domain knowledge primer
If you’re new to NAND flash internals, read these before contributing to ECC, FTL, or GC:
Essential concepts:
- NAND pages cannot be overwritten in place. They must be erased first, and erasing is block-granular (256+ pages at once). This is why FTLs exist.
- Every block has a finite P/E cycle limit (~1000 for QLC, ~3000 for TLC, ~10 000 for MLC).
- Raw bit error rate (RBER) increases with wear. Error correction (ECC) masks this, but eventually a block’s errors become uncorrectable.
- Write amplification factor (WAF) = (NAND bytes written) / (host bytes written). WAF = 1 is perfect. Whenever GC relocates valid pages, WAF rises above 1.
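The WAF definition can be worked through with a small example. The helper name `write_amplification` and the page counts are illustrative. The 4096-byte page size matches the payload size used in the property tests but is otherwise an assumption.

```python
def write_amplification(host_bytes: int, nand_bytes: int) -> float:
    """WAF = NAND bytes written / host bytes written."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return nand_bytes / host_bytes


PAGE = 4096            # page size in bytes (assumed)
host_pages = 1000      # pages written by the host
gc_moved_pages = 250   # valid pages relocated by GC (illustrative)

# NAND sees the host writes plus every page GC had to copy.
waf = write_amplification(
    host_bytes=host_pages * PAGE,
    nand_bytes=(host_pages + gc_moved_pages) * PAGE,
)
```

Here `waf` is 1.25: every four host pages cost one extra NAND page write because of garbage collection.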
Recommended reading:
- Agrawal et al., “Design Tradeoffs for SSD Performance”, USENIX ATC 2008
- Kim et al., “A Survey of Flash Translation Layer”, JCST 2009
- Luo et al., “Improving 3D NAND Flash Memory Lifetime…”, arXiv:1807.05140
- docs/design_docs/ in this repository
Useful simulator implementation to study:
MQSim — CMU SAFARI’s C++ SSD simulator
7. Good first issues
Look for issues tagged good first issue on GitHub. Some concrete starter tasks:
| Task | File | Difficulty |
|---|---|---|
| Fix Windows venv command in README | | ⭐ Easy |
| Replace static CI badge with live Actions URL | | ⭐ Easy |
| Add | | ⭐ Easy |
| Split | | ⭐⭐ Medium |
| Implement | | ⭐⭐ Medium |
| Write Hypothesis test for LRU cache | | ⭐⭐ Medium |
| Add retention loss model | | ⭐⭐⭐ Hard |
| Implement Forney’s algorithm in BCH decoder | | ⭐⭐⭐ Hard |
8. Submitting a pull request
- Open an issue first for anything larger than a typo fix.
- Reference the issue in your PR: `Closes #<number>`.
- Fill in the PR template fully: description, testing done, screenshots if UI changes.
- All CI checks must be green before requesting review.
- At least one approving review is required to merge.
- Squash-merge is preferred for feature branches; merge commit for releases.
PR checklist:
- [ ] Tests added / updated for all changed code
- [ ] `mypy src/` passes with zero errors
- [ ] `ruff check src/ tests/` reports no issues
- [ ] Docstrings added to all new public APIs
- [ ] CHANGELOG.md updated under [Unreleased]
- [ ] No placeholder comments in critical paths
9. Decision-making
- Architectural decisions (new modules, data structure choices, API changes): open a GitHub Discussion before writing code.
- Bug fixes: open an issue, comment with your proposed fix, then open a PR.
- Documentation: PRs welcome without prior issue for anything < 100 lines.
- External dependencies: new runtime dependencies require discussion. The project aims to keep `pip install opennandlab` lightweight (< 10 non-stdlib deps).
Questions? Open a GitHub Discussion. Found a security issue? See SECURITY.md.