Contributing to OpenNANDLab

Thank you for your interest in contributing. This guide covers everything you need to get from zero to a merged pull request.

Table of Contents

1. Quick start
2. Project structure
3. Development workflow
4. Code standards
5. Writing tests
6. Domain knowledge primer
7. Good first issues
8. Submitting a pull request
9. Decision-making

1. Quick start

# Fork the repo on GitHub, then:
git clone https://github.com/YOUR_USERNAME/OpenNANDLab.git
cd OpenNANDLab

# Python 3.10+ required
python -m venv .venv
source .venv/bin/activate          # Linux / macOS
# .venv\Scripts\activate.bat       # Windows CMD
# .venv\Scripts\Activate.ps1       # Windows PowerShell

pip install -e ".[dev]"            # installs all dev tools
pre-commit install                 # installs pre-commit hooks

# Verify setup
pytest --tb=short -q               # all tests should pass
mypy src/                          # should report 0 errors

If any of these fail on a clean clone, open an issue — that’s a bug.

2. Project structure

OpenNANDLab/
├── src/opennandlab/     # All library code lives here
│   ├── nand/            # Physical device model
│   ├── ftl/             # Flash Translation Layer + GC
│   ├── ecc/             # BCH, LDPC, ECCHandler
│   ├── defect/          # Bad block manager, wear leveling
│   ├── optimization/    # Compression, caching
│   ├── workloads/       # Workload generators, trace replay
│   ├── analytics/       # Metrics, report generation
│   ├── visualization/   # Plotly charts, Streamlit dashboard
│   ├── firmware/        # Spec generation, validation
│   ├── simulator.py     # Top-level Simulator class
│   ├── config.py        # Pydantic config models
│   └── cli.py           # Click CLI entry point
├── tests/
│   ├── unit/            # Module-level unit tests
│   ├── integration/     # End-to-end pipeline tests
│   └── property/        # Hypothesis property-based tests
├── examples/            # Runnable example scripts
├── docs/                # Sphinx source + design docs
├── scripts/             # Benchmark + characterization scripts
├── resources/           # Config templates, images
├── pyproject.toml       # Build system + tool config
├── tox.ini              # Test environment matrix
└── ARCHITECTURE.md      # Internal design reference

3. Development workflow

Branch naming

Type	Pattern	Example
Bug fix	`fix/<short-description>`	`fix/write-page-ecc-missing`
Feature	`feat/<module>-<description>`	`feat/ftl-greedy-gc`
Documentation	`docs/<description>`	`docs/architecture-update`
Refactor	`refactor/<description>`	`refactor/bch-forney-algorithm`
Test	`test/<description>`	`test/hypothesis-ecc-roundtrip`

Commit messages

Follow Conventional Commits:

feat(ftl): add greedy garbage collector

- Select victim block by max invalid-page count
- Copy valid pages to fresh block before erasing
- Update WAF counter on every GC page move

Closes #42

Types: feat, fix, docs, test, refactor, perf, chore, ci
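A commit header in this format can be checked mechanically. The sketch below is a minimal validator written for illustration only: the names `ALLOWED_TYPES`, `HEADER_RE`, and `is_valid_header` are assumptions, not part of the repository's tooling, and the pattern covers only the `type(scope): subject` shape shown above (it does not handle the `!` breaking-change marker from the Conventional Commits specification).

```python
import re

# Types allowed by this guide; the scope in parentheses is optional.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

# Hypothetical pattern: <type>(<scope>): <subject>
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_valid_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return HEADER_RE.match(header) is not None


print(is_valid_header("feat(ftl): add greedy garbage collector"))
print(is_valid_header("feature(ftl): add gc"))
```

The first call matches the example commit above; the second uses a type that is not in the allowed list and is rejected.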

Running checks locally

# Run all tests
pytest

# Run with coverage report
pytest --cov=src/opennandlab --cov-report=html tests/

# Type checking
mypy src/

# Linting + formatting
ruff check src/ tests/
ruff format src/ tests/

# Full tox matrix (all Python versions)
tox

# Just the property-based tests
pytest tests/property/ -v

# Run a specific benchmark
python scripts/performance_test.py --workload random_write --iterations 10000

4. Code standards

Type annotations

All public functions and methods must be fully annotated:

# Good
def write_page(self, lbn: int, data: bytes) -> None: ...
def read_page(self, lbn: int) -> bytes: ...

# Bad — no annotations
def write_page(self, lbn, data): ...

mypy --strict must pass on all files in src/.

Docstrings (NumPy style)

def rber_model(pe_count: int, cfg: NANDConfig) -> float:
    """
    Compute the raw bit error rate as a function of erase cycle count.

    Uses a Weibull-inspired model where RBER rises from rber_floor
    toward rber_ceil with characteristic lifetime rber_lambda.

    Parameters
    ----------
    pe_count : int
        Number of program/erase cycles the block has undergone.
    cfg : NANDConfig
        NAND configuration containing rber_floor, rber_ceil, rber_lambda.

    Returns
    -------
    float
        Estimated RBER in the range [rber_floor, rber_ceil).

    Examples
    --------
    >>> cfg = NANDConfig()
    >>> rber_model(0, cfg)    # should be close to rber_floor
    1e-08
    """

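One way the body behind this docstring could look is sketched below. It is an assumption, not the project's implementation: `NANDConfig` is replaced by a small stand-in dataclass, every default value is invented for the example, and `rber_shape` is a hypothetical extra field for the Weibull shape parameter.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NANDConfig:
    """Stand-in for the project's config model; all defaults are assumed."""
    rber_floor: float = 1e-8      # RBER of a fresh block (assumed value)
    rber_ceil: float = 1e-2       # asymptotic RBER at end of life (assumed value)
    rber_lambda: float = 3000.0   # characteristic lifetime in P/E cycles (assumed value)
    rber_shape: float = 2.0       # Weibull shape parameter (hypothetical field)


def rber_model(pe_count: int, cfg: NANDConfig) -> float:
    """Interpolate between rber_floor and rber_ceil with a Weibull CDF."""
    if pe_count < 0:
        raise ValueError("pe_count must be non-negative")
    # Weibull CDF: exactly 0 at pe_count == 0, approaching 1 as wear grows.
    wear = 1.0 - math.exp(-((pe_count / cfg.rber_lambda) ** cfg.rber_shape))
    return cfg.rber_floor + (cfg.rber_ceil - cfg.rber_floor) * wear
```

With this form the result equals `rber_floor` at zero cycles and never decreases as `pe_count` grows, which is the invariant the property tests in section 5 check.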
Data structures

Use case	Required structure	Complexity
LRU cache	`collections.OrderedDict`	O(1) get/put/evict
Wear leveling	`heapq` min-heap	O(log N) insert, O(1) peek min
L2P mapping	`array.array('i')`	O(1) random access
Free-block pool	`collections.deque`	O(1) popleft/append

Do not use a plain list for wear tracking (linear scan) or a dict for the L2P table (excessive memory overhead vs flat array).
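The four choices above can be sketched with the standard library alone. The class and variable names here (`LRUCache`, `wear_heap`, `l2p`, `free_blocks`, `UNMAPPED`) are illustrative assumptions and do not mirror the project's actual modules.

```python
import heapq
from array import array
from collections import OrderedDict, deque
from typing import Optional


class LRUCache:
    """O(1) get/put/evict, using OrderedDict order as recency order."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: OrderedDict[int, bytes] = OrderedDict()

    def get(self, key: int) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: int, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict least recently used


# Wear leveling: min-heap of (erase_count, block_id); heap[0] is least worn.
wear_heap: list[tuple[int, int]] = [(12, 0), (3, 1), (7, 2)]
heapq.heapify(wear_heap)
least_worn = wear_heap[0]                    # O(1) peek
heapq.heappush(wear_heap, (1, 3))            # O(log N) insert

# L2P mapping: flat signed-int array; -1 marks an unmapped logical block.
UNMAPPED = -1
l2p = array("i", [UNMAPPED]) * 8
l2p[5] = 42                                  # logical block 5 -> physical page 42

# Free-block pool: FIFO queue of erased block ids.
free_blocks: deque[int] = deque([4, 5, 6])
next_block = free_blocks.popleft()           # O(1) allocate
free_blocks.append(9)                        # O(1) return an erased block
```

Note that a heap only gives cheap access to the minimum; if erase counts of blocks already in the heap change, the entries need to be re-pushed or the heap rebuilt.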

Error handling

Define domain exceptions in src/opennandlab/exceptions.py:

class OpenNANDLabError(Exception): ...
class UncorrectableECCError(OpenNANDLabError): ...
class BadBlockError(OpenNANDLabError): ...
class UnmappedLBNError(OpenNANDLabError): ...
class NANDReadError(OpenNANDLabError): ...
class GCFailedError(OpenNANDLabError): ...

Never use a bare except: or a blanket except Exception. Always catch the most specific exception that the code can actually handle.
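As an illustration of the rule above, the sketch below catches only the one failure the caller can recover from. The two exception classes repeat the hierarchy listed earlier so the example is self-contained; `decode_page` and `read_with_fallback` are hypothetical helpers, and treating an all-0xFF page as uncorrectable is an arbitrary stand-in for a real decoder failure.

```python
class OpenNANDLabError(Exception):
    """Base class, mirroring the hierarchy listed above."""


class UncorrectableECCError(OpenNANDLabError):
    """Raised when a codeword has more errors than the ECC can correct."""


def decode_page(raw: bytes) -> bytes:
    """Hypothetical decoder: treats an all-0xFF page as uncorrectable."""
    if raw and all(b == 0xFF for b in raw):
        raise UncorrectableECCError("page exceeds ECC correction capability")
    return raw


def read_with_fallback(raw: bytes, fallback: bytes) -> bytes:
    """Catch only the failure this caller knows how to handle."""
    try:
        return decode_page(raw)
    except UncorrectableECCError:
        # Narrow handler: any other exception (TypeError, BadBlockError, ...)
        # still propagates instead of being silently swallowed.
        return fallback
```

A programming mistake such as passing a non-bytes argument still surfaces as a `TypeError`, which is the behavior a blanket handler would hide.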

No stubs in critical paths

This is the single most important rule for this project:

# ILLEGAL — this was the v1.1.0 bug
def write_page(self, ...):
    # ... compression ...
    # Perform error correction coding
    # ... (comment only, no implementation)

If you cannot implement something yet, raise NotImplementedError with a descriptive message and link to the tracking issue. Never leave a comment where code should be.
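A minimal sketch of the acceptable alternative follows. The function names are hypothetical and the issue reference is deliberately left as a placeholder to fill in; the point is only that the unfinished step fails loudly instead of silently doing nothing.

```python
def ecc_encode(data: bytes) -> bytes:
    """Unfinished step: raises instead of silently skipping the work."""
    raise NotImplementedError(
        "ECC encoding is not implemented yet; "
        "see the tracking issue: <issue-url>"  # placeholder, replace with the real link
    )


def write_page(lbn: int, data: bytes) -> bytes:
    """Return the codeword that would be programmed for logical block lbn."""
    # The call below raises until ecc_encode is implemented, so no caller
    # can mistake an unprotected write for a protected one.
    return ecc_encode(data)
```

Any test or example that reaches `write_page` now fails with a message pointing at the missing piece, rather than passing with data that was never protected.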

5. Writing tests

Unit tests

Every module in src/ must have a corresponding tests/unit/test_<module>.py. Tests should be fast (< 100 ms each) and have no filesystem or network I/O.

# tests/unit/test_bch.py
import pytest
from opennandlab.ecc.bch import BCHCodec
from opennandlab.exceptions import UncorrectableECCError

class TestBCHCodec:
    def test_encode_decode_no_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = b"hello NAND world" * 16   # 256 bytes
        codeword = codec.encode(data)
        assert codec.decode(codeword) == data

    def test_corrects_exactly_t_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = bytes(range(256))
        codeword = bytearray(codec.encode(data))
        # Flip exactly t bits
        for i in range(4):
            codeword[i * 10] ^= 0x01
        assert codec.decode(bytes(codeword)) == data

    def test_raises_on_t_plus_1_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = bytes(256)
        codeword = bytearray(codec.encode(data))
        for i in range(5):    # t + 1 errors
            codeword[i * 10] ^= 0x01
        with pytest.raises(UncorrectableECCError):
            codec.decode(bytes(codeword))

Property-based tests

Use hypothesis for invariant testing. Add all property tests to tests/property/:

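# tests/property/test_ecc_properties.py
import random

from hypothesis import given, settings, strategies as st
from opennandlab.ecc.bch import BCHCodec

# NOTE: the three module paths below are assumed from the project layout
# in section 2; adjust them if the names live elsewhere.
from opennandlab.config import NANDConfig, SimulatorConfig
from opennandlab.nand.reliability import rber_model
from opennandlab.simulator import Simulator

@given(
    data=st.binary(min_size=128, max_size=4096),
    num_errors=st.integers(min_value=0, max_value=4),
    rng=st.randoms(),  # Hypothesis-managed RNG, so failing examples can be replayed
)
@settings(max_examples=200)
def test_bch_corrects_up_to_t_errors(
    data: bytes, num_errors: int, rng: random.Random
) -> None:
    codec = BCHCodec(m=8, t=4)
    codeword = bytearray(codec.encode(data))
    # Flip one random bit in each of num_errors distinct bytes
    for pos in rng.sample(range(len(codeword)), num_errors):
        codeword[pos] ^= 1 << rng.randint(0, 7)
    assert codec.decode(bytes(codeword)) == data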


@given(st.integers(min_value=0, max_value=10_000))
def test_rber_monotonically_increases(pe_count: int):
    """RBER must never decrease as P/E count increases."""
    cfg = NANDConfig()
    r1 = rber_model(pe_count, cfg)
    r2 = rber_model(pe_count + 1, cfg)
    assert r2 >= r1


@given(st.integers(min_value=1, max_value=1000))
def test_waf_always_gte_1(num_host_writes: int):
    """Write amplification factor must always be ≥ 1.0."""
    sim = Simulator(SimulatorConfig())
    sim.initialize()
    for i in range(num_host_writes):
        sim.write(lbn=i % 1000, data=bytes(4096))
    assert sim.metrics.waf >= 1.0

Coverage requirement

New code must not decrease overall coverage below 80%. Check before opening a PR:

pytest --cov=src/opennandlab --cov-fail-under=80 tests/

6. Domain knowledge primer

If you’re new to NAND flash internals, read these before contributing to ECC, FTL, or GC:

Essential concepts:

NAND pages cannot be overwritten — they must be erased first, and erasing is block-granular (256+ pages at once). This is why FTLs exist.
Every block has a finite P/E cycle limit (~1000 for QLC, ~3000 for TLC, ~10 000 for MLC).
Raw bit error rate (RBER) increases with wear. Error correction (ECC) masks this, but eventually a block’s errors become uncorrectable.
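Write amplification factor (WAF) = (NAND bytes written) / (host bytes written). WAF = 1 is the ideal: every host write costs exactly one NAND write. Each valid page that GC relocates adds NAND writes without any host writes, so GC pushes WAF above 1.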
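The WAF definition above can be made concrete with a short worked example. `WafCounter` is a hypothetical helper written for this illustration, not the simulator's metrics class, and returning 1.0 before any host write is a convention chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass
class WafCounter:
    """Hypothetical byte counters; not the simulator's actual metrics class."""
    host_bytes: int = 0
    nand_bytes: int = 0

    def host_write(self, nbytes: int) -> None:
        # A host write is also programmed to NAND once.
        self.host_bytes += nbytes
        self.nand_bytes += nbytes

    def gc_move(self, nbytes: int) -> None:
        # GC relocations program NAND without any new host data.
        self.nand_bytes += nbytes

    @property
    def waf(self) -> float:
        if self.host_bytes == 0:
            return 1.0  # convention chosen for this sketch
        return self.nand_bytes / self.host_bytes


PAGE = 4096
counter = WafCounter()
for _ in range(100):          # host writes 100 pages
    counter.host_write(PAGE)
for _ in range(25):           # GC copies 25 still-valid pages before erasing
    counter.gc_move(PAGE)
print(counter.waf)            # 125 pages programmed / 100 pages requested = 1.25
```

Without the 25 relocations the ratio stays at exactly 1.0, which is why GC victim selection that minimizes valid-page copies lowers WAF.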

Recommended reading:

Agrawal et al., “Design Tradeoffs for SSD Performance” USENIX ATC 2008
Chung et al., “A Survey of Flash Translation Layer” Journal of Systems Architecture 2009
Luo et al., “Improving 3D NAND Flash Memory Lifetime…” arXiv:1807.05140
docs/design_docs/ in this repository

Useful simulator implementation to study:

MQSim — CMU SAFARI’s C++ SSD simulator

7. Good first issues

Look for issues tagged good first issue on GitHub. Some concrete starter tasks:

Task	File	Difficulty
Fix Windows venv command in README	`README.md`	⭐ Easy
Replace static CI badge with live Actions URL	`README.md`	⭐ Easy
Add `constants.py` and move `META_SIGNATURE`	`src/nand_controller.py`	⭐ Easy
Split `initialize()` into helper methods	`src/nand_controller.py`	⭐⭐ Medium
Implement `_scramble_data` (XOR with block/page seed)	`src/nand_controller.py`	⭐⭐ Medium
Write Hypothesis test for LRU cache	`tests/property/`	⭐⭐ Medium
Add retention loss model	`src/opennandlab/nand/reliability.py`	⭐⭐⭐ Hard
Implement Forney’s algorithm in BCH decoder	`src/opennandlab/ecc/bch.py`	⭐⭐⭐ Hard

8. Submitting a pull request

Open an issue first for anything larger than a typo fix.
Reference the issue in your PR: Closes #<number>.
Fill in the PR template fully — description, testing done, screenshots if UI changes.
All CI checks must be green before requesting review.
At least one approving review is required to merge.
Squash-merge is preferred for feature branches; merge commit for releases.

PR checklist:

- [ ] Tests added / updated for all changed code
- [ ] `mypy src/` passes with zero errors
- [ ] `ruff check src/ tests/` reports no issues
- [ ] Docstrings added to all new public APIs
- [ ] CHANGELOG.md updated under [Unreleased]
- [ ] No placeholder comments in critical paths

9. Decision-making

Architectural decisions (new modules, data structure choices, API changes): open a GitHub Discussion before writing code.
Bug fixes: open an issue, comment with your proposed fix, then open a PR.
Documentation: PRs welcome without prior issue for anything < 100 lines.
External dependencies: new runtime dependencies require discussion. The project aims to keep pip install opennandlab lightweight (< 10 non-stdlib deps).

Questions? Open a GitHub Discussion. Found a security issue? See SECURITY.md.