Contributing to OpenNANDLab

Thank you for your interest in contributing. This guide covers everything you need to get from zero to a merged pull request.


Table of Contents

  1. Quick start

  2. Project structure

  3. Development workflow

  4. Code standards

  5. Writing tests

  6. Domain knowledge primer

  7. Good first issues

  8. Submitting a pull request

  9. Decision-making


1. Quick start

# Fork the repo on GitHub, then:
git clone https://github.com/YOUR_USERNAME/OpenNANDLab.git
cd OpenNANDLab

# Python 3.10+ required
python -m venv .venv
source .venv/bin/activate          # Linux / macOS
# .venv\Scripts\activate.bat       # Windows CMD
# .venv\Scripts\Activate.ps1       # Windows PowerShell

pip install -e ".[dev]"            # installs all dev tools
pre-commit install                 # installs pre-commit hooks

# Verify setup
pytest --tb=short -q               # all tests should pass
mypy src/                          # should report 0 errors

If any of these fail on a clean clone, open an issue — that’s a bug.


2. Project structure

OpenNANDLab/
├── src/opennandlab/     # All library code lives here
│   ├── nand/            # Physical device model
│   ├── ftl/             # Flash Translation Layer + GC
│   ├── ecc/             # BCH, LDPC, ECCHandler
│   ├── defect/          # Bad block manager, wear leveling
│   ├── optimization/    # Compression, caching
│   ├── workloads/       # Workload generators, trace replay
│   ├── analytics/       # Metrics, report generation
│   ├── visualization/   # Plotly charts, Streamlit dashboard
│   ├── firmware/        # Spec generation, validation
│   ├── simulator.py     # Top-level Simulator class
│   ├── config.py        # Pydantic config models
│   └── cli.py           # Click CLI entry point
├── tests/
│   ├── unit/            # Module-level unit tests
│   ├── integration/     # End-to-end pipeline tests
│   └── property/        # Hypothesis property-based tests
├── examples/            # Runnable example scripts
├── docs/                # Sphinx source + design docs
├── scripts/             # Benchmark + characterization scripts
├── resources/           # Config templates, images
├── pyproject.toml       # Build system + tool config
├── tox.ini              # Test environment matrix
└── ARCHITECTURE.md      # Internal design reference

3. Development workflow

Branch naming

| Type          | Pattern                     | Example                       |
|---------------|-----------------------------|-------------------------------|
| Bug fix       | fix/<short-description>     | fix/write-page-ecc-missing    |
| Feature       | feat/<module>-<description> | feat/ftl-greedy-gc            |
| Documentation | docs/<description>          | docs/architecture-update      |
| Refactor      | refactor/<description>      | refactor/bch-forney-algorithm |
| Test          | test/<description>          | test/hypothesis-ecc-roundtrip |

Commit messages

Follow Conventional Commits:

feat(ftl): add greedy garbage collector

- Select victim block by max invalid-page count
- Copy valid pages to fresh block before erasing
- Update WAF counter on every GC page move

Closes #42

Types: feat, fix, docs, test, refactor, perf, chore, ci
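
The header format above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the names `HEADER_RE` and `is_valid_header` are hypothetical, and the scope pattern (lowercase letters, digits, `_`, `-`) is an assumption rather than a project rule.

```python
import re

# Commit types listed in this guide; the scope in parentheses is optional.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

# Hypothetical pattern: type, optional (scope), optional "!", then ": subject".
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"
    r"(?P<breaking>!)?: (?P<subject>\S.*)$"
)


def is_valid_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return HEADER_RE.match(header) is not None


if __name__ == "__main__":
    print(is_valid_header("feat(ftl): add greedy garbage collector"))  # True
    print(is_valid_header("added greedy gc"))                          # False
```

A check like this could run from a commit-msg hook, but whether the project's pre-commit configuration already enforces the format is not stated in this guide.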

Running checks locally

# Run all tests
pytest

# Run with coverage report
pytest --cov=src/opennandlab --cov-report=html tests/

# Type checking
mypy src/

# Linting + formatting
ruff check src/ tests/
ruff format src/ tests/

# Full tox matrix (all Python versions)
tox

# Just the property-based tests
pytest tests/property/ -v

# Run a specific benchmark
python scripts/performance_test.py --workload random_write --iterations 10000

4. Code standards

Type annotations

All public functions and methods must be fully annotated:

# Good
def write_page(self, lbn: int, data: bytes) -> None: ...
def read_page(self, lbn: int) -> bytes: ...

# Bad — no annotations
def write_page(self, lbn, data): ...

mypy --strict must pass on all files in src/.

Docstrings (NumPy style)

def rber_model(pe_count: int, cfg: NANDConfig) -> float:
    """
    Compute the raw bit error rate as a function of erase cycle count.

    Uses a Weibull-inspired model where RBER rises from rber_floor
    toward rber_ceil with characteristic lifetime rber_lambda.

    Parameters
    ----------
    pe_count : int
        Number of program/erase cycles the block has undergone.
    cfg : NANDConfig
        NAND configuration containing rber_floor, rber_ceil, rber_lambda.

    Returns
    -------
    float
        Estimated RBER in the range [rber_floor, rber_ceil).

    Examples
    --------
    >>> cfg = NANDConfig()
    >>> rber_model(0, cfg)    # should be close to rber_floor
    1e-08
    """
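
The docstring above describes the model's shape but not its formula. One way such a model might look is sketched below, assuming a Weibull CDF that interpolates between the floor and the ceiling. `NANDConfigSketch`, `rber_model_sketch`, the `rber_shape` field, and all default values except the 1e-08 floor from the doctest are illustrative assumptions, not the project's actual code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NANDConfigSketch:
    """Stand-in for NANDConfig; field names follow the docstring, values are made up."""
    rber_floor: float = 1e-8
    rber_ceil: float = 1e-2
    rber_lambda: float = 3000.0   # characteristic lifetime in P/E cycles
    rber_shape: float = 2.0       # assumed Weibull shape parameter (not in the guide)


def rber_model_sketch(pe_count: int, cfg: NANDConfigSketch) -> float:
    """Weibull-CDF interpolation between rber_floor and rber_ceil."""
    if pe_count < 0:
        raise ValueError("pe_count must be non-negative")
    # Fraction of the floor-to-ceiling gap consumed after pe_count cycles.
    worn = 1.0 - math.exp(-((pe_count / cfg.rber_lambda) ** cfg.rber_shape))
    return cfg.rber_floor + (cfg.rber_ceil - cfg.rber_floor) * worn
```

With this form the result equals `rber_floor` at zero cycles and never decreases as `pe_count` grows, which is the property the monotonicity test in section 5 checks.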

Data structures

| Use case        | Required structure      | Complexity                     |
|-----------------|-------------------------|--------------------------------|
| LRU cache       | collections.OrderedDict | O(1) get/put/evict             |
| Wear leveling   | heapq min-heap          | O(log N) insert, O(1) peek min |
| L2P mapping     | array.array('i')        | O(1) random access             |
| Free-block pool | collections.deque       | O(1) popleft/append            |

Do not use a plain list for wear tracking (linear scan) or a dict for the L2P table (excessive memory overhead vs flat array).
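
The four structures in the table can be sketched with the standard library alone. This is an illustration only: the class name `LRUCache`, the variable names, the sample values, and the use of -1 as an "unmapped" marker are assumptions, not the project's actual implementation.

```python
import heapq
from array import array
from collections import OrderedDict, deque
from typing import Optional


class LRUCache:
    """Least-recently-used cache with O(1) get, put, and evict."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[int, bytes]" = OrderedDict()

    def get(self, lbn: int) -> Optional[bytes]:
        if lbn not in self._items:
            return None
        self._items.move_to_end(lbn)          # mark as most recently used
        return self._items[lbn]

    def put(self, lbn: int, data: bytes) -> None:
        self._items[lbn] = data
        self._items.move_to_end(lbn)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used


# Wear leveling: (erase_count, block_id) tuples in a min-heap.
wear_heap = [(12, 0), (3, 1), (7, 2)]
heapq.heapify(wear_heap)
least_worn = wear_heap[0]                     # O(1) peek at the least-worn block

# L2P mapping: flat signed-int array; -1 marks an unmapped LBN (assumed convention).
l2p = array("i", [-1] * 8)
l2p[5] = 42                                   # LBN 5 -> physical page 42

# Free-block pool: FIFO queue of block ids.
free_blocks = deque([10, 11, 12])
next_block = free_blocks.popleft()            # take the oldest free block
free_blocks.append(next_block)                # return it to the pool after erase
```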

Error handling

Define domain exceptions in src/opennandlab/exceptions.py:

class OpenNANDLabError(Exception): ...
class UncorrectableECCError(OpenNANDLabError): ...
class BadBlockError(OpenNANDLabError): ...
class UnmappedLBNError(OpenNANDLabError): ...
class NANDReadError(OpenNANDLabError): ...
class GCFailedError(OpenNANDLabError): ...

Never use a bare except: or a blanket except Exception. Always catch the most specific exception that the code can actually handle.
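
A short sketch of what catching the most specific exception looks like in practice. The two exception classes mirror the hierarchy above; `decode_page` and `read_with_retry` are hypothetical helpers written for this example and do not exist in the codebase.

```python
from typing import List, Optional


class OpenNANDLabError(Exception): ...
class UncorrectableECCError(OpenNANDLabError): ...


def decode_page(raw: bytes) -> bytes:
    """Hypothetical decoder stand-in: treats an empty page as uncorrectable."""
    if not raw:
        raise UncorrectableECCError("no data to decode")
    return raw


def read_with_retry(raw_reads: List[bytes]) -> bytes:
    """Try successive raw reads, catching only the ECC failure we can handle."""
    last_error: Optional[UncorrectableECCError] = None
    for raw in raw_reads:
        try:
            return decode_page(raw)
        except UncorrectableECCError as exc:   # specific, not `except Exception`
            last_error = exc
    # Every attempt failed: re-raise with the last failure chained as the cause.
    raise UncorrectableECCError("all read attempts failed") from last_error
```

Any other exception, such as a TypeError from a programming mistake, propagates untouched instead of being silently swallowed.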

No stubs in critical paths

This is the single most important rule for this project:

# ILLEGAL — this was the v1.1.0 bug
def write_page(self, ...):
    # ... compression ...
    # Perform error correction coding
    # ... (comment only, no implementation)

If you cannot implement something yet, raise NotImplementedError with a descriptive message and link to the tracking issue. Never leave a comment where code should be.
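
A minimal sketch of the acceptable alternative to a comment-only stub. `ECCHandlerSketch` is a stand-in class for illustration, and the message text is a placeholder; substitute the real tracking-issue link.

```python
class ECCHandlerSketch:
    """Stand-in class for illustration; not the project's ECCHandler."""

    def encode(self, data: bytes) -> bytes:
        # Fail loudly instead of leaving a comment where code should be.
        raise NotImplementedError(
            "ECC encoding is not implemented yet; "
            "tracked in <link to the tracking issue>"
        )
```

A caller that reaches this path fails immediately with a clear message, rather than silently writing unprotected data.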


5. Writing tests

Unit tests

Every module in src/ must have a corresponding tests/unit/test_<module>.py. Tests should be fast (< 100 ms each) and have no filesystem or network I/O.

# tests/unit/test_bch.py
import pytest
from opennandlab.ecc.bch import BCHCodec
from opennandlab.exceptions import UncorrectableECCError

class TestBCHCodec:
    def test_encode_decode_no_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = b"hello NAND world" * 16   # 256 bytes
        codeword = codec.encode(data)
        assert codec.decode(codeword) == data

    def test_corrects_exactly_t_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = bytes(range(256))
        codeword = bytearray(codec.encode(data))
        # Flip exactly t bits
        for i in range(4):
            codeword[i * 10] ^= 0x01
        assert codec.decode(bytes(codeword)) == data

    def test_raises_on_t_plus_1_errors(self):
        codec = BCHCodec(m=8, t=4)
        data = bytes(256)
        codeword = bytearray(codec.encode(data))
        for i in range(5):    # t + 1 errors
            codeword[i * 10] ^= 0x01
        with pytest.raises(UncorrectableECCError):
            codec.decode(bytes(codeword))

Property-based tests

Use hypothesis for invariant testing. Add all property tests to tests/property/:

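# tests/property/test_ecc_properties.py
import random

from hypothesis import given, settings, strategies as st
from opennandlab.ecc.bch import BCHCodec
from opennandlab.exceptions import UncorrectableECCError
# Assumed import paths for the tests further down; adjust to the actual layout.
from opennandlab.config import NANDConfig, SimulatorConfig
from opennandlab.simulator import Simulator
# rber_model must also be imported; its module is not named in this guide.

@given(
    data=st.binary(min_size=128, max_size=4096),
    num_errors=st.integers(min_value=0, max_value=4),
    rng=st.randoms(use_true_random=False),
)
@settings(max_examples=200)
def test_bch_corrects_up_to_t_errors(
    data: bytes, num_errors: int, rng: random.Random
):
    codec = BCHCodec(m=8, t=4)
    codeword = bytearray(codec.encode(data))
    # Flip one random bit in each of num_errors distinct bytes.
    # Drawing randomness from Hypothesis keeps failures reproducible.
    for pos in rng.sample(range(len(codeword)), num_errors):
        codeword[pos] ^= 1 << rng.randint(0, 7)
    assert codec.decode(bytes(codeword)) == data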


@given(st.integers(min_value=0, max_value=10_000))
def test_rber_monotonically_increases(pe_count: int):
    """RBER must never decrease as P/E count increases."""
    cfg = NANDConfig()
    r1 = rber_model(pe_count, cfg)
    r2 = rber_model(pe_count + 1, cfg)
    assert r2 >= r1


@given(st.integers(min_value=1, max_value=1000))
def test_waf_always_gte_1(num_host_writes: int):
    """Write amplification factor must always be ≥ 1.0."""
    sim = Simulator(SimulatorConfig())
    sim.initialize()
    for i in range(num_host_writes):
        sim.write(lbn=i % 1000, data=bytes(4096))
    assert sim.metrics.waf >= 1.0

Coverage requirement

New code must not decrease overall coverage below 80%. Check before opening a PR:

pytest --cov=src/opennandlab --cov-fail-under=80 tests/

6. Domain knowledge primer

If you’re new to NAND flash internals, read these before contributing to ECC, FTL, or GC:

Essential concepts:

  • NAND pages cannot be overwritten — they must be erased first, and erasing is block-granular (256+ pages at once). This is why FTLs exist.

  • Every block has a finite P/E cycle limit (~1000 for QLC, ~3000 for TLC, ~10 000 for MLC).

  • Raw bit error rate (RBER) increases with wear. Error correction (ECC) masks this, but eventually a block’s errors become uncorrectable.

  • Write amplification factor (WAF) = (NAND bytes written) / (host bytes written). WAF = 1 is the ideal. GC pushes WAF above 1 whenever it has to relocate valid pages before erasing a block.
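
The WAF definition in the last bullet can be checked with a little arithmetic. The sketch below assumes every page has the same size and that the only extra NAND writes come from GC page moves; `write_amplification` and `PAGE_SIZE` are illustrative names, not simulator APIs.

```python
PAGE_SIZE = 4096  # bytes per page (illustrative value)


def write_amplification(host_pages: int, gc_pages_moved: int) -> float:
    """WAF = NAND bytes written / host bytes written."""
    if host_pages <= 0:
        raise ValueError("host_pages must be positive")
    # NAND sees the host writes plus every valid page that GC had to copy.
    nand_bytes = (host_pages + gc_pages_moved) * PAGE_SIZE
    host_bytes = host_pages * PAGE_SIZE
    return nand_bytes / host_bytes


print(write_amplification(1000, 0))     # 1.0
print(write_amplification(1000, 250))   # 1.25
```

In the second call, GC copied 250 valid pages while the host wrote 1000, so the NAND absorbed 1250 page writes in total.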

Recommended reading:

  • Agrawal et al., “Design Tradeoffs for SSD Performance” USENIX ATC 2008

  • Chung et al., “A Survey of Flash Translation Layer” Journal of Systems Architecture 2009

  • Luo et al., “Improving 3D NAND Flash Memory Lifetime…” arXiv:1807.05140

  • docs/design_docs/ in this repository

Useful simulator implementation to study:

  • MQSim — CMU SAFARI’s C++ SSD simulator


7. Good first issues

Look for issues tagged good first issue on GitHub. Some concrete starter tasks:

| Task                                                | File                                | Difficulty |
|-----------------------------------------------------|-------------------------------------|------------|
| Fix Windows venv command in README                  | README.md                           | ⭐ Easy     |
| Replace static CI badge with live Actions URL       | README.md                           | ⭐ Easy     |
| Add constants.py and move META_SIGNATURE            | src/nand_controller.py              | ⭐ Easy     |
| Split initialize() into helper methods              | src/nand_controller.py              | ⭐⭐ Medium  |
| Implement _scramble_data (XOR with block/page seed) | src/nand_controller.py              | ⭐⭐ Medium  |
| Write Hypothesis test for LRU cache                 | tests/property/                     | ⭐⭐ Medium  |
| Add retention loss model                            | src/opennandlab/nand/reliability.py | ⭐⭐⭐ Hard   |
| Implement Forney’s algorithm in BCH decoder         | src/opennandlab/ecc/bch.py          | ⭐⭐⭐ Hard   |


8. Submitting a pull request

  1. Open an issue first for anything larger than a typo fix.

  2. Reference the issue in your PR: Closes #<number>.

  3. Fill in the PR template fully — description, testing done, screenshots if UI changes.

  4. All CI checks must be green before requesting review.

  5. At least one approving review is required to merge.

  6. Squash-merge is preferred for feature branches; merge commit for releases.

PR checklist:

- [ ] Tests added / updated for all changed code
- [ ] `mypy src/` passes with zero errors
- [ ] `ruff check src/ tests/` reports no issues
- [ ] Docstrings added to all new public APIs
- [ ] CHANGELOG.md updated under [Unreleased]
- [ ] No placeholder comments in critical paths

9. Decision-making

  • Architectural decisions (new modules, data structure choices, API changes): open a GitHub Discussion before writing code.

  • Bug fixes: open an issue, comment with your proposed fix, then open a PR.

  • Documentation: PRs welcome without prior issue for anything < 100 lines.

  • External dependencies: new runtime dependencies require discussion. The project aims to keep pip install opennandlab lightweight (< 10 non-stdlib deps).


Questions? Open a GitHub Discussion. Found a security issue? See SECURITY.md.