User Manual
Welcome to OpenNANDLab! This user manual provides a comprehensive guide on how to install, configure, and use the tool effectively.
Table of Contents
Installation
Ensure that you have Python 3.10 or above installed on your system.
Clone the repository:
git clone https://github.com/muditbhargava66/OpenNANDLab.git
Navigate to the project directory:
cd 3d-nand-optimization-tool
Install the required dependencies:
pip install -r requirements.txt
Configuration
OpenNANDLab primarily relies on Pydantic configuration models (src/opennandlab/config.py) to validate setup parameters. However, you can still provide a traditional config.yaml file (and firmware templates like template.yaml) when invoking CLI commands or setting up the simulator instance.
config.yaml
The YAML format is seamlessly validated by the Pydantic SimulatorConfig object. A standard configuration includes:
NAND Configuration
nand_config:
page_size: 4096 # Page size in bytes
block_size: 256 # pages per block
num_blocks: 1024
oob_size: 128
num_planes: 1
Optimization Configuration
optimization_config:
error_correction:
algorithm: "bch" # Options: "bch", "ldpc", "none"
bch_params:
m: 8 # Galois Field parameter (3-16)
t: 4 # Error correction capability
strength: 4 # Error correction strength parameter for template
compression:
algorithm: "lz4" # Options: "lz4", "zstd"
level: 3 # Compression level (1-9)
enabled: true # Enable/disable compression
caching:
capacity: 1024 # Number of entries to cache
policy: "lru" # Options: "lru", "lfu", "fifo", "ttl"
enabled: true # Enable/disable caching
parallelism:
max_workers: 4 # Maximum number of parallel worker threads
wear_leveling:
wear_level_threshold: 1000 # Threshold for wear leveling activation
Firmware Configuration
firmware_config:
version: "1.0.0"
read_retry: true
max_read_retries: 3
data_scrambling: false
Bad Block Management Configuration
bbm_config:
max_bad_blocks: 100
Wear Leveling Configuration
wl_config:
wear_leveling_threshold: 1000
Logging Configuration
logging:
level: "INFO" # Options: "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"
file: "logs/nand_optimization.log"
max_size: 10485760 # 10 MB
backup_count: 5
User Interface Configuration
ui_config:
theme: "light" # Options: "light", "dark"
font_size: 12
window_size: [1200, 800]
Simulation Configuration
simulation:
enabled: true # Use simulator by default for safety
error_rate: 0.0001
initial_bad_block_rate: 0.002
To modify the configuration, open the config.yaml file in a text editor and adjust the values according to your requirements. Save the file after making the desired changes.
template.yaml
The template.yaml file serves as a template for generating the firmware specification. It is located in the docs/resources/config/ directory. The file contains placeholders that are replaced with actual values from the config.yaml file during the firmware specification generation process.
---
firmware_version: "{{ firmware_version }}"
nand_config:
page_size: {{ nand_config.page_size }}
block_size: {{ nand_config.block_size }}
num_blocks: {{ nand_config.num_blocks }}
oob_size: {{ nand_config.oob_size }}
ecc_config:
algorithm: "{{ ecc_config.algorithm }}"
strength: {{ ecc_config.strength }}
bbm_config:
max_bad_blocks: {{ bbm_config.max_bad_blocks }}
wl_config:
wear_leveling_threshold: {{ wl_config.wear_leveling_threshold }}
The placeholders in the template.yaml file correspond to the configuration settings in the config.yaml file. For example, {{ firmware_version }} will be replaced with the actual firmware version specified in the config.yaml file.
You can customize the template.yaml file to match your specific firmware requirements. Modify the file in a text editor, adding or removing placeholders as needed, and save the changes.
Usage
The OpenNANDLab tool can be used through either a graphical user interface (GUI) or a command-line interface (CLI).
Streamlit Dashboard (GUI)
To launch the interactive dashboard, use the following command:
opennandlab dashboard
The dashboard will open in your default web browser, providing an intuitive interface to interact with the tool:
Simulation Configuration: Adjust NAND cell types, GC policies, and device constraints in real time via the sidebar.
Metrics & Visualization: Track live telemetry including WAF, host writes, and free pages.
Wear Distribution: View dynamic heatmaps of block erase counts.
Command-Line Interface (CLI)
OpenNANDLab provides a unified click-based command-line interface. To run simulations and benchmarks:
Run a single simulation:
opennandlab run --config config.yaml --workload random_write --iterations 500
Run standard benchmarks:
opennandlab benchmark --config config.yaml
Characterize NAND behavior:
opennandlab characterize --samples 100 --output-dir results/
For detailed options on each command, run opennandlab --help or opennandlab <command> --help.
Optimization Techniques
The OpenNANDLab tool employs various optimization techniques to enhance the performance, reliability, and efficiency of NAND flash storage systems.
NAND Defect Handling
Error Correction
The tool implements advanced error correction algorithms to detect and correct bit errors in NAND flash pages:
BCH (Bose-Chaudhuri-Hocquenghem):
Powerful algebraic error correction code especially effective for random bit errors
Configurable parameters:
m(Galois Field size) andt(error correction capability)Complete implementation with Galois Field arithmetic, generator polynomial calculation, and error location algorithms
LDPC (Low-Density Parity-Check):
High-performance error correction code approaching Shannon limit error-correcting capability
Progressive Edge-Growth (PEG) algorithm for optimized parity-check matrices
Belief propagation algorithm for soft-decision decoding
Configurable parameters for different code rates and error correction capabilities
Bad Block Management
The tool maintains a bad block table and implements strategies for handling blocks that have been marked as bad:
Records and tracks factory-marked bad blocks and runtime-detected bad blocks
Provides methods to mark blocks as bad and check if a block is bad
Implements next-good-block finding algorithms and block replacement strategies
Stores the bad block table in a reserved area of the NAND flash for persistence
Wear Leveling
The tool implements wear leveling algorithms to evenly distribute write and erase operations:
Tracks the erase count and wear level of each block
Dynamic block remapping based on wear thresholds
Statistical wear analysis for optimization decisions
Reserved blocks for wear leveling operations
Performance Optimization
Data Compression
The tool applies data compression to reduce the amount of data written to the NAND flash:
LZ4: Fast compression algorithm with good compression ratio, ideal for storage applications
Zstandard (zstd): Higher compression ratio with reasonable speed, suitable for archival data
Configurable compression levels to balance speed vs. compression ratio
Adaptive compression based on data patterns and achieved ratios
Advanced Caching
The tool implements a sophisticated caching mechanism with multiple eviction policies:
LRU (Least Recently Used): Evicts items that haven’t been accessed recently
LFU (Least Frequently Used): Evicts items that are accessed least often
FIFO (First In First Out): Simple queue-based eviction
TTL (Time To Live): Automatic expiration of stale data
Thread-safe operations for concurrent access
Detailed cache statistics and monitoring
Parallel Access
The tool utilizes parallel access techniques to optimize read and write operations:
Thread pool-based execution for I/O operations
Concurrent operation handling for multi-plane NAND devices
Task submission and management interface
Automatic resource management and clean shutdown
Firmware Integration
Firmware Specification Generation
The tool generates firmware specifications based on the current configuration:
Uses the
template.yamlfile as a template for the firmware specificationReplaces placeholders with actual values from the
config.yamlfileSupports customization of firmware parameters (version, NAND configuration, etc.)
Produces a complete YAML-based firmware specification file
Firmware Specification Validation
The tool provides comprehensive validation of firmware specifications:
Schema validation for structure, required fields, and data types
Semantic validation for format rules and logical constraints
Cross-field validation for parameter compatibility
Detailed error reporting with specific validation failures
Test Benches
The tool includes a framework for executing test benches to validate firmware functionality:
Utilizes the
test_cases.yamlfile to define test casesExecutes the firmware with the provided test cases
Compares actual output with expected output
Reports discrepancies or failures
NAND Characterization
Data Collection
The tool collects data from the NAND flash device to characterize its behavior:
Gathers page data, block metadata, and error correction information
Samples blocks and pages across the device
Records erase counts, bad block status, and other metrics
Stores collected data in CSV format for analysis
Data Analysis
The tool analyzes collected data to extract insights about the NAND flash:
Calculates statistical measures of erase count distribution
Analyzes bad block trends and correlations with erase counts
Identifies patterns and anomalies in the data
Provides metrics for performance and reliability assessment
Data Visualization
The tool generates visualizations to help interpret the analyzed data:
Creates erase count distribution histograms
Plots bad block trend analysis
Visualizes wear leveling effectiveness
Exports visualizations as PNG files for documentation and reporting
Troubleshooting
Common Issues and Solutions
Initialization Failures
Issue: The NAND controller fails to initialize.
Solution:
Check that the configuration file exists and has valid parameters
Verify that all required resources are available in the
resourcesdirectoryRun the tool with the
--check-resourcesflag to create missing resource filesCheck the log file for specific error messages
Read/Write Errors
Issue: Read or write operations fail with error messages.
Solution:
Check if the block is marked as bad using the
is_bad_blockmethodEnsure the block has been erased before writing to it
Verify that the page number is within the valid range for the block
Check if the data size exceeds the page size
Performance Issues
Issue: The tool is running slower than expected.
Solution:
Adjust the compression level in the configuration
Increase the cache capacity for frequently accessed data
Optimize the number of parallel workers based on your system’s capabilities
Disable features that are not needed for your specific use case
GUI Not Displaying Correctly
Issue: The GUI elements are not displaying correctly or are not responsive.
Solution:
Ensure all dependencies are installed, including PyQt5
Try switching to a different theme in the settings
Check if the window size settings are appropriate for your display
Restart the application with the
--guiflag
Error Messages and Their Meanings
“NAND hardware not initialized”
Meaning: The NAND controller has not been initialized yet.
Action: Call the
initialize()method on the NAND controller before performing operations.
“Block X is marked as bad”
Meaning: The specified block is marked as bad in the bad block table.
Action: Use a different block or get the next good block using
get_next_good_block().
“Program operation failed for block X, page Y”
Meaning: Writing to the specified page failed, possibly due to a worn-out block.
Action: Mark the block as bad and use a different block for future operations.
“Erase operation failed for block X”
Meaning: Erasing the specified block failed, possibly due to a hardware issue.
Action: Mark the block as bad and use a different block for future operations.
“Too many errors to correct”
Meaning: The number of bit errors exceeds the error correction capability.
Action: Try reading with read retry if enabled, or use a more powerful error correction algorithm.
FAQ
Q: Can I use this tool with real NAND flash hardware?
A: Yes, the tool supports both simulation mode and real hardware mode. For real hardware, ensure that the hardware interface is properly configured and connected.Q: Which error correction algorithm should I use?
A: BCH is suitable for most applications with moderate error correction needs. LDPC provides stronger error correction capabilities but requires more computational resources.Q: How does the tool handle bad blocks?
A: The tool maintains a bad block table to track blocks that are marked as bad. When a bad block is encountered, the tool automatically finds a replacement block using the bad block management module.Q: Can I customize the firmware specification template?
A: Yes, you can customize thetemplate.yamlfile to include additional parameters or modify existing ones based on your firmware requirements.Q: How do I interpret the NAND characterization results?
A: The characterization results provide insights into the wear distribution and bad block trends of the NAND flash. Higher erase counts indicate more worn blocks, and a positive correlation between erase counts and bad blocks suggests wear-induced failures.Q: Is there a way to speed up the optimization process?
A: Yes, you can adjust several settings to improve performance:Increase the number of parallel workers
Use a faster compression algorithm (LZ4 instead of Zstandard)
Adjust the cache capacity and policy based on your access patterns
Disable features that are not essential for your use case
Q: How can I contribute to the project?
A: Contributions are welcome! Please refer to the README.md file in the project repository for guidelines on how to contribute, report issues, and submit pull requests.Q: Does the tool work with different types of NAND flash?
A: Yes, the tool is designed to work with various types of NAND flash, including SLC, MLC, TLC, and QLC. You can configure the parameters in theconfig.yamlfile to match your specific NAND flash characteristics.
For more detailed information and API references, please refer to the api_reference.md file.