Morten Ofstad, Chief Architect at Bluware
The Volume Data Store (VDS) format represents a revolutionary approach to seismic data organization, combining efficient storage with sophisticated access capabilities. Let’s explore how this innovative structure works and why it matters.
Brick Architecture
The default VDS brick size is 128×128×128 samples, which occupies 8 MB uncompressed when samples are stored as 32-bit floats (128³ samples × 4 bytes). Bricks can be compressed significantly, and their dimensions can be customized to suit the storage and processing environment.
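The 8 MB figure follows directly from the brick dimensions and the sample size. A minimal sketch of that arithmetic is shown here; the function name `brick_bytes` is illustrative and not part of any VDS API.

```python
def brick_bytes(dims=(128, 128, 128), bytes_per_sample=4):
    """Uncompressed size in bytes of one brick with the given dimensions."""
    total = bytes_per_sample
    for d in dims:
        total *= d
    return total


size = brick_bytes()
print(size, size / 2**20)  # 8388608 bytes, i.e. 8 MiB
```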
Brick Size Considerations
VDS brick size plays a crucial role in system performance and resource utilization.
- Larger bricks reduce I/O requests and work well on high-memory systems
- Smaller bricks require more frequent I/O requests but suit memory-constrained environments, giving finer control over resource allocation.
Brick size matters because it determines the request size used by the Bluware compute engine, which directly affects processing efficiency and overall performance in memory-intensive workloads.
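The trade-off between request count and request size can be made concrete with a small calculation. This is a rough sketch only: it assumes cubic, non-overlapping bricks and 32-bit float samples, and the survey dimensions are hypothetical.

```python
import math


def brick_count(volume_dims, brick_dim):
    """Number of cubic bricks needed to cover a volume (partial bricks count as whole)."""
    return math.prod(math.ceil(n / brick_dim) for n in volume_dims)


volume = (1024, 1024, 2048)  # hypothetical survey dimensions in samples
for b in (64, 128, 256):
    n = brick_count(volume, b)
    mib = b**3 * 4 / 2**20  # uncompressed size of one brick, 4 bytes per sample
    print(f"{b}^3 bricks: {n} requests of {mib:g} MiB each")
```

Doubling the brick edge cuts the number of requests by roughly a factor of eight while making each request eight times larger, which is why larger bricks favor high-memory systems.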
Scalable Design
VDS has a scalable architecture that can manage massive datasets, accommodating volumes up to petabytes in size. By using variable-sized compressed bricks, it minimizes data redundancy while maximizing data accessibility. The same approach works for any regular array data, enabling flexible and efficient storage and retrieval across large-scale computing environments.
Flexible Data Structure
VDS offers a flexible data structure that can accommodate any regular array data, with specialized support for seismic processing and interpretation datasets. The system is designed to preserve all metadata during format transcoding, ensuring data integrity and continuity across different processing stages.
Types of VDS Metadata
1. Data-Specific Metadata
- VDS metadata
- Dimensional parameters
- Start and end coordinates
- Processing parameters
2. Global Metadata
Organized in hierarchical categories with key-value pairs:
- Survey Coordinate System
  - Defines bin grid with coordinate reference system (CRS)
  - Enables precise spatial positioning of 3D pre- and post-stack data
- SEG-Y
  - Captures original SEG-Y headers and keys
  - Enables exact restoration of original SEG-Y file
- Trace Coordinates
  - Original coordinates, ensemble (CDP) numbers and energy source points (SP) for all traces/gathers
  - Enables precise spatial positioning of 2D pre- and post-stack data
- Import Information
Metadata Features
The VDS metadata system uses string-based keys organized into named categories. The structure supports a wide range of value types, including:
- Strings
- Numeric types (single and double precision floats, integers, and booleans), as scalars or as 2/3/4-component vectors
- Binary Large Objects (BLOBs)
This extensible design allows for comprehensive workflow customization and enables robust tracking of lineage information.
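To illustrate the shape of this structure, here is a minimal sketch of a category/key/value store. It is a hypothetical model written for this article, not the OpenVDS API: the `MetadataStore` class, the key names, and the sample values are all assumptions, and vectors are modelled as plain tuples.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

# Value kinds named in the text: strings, numeric scalars,
# 2/3/4-component vectors (modelled as tuples), and BLOBs (bytes).
MetadataValue = Union[str, int, float, bool, Tuple[float, ...], bytes]


@dataclass
class MetadataStore:
    """Hypothetical container: category name -> {key -> value}."""
    categories: Dict[str, Dict[str, MetadataValue]] = field(default_factory=dict)

    def set(self, category: str, key: str, value: MetadataValue) -> None:
        # Only 2-, 3-, and 4-component vectors are allowed.
        if isinstance(value, tuple) and len(value) not in (2, 3, 4):
            raise ValueError("vectors must have 2, 3, or 4 components")
        self.categories.setdefault(category, {})[key] = value

    def get(self, category: str, key: str) -> MetadataValue:
        return self.categories[category][key]


meta = MetadataStore()
meta.set("Survey Coordinate System", "Origin", (431961.9, 6348554.9))  # 2-vector
meta.set("Import Information", "InputFile", "survey.segy")             # string
meta.set("SEG-Y", "TextHeader", b"C 1 CLIENT ...")                     # BLOB
meta.set("My Workflow", "ProcessedBy", "denoise-v2")                   # custom lineage entry
```

Because categories are just names, a workflow can add its own category (such as the made-up "My Workflow" above) without touching the standard ones, which is what makes lineage tracking straightforward.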
VDS Best Practices
1. Compression
VDS defaults to virtually lossless wavelet compression with a tolerance of 0.01, which works well across diverse use cases. For archival purposes, lossless wavelet compression is preferred, while uncompressed storage is suitable for temporary data where high-bandwidth I/O is guaranteed. Be cautious about using lossless wavelet compression or uncompressed storage when storing additional dimension groups such as ‘fast slices’. When reading a VDS, always specify an adaptive tolerance appropriate to the use case, such as 1.0 for visualization.
2. Brick Size Optimization
For uncompressed data, 64×64×64 bricks are recommended: with 32-bit float samples, each brick is an object of approximately 1 MB, which is optimal for I/O operations. With wavelet compression, 128×128×128 bricks can similarly produce objects of approximately 1 MB, ensuring efficient data handling and transfer.
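Both recommendations follow from aiming at an object size of about 1 MB. The sketch below picks the brick edge whose stored size is closest to that target. The 8:1 compression ratio is an assumption inferred from the numbers above (an 8 MB brick stored as roughly 1 MB); real ratios depend on the data and the tolerance, and the function names are illustrative.

```python
def stored_brick_bytes(brick_dim, bytes_per_sample=4, compression_ratio=1.0):
    """Approximate stored size of one cubic brick after compression."""
    return brick_dim**3 * bytes_per_sample / compression_ratio


def pick_brick_dim(target_bytes=2**20, compression_ratio=1.0,
                   candidates=(32, 64, 128, 256)):
    """Return the candidate edge length whose stored size is closest to the target."""
    return min(
        candidates,
        key=lambda d: abs(
            stored_brick_bytes(d, compression_ratio=compression_ratio) - target_bytes
        ),
    )


print(pick_brick_dim(compression_ratio=1.0))  # 64  (uncompressed)
print(pick_brick_dim(compression_ratio=8.0))  # 128 (assumed 8:1 wavelet compression)
```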
3. Metadata Management
Keep metadata to a reasonable size (in the megabytes range), since all metadata is read when opening a VDS. Include all OpenVDS-specified categories for seismic data, and use the definitions in the KnownMetadata class instead of string constants. Document any custom metadata clearly so that other users and tools can interpret it.
4. Data Organization
Structure data in optimal brick sizes while balancing compression ratios. When the two conflict, prioritize reading efficiency over writing speed, since this gives smoother data access and retrieval.
5. Format Conversion
During format conversions, it is essential to preserve all critical metadata and verify data integrity after transfers. Maintaining format compatibility is key to ensuring seamless data migration and preventing information loss across different storage and processing systems.
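A post-conversion check might compare both samples and metadata. The sketch below is one possible approach and not a Bluware or OpenVDS tool. It treats the tolerance as a maximum absolute per-sample error, which is an assumption of this example and not a statement of how VDS defines its compression tolerance; the metadata key is made up.

```python
def max_abs_error(original, converted):
    """Largest absolute per-sample difference between two equal-length sequences."""
    if len(original) != len(converted):
        raise ValueError("sample counts differ")
    return max((abs(a - b) for a, b in zip(original, converted)), default=0.0)


def conversion_ok(original, converted, src_meta, dst_meta, tolerance=0.0):
    """True if samples agree within tolerance and no source metadata was lost or changed."""
    samples_ok = max_abs_error(original, converted) <= tolerance
    metadata_ok = all(k in dst_meta and dst_meta[k] == v for k, v in src_meta.items())
    return samples_ok and metadata_ok


src = [0.0, 1.5, -2.25]
dst = [0.0, 1.505, -2.25]            # small error, as after lossy compression
meta = {"InlineSpacing": 12.5}       # hypothetical metadata key

print(conversion_ok(src, dst, meta, dict(meta), tolerance=0.01))  # True
print(conversion_ok(src, dst, meta, {}, tolerance=0.01))          # False: metadata lost
```

For a lossless conversion the default tolerance of 0.0 requires the samples to match exactly.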
Real-World Benefits
VDS offers tangible advantages:
- By implementing optimized brick sizes, the system enables efficient data retrieval while simultaneously reducing storage overhead through intelligent compression techniques.
- The configurable architecture enhances processing performance, allowing for greater flexibility across different computing environments.
- Comprehensive metadata management ensures enhanced data integrity.
- The system’s inherent design supports seamless format compatibility and provides flexible memory management options.
Together, these features make VDS a powerful solution for large-scale data storage and processing challenges across the energy industry.
Looking Forward
VDS’s structured approach to data organization represents more than just efficient storage: it is a comprehensive solution for modern seismic data management. By combining flexible data organization with robust metadata management and customizable brick architectures, VDS provides a foundation for advanced seismic processing and interpretation workflows while maintaining compatibility with existing systems.
The format’s ability to handle massive datasets while preserving all metadata makes it an invaluable tool for organizations dealing with complex seismic data operations. As the industry continues to generate larger and more complex datasets, VDS’s structured approach becomes increasingly crucial for efficient data management and analysis.