Anatomy of VDS

Introduction

Bluware Volume Data Store (VDS™) can be used to store data originating from a seismic survey in the SEG-Y format. The resulting VDS supports fast random access and contains sufficient data to convert it back to a bitwise identical SEG-Y file. Because the format is flexible, the balance between performance and disk storage can be tuned to suit a wide variety of workflows.

The conversion from SEG-Y to VDS can be made either by commercial tools built by Bluware or by using the OpenVDS open-source library, available through the Open Group OSDU™ Data Platform.

For details on how the organization described here is physically stored, either in a single-file container on disk or in a cloud object store, refer to the VDS Specification. The VDS Deep Dive white paper gives a full overview of the options available when importing a dataset into VDS.

Example

VDS consists of one or more data channels, each with a name, unit, and data format that determine which values are stored for each position in the VDS. Each channel can be compressed to reduce storage size.

In addition to the channels, VDS can also hold metadata, such as the survey coordinate system used to position the dataset. The VDS Specification defines some mandatory metadata categories that should be present for a seismic dataset.

Figure: How VDS can be used to store the contents of a SEG-Y file, including sufficient data to convert the VDS back to SEG-Y in a bitwise identical fashion.

When SEG-Y is imported to VDS, three channels are created:

The seismic amplitudes.
The SEG-Y trace header, stored per trace.
The trace mask, which indicates if the trace was present or not during conversion.

In addition, the SEG-Y text and binary header information is stored as metadata in VDS.

The channels in VDS have different data and compression formats. The amplitudes are 32-bit floating-point values, compressed using lossless wavelet compression and stored as bricks of 128 × 128 × 128 samples. This layout allows for efficient random queries through the dataset (e.g., sub-volumes or oblique slices).
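
To make the brick layout more concrete, the following minimal sketch shows why it helps random access: a sub-volume request only needs the bricks it overlaps. The function name `bricks_for_subvolume` and the index convention (inclusive start, exclusive stop, in sample indices) are assumptions made for illustration and are not part of the VDS or OpenVDS API.

```python
from itertools import product

BRICK_SIZE = 128  # samples per brick edge, as described in the text


def bricks_for_subvolume(start, stop, brick_size=BRICK_SIZE):
    """Return the (i, j, k) indices of every brick a sub-volume overlaps.

    `start` is inclusive and `stop` is exclusive, both given as
    three sample indices (hypothetical convention for this sketch).
    """
    ranges = []
    for lo, hi in zip(start, stop):
        if hi <= lo:
            return []  # an empty request overlaps no bricks
        first = lo // brick_size        # brick holding the first sample
        last = (hi - 1) // brick_size   # brick holding the last sample
        ranges.append(range(first, last + 1))
    return list(product(*ranges))


# A 200 x 100 x 300 sample request starting at the origin
bricks = bricks_for_subvolume((0, 0, 0), (200, 100, 300))
print(len(bricks))  # 6 bricks: 2 x 1 x 3
```

However large the full dataset is, this request reads six bricks, which is the property that makes sub-volume and oblique-slice queries cheap.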

A hierarchy of lower-resolution copies of the amplitudes, called LODs (Levels of Detail), is also created. LODs support efficient visualization as well as conversion to other formats that require LODs, such as ZGY. The LODs use lossy wavelet compression, so they require minimal storage.
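
As a rough illustration of why an LOD hierarchy is cheap to keep, the sketch below assumes that each level halves every dimension of the previous one (rounding up). That halving factor is an assumption of this example, not something stated here; the VDS Specification is the authority on how LODs are actually built. The function names are hypothetical.

```python
def lod_dimensions(dims, levels):
    """List volume dimensions from LOD 0 (full resolution) to LOD `levels`.

    Assumes each level halves every dimension, rounding up.
    """
    result = [tuple(dims)]
    for _ in range(levels):
        dims = tuple(max(1, (d + 1) // 2) for d in dims)
        result.append(dims)
    return result


def extra_sample_fraction(dims, levels):
    """Samples in all lower-resolution levels relative to LOD 0."""
    all_dims = lod_dimensions(dims, levels)
    counts = [x * y * z for x, y, z in all_dims]
    return sum(counts[1:]) / counts[0]


print(lod_dimensions((1024, 1024, 1024), 3))
print(extra_sample_fraction((1024, 1024, 1024), 3))  # 0.142578125
```

Under this assumption the extra levels together hold less than one seventh of the full-resolution sample count, before any lossy compression is applied.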

In addition to being stored as bricks, the amplitude values can also be duplicated and organized as Fast Slices. This increases performance when querying per-slice data, at the cost of a significant increase in storage requirements. The Fast Slices are also compressed using lossy wavelet compression, which reduces the storage cost of duplicating the slices. The use of Fast Slices is optional.

The SEG-Y trace header consists of 240 bytes per trace and is compressed using lossless ZIP compression. LODs are not added for the SEG-Y trace header.
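
The lossless round trip for trace headers can be sketched with Python's standard `zlib` module, which implements Deflate, the algorithm commonly used inside ZIP files. Using `zlib` as a stand-in for the compression VDS applies is an assumption of this example, as are the synthetic headers: they are zero-filled except for the inline and crossline numbers, placed at the SEG-Y rev 1 locations (bytes 189-192 and 193-196).

```python
import struct
import zlib

TRACE_HEADER_SIZE = 240  # bytes per trace, as described in the text


def make_header(inline, crossline):
    """Build a synthetic 240-byte trace header (illustration only)."""
    header = bytearray(TRACE_HEADER_SIZE)
    # Big-endian 32-bit integers at byte offsets 188 and 192 (0-based)
    struct.pack_into(">ii", header, 188, inline, crossline)
    return bytes(header)


# 10 inlines x 100 crosslines = 1000 traces
headers = b"".join(
    make_header(il, xl) for il in range(100, 110) for xl in range(200, 300)
)

compressed = zlib.compress(headers, 9)
restored = zlib.decompress(compressed)

assert restored == headers  # lossless: every byte is recovered
print(len(headers), len(compressed))
```

Real trace headers carry more varied fields than this synthetic data, so the achieved ratio will differ, but the byte-for-byte round trip is what makes a bitwise identical SEG-Y export possible.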

The trace mask is compressed using run-length encoding, since it typically contains many identical values in a row.
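
A minimal sketch of run-length encoding applied to a trace mask follows. The mask representation (a list of 0 and 1 values) and the function names are assumptions for illustration; the actual on-disk encoding is defined by the VDS Specification.

```python
from itertools import groupby


def rle_encode(mask):
    """Collapse consecutive identical values into (value, count) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(mask)]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# 1000 trace positions, with a gap of 20 missing traces
mask = [1] * 500 + [0] * 20 + [1] * 480
runs = rle_encode(mask)
print(runs)  # [(1, 500), (0, 20), (1, 480)]
assert rle_decode(runs) == mask
```

Here 1000 mask values reduce to three runs, which is why the technique suits data with long stretches of identical values.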

Finally, the metadata is stored as part of VDS. This information is not used directly when reading data, but applications might query and use this information (e.g., the SEG-Y text header might be displayed to show information about the survey).

Conclusion

Organizing data as described above produces a VDS that is directly usable for fast random access and allows a bitwise identical copy of the original SEG-Y to be re-created. If bitwise re-creation is not required, lossy wavelet compression can be used instead, which reduces the storage requirements significantly. For many use cases, a compression ratio of 1:10 is achievable.
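
To put the 1:10 figure in perspective, the back-of-the-envelope sketch below estimates uncompressed sizes for a hypothetical survey of 1000 inlines, 1000 crosslines, and 1500 samples per trace. The survey dimensions and function name are made up for illustration. The estimate uses only the 32-bit amplitudes and 240-byte trace headers mentioned above, and ignores header compression, LODs, and Fast Slices.

```python
BYTES_PER_SAMPLE = 4       # 32-bit floating-point amplitudes
TRACE_HEADER_SIZE = 240    # bytes per trace


def uncompressed_bytes(inlines, crosslines, samples):
    """Return (amplitude_bytes, header_bytes) for a regular survey grid."""
    traces = inlines * crosslines
    return traces * samples * BYTES_PER_SAMPLE, traces * TRACE_HEADER_SIZE


amp, hdr = uncompressed_bytes(1000, 1000, 1500)
lossy_amp = amp // 10  # amplitudes at an assumed 1:10 compression ratio

print(amp)        # 6000000000 bytes of raw amplitudes
print(hdr)        # 240000000 bytes of raw trace headers
print(lossy_amp)  # 600000000 bytes after 1:10 compression
```

In this example the amplitudes dominate the total, so the choice of amplitude compression largely determines the final storage size.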