Quick Start
This guide will get you up and running with gwframe in just a few minutes.
Installation
Reading GWF Files
Basic Reading
import gwframe
# Read a single channel
data = gwframe.read('data.gwf', 'L1:STRAIN')
# Access the data and metadata
print(f"Channel: {data.name}")
print(f"Sample rate: {data.sample_rate} Hz")
print(f"Duration: {data.duration} seconds")
print(f"Data shape: {data.array.shape}")
The read() function returns a TimeSeries object with:
- array: NumPy array containing the data
- name: Channel name
- dtype: NumPy dtype of the samples (mirrors array.dtype)
- start: Start time (GPS seconds)
- dt: Sample spacing (seconds)
- duration: Total duration (seconds)
- sample_rate: Sampling rate (Hz)
- unit: Physical unit
- type: Channel type ('proc', 'adc', or 'sim')
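The timing fields are related by simple arithmetic. The sketch below uses a hypothetical stand-in class (`SeriesTiming`, which is not part of gwframe) to show how `dt`, `duration`, and a per-sample GPS time axis follow from `start`, `sample_rate`, and the number of samples, assuming uniformly sampled data:

```python
from dataclasses import dataclass


@dataclass
class SeriesTiming:
    """Hypothetical stand-in for the timing fields of a TimeSeries."""
    start: float        # GPS seconds of the first sample
    sample_rate: float  # Hz
    num_samples: int

    @property
    def dt(self):
        # Sample spacing is the reciprocal of the sampling rate
        return 1.0 / self.sample_rate

    @property
    def duration(self):
        # Total duration covers num_samples intervals of length dt
        return self.num_samples * self.dt

    def times(self):
        # GPS time of every sample; the last one sits one dt before the end
        return [self.start + i * self.dt for i in range(self.num_samples)]


timing = SeriesTiming(start=1234567890.0, sample_rate=16384.0, num_samples=16384)
print(timing.dt)
print(timing.duration)
print(timing.times()[-1] < timing.start + timing.duration)
```

With a real TimeSeries, the same time axis could presumably be built from `data.start`, `data.dt`, and `len(data.array)`.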
Reading Multiple Channels
# Read all channels
all_data = gwframe.read('data.gwf', channels=None)
for name, ts in all_data.items():
    print(f"{name}: {len(ts.array)} samples")
# Read specific channels
channels = ['L1:STRAIN', 'L1:AUX-CHANNEL']
data_dict = gwframe.read('data.gwf', channels)
Time-Based Slicing
# Read data for a specific time range
data = gwframe.read(
    'multi_frame.gwf',
    'L1:STRAIN',
    start=1234567890.0,  # GPS start time
    end=1234567900.0     # GPS end time
)
This automatically finds, reads, and stitches together all frames overlapping with the requested time range.
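To make the stitching idea concrete, here is a minimal pure-Python sketch of selecting the frames that overlap a requested range and concatenating the relevant samples. It illustrates the concept only and is not gwframe's implementation; the `stitch` helper and the `(frame_start, samples)` tuple layout are hypothetical, and the range is assumed to be half-open, `[start, end)`:

```python
def stitch(frames, start, end, sample_rate):
    """Concatenate the samples of `frames` that fall inside [start, end).

    `frames` is a list of (frame_start, samples) tuples, assumed to be
    contiguous and sorted by start time.
    """
    out = []
    for frame_start, samples in frames:
        frame_end = frame_start + len(samples) / sample_rate
        if frame_end <= start or frame_start >= end:
            continue  # this frame does not overlap the requested range
        # Convert the overlap boundaries into sample indices within the frame
        lo = max(0, round((start - frame_start) * sample_rate))
        hi = min(len(samples), round((end - frame_start) * sample_rate))
        out.extend(samples[lo:hi])
    return out


# Three one-second frames at 4 Hz, each filled with its own frame index
frames = [(100.0 + i, [i] * 4) for i in range(3)]
stitched = stitch(frames, start=100.5, end=102.25, sample_rate=4)
print(stitched)  # [0, 0, 1, 1, 1, 1, 2]
```

The request spans 1.75 s at 4 Hz, so seven samples come back: the second half of the first frame, all of the second, and the first sample of the third.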
Reading from Memory
# Read from a file-like object
with open('data.gwf', 'rb') as f:
    data = gwframe.read(f, 'L1:STRAIN')
# Read from bytes
with open('data.gwf', 'rb') as f:
    gwf_bytes = f.read()
data = gwframe.read_bytes(gwf_bytes, 'L1:STRAIN')
Writing GWF Files
Simple Write
import numpy as np
import gwframe
# Generate one second of a 10 Hz sine wave sampled at 16384 Hz
# (arange / rate keeps the sample spacing at exactly 1/16384 s)
t = np.arange(16384) / 16384
data = np.sin(2 * np.pi * 10 * t)
# Write to file
gwframe.write(
    'output.gwf',
    data,
    start=1234567890.0,  # GPS start time
    sample_rate=16384,   # Hz
    name='L1:TEST',
    unit='strain'
)
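The `start` argument is a GPS time, counted in seconds from 1980-01-06 00:00:00 UTC without leap-second adjustments. If you only have a UTC timestamp, a conversion along the following lines may help. This is a standard-library sketch, not a gwframe function: `utc_to_gps` is a hypothetical helper, and the fixed 18-second offset is an assumption that only holds for UTC dates on or after 2017-01-01.

```python
from datetime import datetime, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
# Assumption: GPS is ahead of UTC by 18 leap seconds, which holds for UTC
# dates on or after 2017-01-01 (until a new leap second is introduced).
GPS_UTC_OFFSET = 18


def utc_to_gps(moment):
    """Convert a timezone-aware UTC datetime to GPS seconds."""
    return (moment - GPS_EPOCH).total_seconds() + GPS_UTC_OFFSET


gps = utc_to_gps(datetime(2019, 2, 18, 23, 31, 12, tzinfo=timezone.utc))
print(gps)  # 1234567890.0
```

For earlier dates, or for production use, a leap-second-aware library is the safer choice.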
Writing Multiple Frames
The key feature of gwframe is efficient multi-frame writing:
with gwframe.FrameWriter('output.gwf') as writer:
    for i in range(100):
        data = np.random.randn(16384)
        writer.write(
            data,
            start=1234567890.0 + i,
            sample_rate=16384,
            name='L1:TEST'
        )
If a context manager doesn't fit your workflow, use open() and close() directly:
writer = gwframe.FrameWriter('output.gwf')
writer.open()
for i in range(100):
    data = np.random.randn(16384)
    writer.write(
        data,
        start=1234567890.0 + i,
        sample_rate=16384,
        name='L1:TEST'
    )
writer.close()
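The loops above generate one second of data per iteration, so each frame starts exactly one second after the previous one. When the data already exists as one long array, the per-frame start times have to be derived from the sample offsets. A minimal sketch of that bookkeeping, using a hypothetical `frame_chunks` helper that is not part of gwframe:

```python
def frame_chunks(samples, start, sample_rate, frame_duration=1.0):
    """Yield (frame_start, chunk) pairs that cover `samples` in order.

    The final chunk is shorter than the others when the number of samples
    is not a multiple of the frame length.
    """
    per_frame = int(round(frame_duration * sample_rate))
    for offset in range(0, len(samples), per_frame):
        # Each frame starts where the previous one ended
        frame_start = start + offset / sample_rate
        yield frame_start, samples[offset:offset + per_frame]


# Ten samples at 4 Hz, split into one-second frames
chunks = list(frame_chunks(list(range(10)), start=1234567890.0, sample_rate=4))
for frame_start, chunk in chunks:
    print(frame_start, chunk)
```

Each `(frame_start, chunk)` pair maps onto one `writer.write(chunk, start=frame_start, ...)` call in the loops shown above.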
Writing Multiple Channels
# Single frame with multiple channels
# (strain_data and aux_data are placeholders for your own NumPy arrays)
gwframe.write(
    'output.gwf',
    channels={
        'L1:STRAIN': strain_data,
        'L1:AUX': aux_data
    },
    start=1234567890.0,
    sample_rate=16384,
    name='L1'
)
Advanced Frame Creation
For more control, use the Frame class:
# Create frame
frame = gwframe.Frame(
    start=1234567890.0,
    duration=1.0,
    name='L1',
    run=1
)
# Add channels
frame.add_channel(
    'L1:STRAIN',
    strain_data,
    sample_rate=16384,
    unit='strain',
    comment='Calibrated strain'
)
# Add metadata
frame.add_history('CREATOR', 'my_pipeline')
frame.add_history('VERSION', '1.0.0')
# Write frame
frame.write('output.gwf')
Inspecting GWF Files
# Get file information
info = gwframe.get_info('data.gwf')
print(f"Number of frames: {info.num_frames}")
for frame in info.frames:
    print(f"Frame {frame.index}: {frame.name} at GPS {frame.start}, duration {frame.duration}s")
# Get available channels
channels = gwframe.get_channels('data.gwf')
for channel in channels:
    print(channel)
Masked Arrays and Invalid Data
ADC channels in GWF files can carry a data-valid flag indicating the entire
channel contains suspect data. By default, reading such channels raises an
InvalidDataError:
import gwframe
# Raises InvalidDataError if the channel is flagged invalid
data = gwframe.read('data.gwf', 'H1:ADC-CHANNEL')
To read the data anyway, pass allow_invalid=True. The result is a NumPy
masked array with all samples masked:
data = gwframe.read('data.gwf', 'H1:ADC-CHANNEL', allow_invalid=True)
print(type(data.array)) # numpy.ma.MaskedArray
When writing masked arrays, the behavior depends on the channel type:
- ADC channels: The channel-level data-valid flag is set. Per-sample mask detail is lost (the entire channel is flagged invalid).
- Proc/sim channels: The mask is discarded (these channel types have no data-valid field in the frame format).
Use on_mask_loss to control what happens when mask information is lost:
import numpy as np
# data and quality_mask are placeholders for your samples and a boolean mask
masked = np.ma.MaskedArray(data, mask=quality_mask)
frame = gwframe.Frame(start=1234567890.0, duration=1.0, name='H1')
# The three calls below are alternatives, shown together for comparison
# Default: warns when mask info is lost
frame.add_channel('H1:TEST', masked, sample_rate=16384,
                  channel_type='proc')
# Raise an error instead
frame.add_channel('H1:TEST', masked, sample_rate=16384,
                  channel_type='proc', on_mask_loss='raise')
# Silently discard the mask
frame.add_channel('H1:TEST', masked, sample_rate=16384,
                  channel_type='proc', on_mask_loss='ignore')
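The `data` and `quality_mask` names in the snippet above are placeholders. The following sketch shows one way such a masked array might be built and inspected with NumPy before it is handed to add_channel. The sample values and the threshold are invented for illustration, and the final line only illustrates what collapsing a per-sample mask into a single channel-level flag means; it is not gwframe's implementation.

```python
import numpy as np

# Invented samples: two obvious glitches among otherwise small values
data = np.array([0.1, 0.2, 9.9, 0.3, 9.9])
quality_mask = data > 1.0  # True marks a suspect sample

masked = np.ma.MaskedArray(data, mask=quality_mask)

print(masked.count())              # number of valid (unmasked) samples
print(np.ma.count_masked(masked))  # number of masked samples
print(masked.mean())               # statistics skip the masked samples

# A channel-level flag keeps only one bit of this information:
# whether any sample at all was masked.
any_masked = bool(np.ma.getmaskarray(masked).any())
print(any_masked)
```

Checking `np.ma.count_masked` before writing is a cheap way to decide whether the `on_mask_loss` setting matters for a given array.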
Data Validation
Enable CRC checksum validation for data integrity:
# Validate checksums when reading
data = gwframe.read(
    'data.gwf',
    'L1:STRAIN',
    validate_checksum=True
)
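As background on what a checksum comparison catches, the sketch below uses the standard library's `zlib.crc32` to show a stored CRC failing to match after a single bit flip. It demonstrates the general principle only: the payload is invented, and the frame format's own checksum is not assumed to be the same CRC variant as `zlib.crc32`.

```python
import zlib

# Invented payload standing in for the bytes of a frame
payload = bytes(range(256)) * 4
stored_crc = zlib.crc32(payload)  # checksum recorded at write time

# Simulate corruption by flipping one bit
corrupted = bytearray(payload)
corrupted[10] ^= 0x01

intact_ok = zlib.crc32(payload) == stored_crc
corrupted_ok = zlib.crc32(bytes(corrupted)) == stored_crc
print(intact_ok)     # True
print(corrupted_ok)  # False
```

Validation costs an extra pass over the data, which is a reasonable guess as to why it is an opt-in flag rather than the default.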
Next Steps
- See the Examples page for more detailed examples
- Check the API Reference for complete documentation
- Read the Migration Guide if you're coming from SWIG bindings