BSA Data Extraction¶
LCLS-Live provides simple functions to extract beam synchronous acquisition (BSA) data from archived HDF5 files. These files are stored on SLAC systems, so you need access to those systems to use these functions.
See the documentation: LCLS BEAM SYNCHRONOUS DATASTORE USER GUIDE at: https://www.slac.stanford.edu/grp/ad/docs/model/matlab/bsd.html
In [1]:
%load_ext autoreload
%autoreload 2
BSA snapshots¶
bsa_snapshot is the basic high-level function. It extracts the PV values nearest a given timestamp.
In [2]:
from lcls_live.bsa import bsa_snapshot
In [3]:
?bsa_snapshot
Signature: bsa_snapshot(timestamp, beampath, pvnames=None)
Docstring:
Extract as a snapshot (PV values) nearest a timestamp from a BSA HDF5 file.

Parameters
----------
h5file: str
    BSA HDF5 file with data that includes the timestamp

timestamp: pd.DateTime or datetime.datetime
    This must be localized (not naive time)

pvnames : list or None
    List of PV names to extract. If None, all PVs in the source file will be extracted.
    Optional, default=None

Returns
-------
snapshot: dict
    Dict with:
        'pvdata' : dict of {pv name:pv value}
        'timestamp' : pd.Timestamp, including the nanosecond.
        'source' : Original HDF5 file that the data came from.

Examples
--------
>>>bsa_snapshot('2021-11-11T00:00:00-08:00', 'cu_hxr')

File: ~/GitHub/lcls-live/lcls_live/bsa.py
Type: function
In [4]:
%%time
snapshot = bsa_snapshot('2021-12-11T00:00:00-08:00', 'cu_hxr')
snapshot.keys()
CPU times: user 439 ms, sys: 38.6 ms, total: 478 ms Wall time: 1.28 s
Out[4]:
dict_keys(['pvdata', 'timestamp', 'source'])
In [5]:
# The data is a simple dict
pvdata = snapshot['pvdata']
len(pvdata)
Out[5]:
1091
In [6]:
# Here are a few keys in the dict
list(pvdata)[0:10]
Out[6]:
['ACCL_IN20_300_L0A_A', 'ACCL_IN20_300_L0A_P', 'ACCL_IN20_400_L0B_A', 'ACCL_IN20_400_L0B_P', 'ACCL_LI21_180_L1X_A', 'ACCL_LI21_180_L1X_P', 'ACCL_LI21_1_L1S_A', 'ACCL_LI21_1_L1S_P', 'BLD_SYS0_500_ANG_X', 'BLD_SYS0_500_ANG_Y']
In [7]:
# And some values
for k in list(pvdata)[0:10]:
    print(k, pvdata[k])
ACCL_IN20_300_L0A_A 57.99538201588009
ACCL_IN20_300_L0A_P -0.00917495265120749
ACCL_IN20_400_L0B_A 69.47708616061887
ACCL_IN20_400_L0B_P -2.564164251349297
ACCL_LI21_180_L1X_A 21.016761493860674
ACCL_LI21_180_L1X_P -160.00175793392177
ACCL_LI21_1_L1S_A 111.53502637024258
ACCL_LI21_1_L1S_P -22.394640079900164
BLD_SYS0_500_ANG_X -0.03628770291941744
BLD_SYS0_500_ANG_Y -0.0050121151142197545
In [8]:
# This is the exact timestamp of the extracted data
snapshot['timestamp']
Out[8]:
Timestamp('2021-12-11 08:00:00.003286466+0000', tz='UTC')
In [9]:
# And the original HDF5 source file
snapshot['source']
Out[9]:
'/gpfs/slac/staas/fs1/g/bsd/BSAService/data/2021/12/11/CU_HXR_20211211_080825.h5'
In [10]:
# Note that some values are nan
pvdata['BLM_UNDH_0235_QDCRAW']
Out[10]:
array(nan)
In [11]:
# Pass a list of PV names to extract. Any PV not present in the file is returned as None.
bsa_snapshot('2021-12-11T00:00:00-08:00', 'cu_hxr',
             pvnames=['ACCL_IN20_300_L0A_A', 'ACCL_IN20_300_L0A_P', 'dummy'])
Out[11]:
{'pvdata': {'ACCL_IN20_300_L0A_A': array(57.99538202), 'ACCL_IN20_300_L0A_P': array(-0.00917495), 'dummy': None}, 'timestamp': Timestamp('2021-12-11 08:00:00.003286466+0000', tz='UTC'), 'source': '/gpfs/slac/staas/fs1/g/bsd/BSAService/data/2021/12/11/CU_HXR_20211211_080825.h5'}
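As the outputs above show, a value in 'pvdata' can be NaN, and a requested PV that is missing from the file comes back as None. Before further analysis it can help to drop both. The following is a minimal sketch under those assumptions. The helper name valid_pvdata is hypothetical and not part of lcls_live, and the dict is a small stand-in for a real snapshot['pvdata'].

```python
import numpy as np

# Stand-in for snapshot['pvdata']; a real dict would come from bsa_snapshot.
pvdata = {
    'ACCL_IN20_300_L0A_A': np.array(57.99538201588009),
    'BLM_UNDH_0235_QDCRAW': np.array(np.nan),  # PV present, but value is NaN
    'dummy': None,                             # PV not present in the file
}


def valid_pvdata(pvdata):
    """Hypothetical helper: keep entries that are neither None nor NaN.

    Assumes each value is None or a 0-d NumPy array, as in the outputs above.
    """
    return {
        name: float(value)
        for name, value in pvdata.items()
        if value is not None and not np.isnan(value)
    }


clean = valid_pvdata(pvdata)
print(clean)
```

Converting to float is a design choice for convenience; keep the original 0-d arrays if later code expects NumPy types.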
Notes on timestamps¶
Timestamps passed to these functions must be localized, i.e. they must carry time zone information. Without it, the time to extract is ambiguous. The internal data files and directories are named and described in UTC time only.
See: https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html
In [12]:
# The timestamp must be localized, so this will fail:
try:
    bsa_snapshot('2021-12-11T00:00:00', 'cu_hxr')
except Exception as ex:
    print(ex)
Cannot convert tz-naive Timestamp, use tz_localize to localize
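The error message points to tz_localize. As a minimal sketch, a naive pandas Timestamp can be given a time zone before it is passed on. The zone 'America/Los_Angeles' is an assumption here (SLAC local time); use whichever zone the naive time was actually recorded in.

```python
import pandas as pd

# A naive timestamp: no time zone attached.
naive = pd.Timestamp('2021-12-11T00:00:00')
print(naive.tz is None)

# Attach a time zone. 'America/Los_Angeles' is an assumed zone for illustration.
local = naive.tz_localize('America/Los_Angeles')

# The same instant expressed in UTC, which is how the BSA files are named.
utc = local.tz_convert('UTC')

print(local.isoformat())
print(utc.isoformat())
```

The localized value refers to the same instant as the string '2021-12-11T00:00:00-08:00' used in the earlier cells.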
In [13]:
import datetime
# This is not localized:
datetime.datetime(2021, 12, 1, 17, 7, 49)
Out[13]:
datetime.datetime(2021, 12, 1, 17, 7, 49)
In [14]:
# but this is:
dtime = datetime.datetime(2021, 12, 1, 17, 7, 49, tzinfo=datetime.timezone.utc)
dtime
Out[14]:
datetime.datetime(2021, 12, 1, 17, 7, 49, tzinfo=datetime.timezone.utc)
In [15]:
# And will work with bsa_snapshot
bsa_snapshot(dtime, 'cu_hxr')['timestamp']
Out[15]:
Timestamp('2021-12-01 17:07:49.002202872+0000', tz='UTC')
Helper functions¶
In [16]:
from lcls_live.bsa import bsa_h5file, BSA_DATA_SEARCH_PATHS
In [17]:
# These are the paths searched.
BSA_DATA_SEARCH_PATHS
Out[17]:
['/gpfs/slac/staas/fs1/g/bsd/BSAService/data/', '/nfs/slac/g/bsd/BSAService/data/']
In [18]:
# Find the appropriate file
bsa_h5file('2021-12-11T00:00:00-08:00', 'cu_hxr')
Out[18]:
'/gpfs/slac/staas/fs1/g/bsd/BSAService/data/2021/12/11/CU_HXR_20211211_080825.h5'
In [19]:
?bsa_h5file
Signature: bsa_h5file(timestamp, beampath)
Docstring:
Finds the BSA HDF5 file that contains the timestamp for a given beampath

BSA data files are named as:
    CU_SXR_20211210_140742.h5
Which corresponds to '{beampath}_{time_str}.h5' with time_str in the format:
    '%Y%m%d_%H%M%S'

See the documentation in:
    https://www.slac.stanford.edu/grp/ad/docs/model/matlab/bsd.html
"The data files are named with the UTC datestamp of the END of their data taking period"

Parameters
----------
timestamp: pd.DateTime or datetime.datetime
    This must be localized (not naive time)

beampath : str
    one of ['cu_hxr', 'cu_sxr'] (case independent)

Returns
-------
h5file : str
    Full path to the HDF5 file that should contain the time.

Examples
--------
>>> bsa_h5file('2021-12-11T00:00:00-08:00', 'cu_hxr')
'/gpfs/slac/staas/fs1/g/bsd/BSAService/data/2021/12/11/CU_HXR_20211211_080825.h5'

File: ~/GitHub/lcls-live/lcls_live/bsa.py
Type: function
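The naming rule in the docstring can be illustrated with a short sketch. Because each file is named with the UTC time at the END of its data taking period, the file covering a timestamp should be the one with the earliest end time at or after that timestamp. This is not necessarily how bsa_h5file is implemented. The helpers end_time_from_filename and pick_file are hypothetical, and two of the three file names in the list are invented for illustration. The sketch also assumes the files cover contiguous periods.

```python
import datetime
import os


def end_time_from_filename(h5file):
    """Hypothetical helper: parse the UTC end time from '{beampath}_{time_str}.h5'."""
    stem = os.path.splitext(os.path.basename(h5file))[0]
    # The last two underscore-separated fields are the date and the time.
    date_str, time_str = stem.split('_')[-2:]
    naive = datetime.datetime.strptime(f'{date_str}_{time_str}', '%Y%m%d_%H%M%S')
    # File names are in UTC, so attach that zone explicitly.
    return naive.replace(tzinfo=datetime.timezone.utc)


def pick_file(h5files, timestamp):
    """Hypothetical helper: file with the earliest end time at or after timestamp.

    Returns None if no file ends at or after the timestamp.
    """
    candidates = [f for f in h5files if end_time_from_filename(f) >= timestamp]
    return min(candidates, key=end_time_from_filename, default=None)


# Only the middle name appears in the outputs above; the others are made up.
files = [
    'CU_HXR_20211211_070824.h5',
    'CU_HXR_20211211_080825.h5',
    'CU_HXR_20211211_090826.h5',
]

# 2021-12-11T00:00:00-08:00 expressed in UTC.
timestamp = datetime.datetime(2021, 12, 11, 8, 0, 0, tzinfo=datetime.timezone.utc)
chosen = pick_file(files, timestamp)
print(chosen)
```

Comparing timezone-aware datetimes is what makes the selection unambiguous, which is why the functions above reject naive timestamps.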