General example

Choose and download data

Select a combination of variables and fetch data from one day.

Resample to one measurement per 10s; filter for the Northern Hemisphere.

[1]:
from viresclient import SwarmRequest
import datetime as dt

request = SwarmRequest()

request.set_collection("SW_OPER_MAGA_LR_1B")

request.set_products(measurements=["F","B_NEC"],
                     models=["MCO_SHA_2C", "MMA_SHA_2C-Primary", "MMA_SHA_2C-Secondary"],
                     auxiliaries=["QDLat", "QDLon"],
                     residuals=False,
                     sampling_step="PT10S")

request.set_range_filter(parameter="Latitude",
                         minimum=0,
                         maximum=90)

data = request.get_between(start_time=dt.datetime(2016,1,1),
                           end_time=dt.datetime(2016,1,2))
[1/1] Processing:  100%|██████████|  [ Elapsed: 00:01, Remaining: 00:00 ]
      Downloading: 100%|██████████|  [ Elapsed: 00:01, Remaining: 00:00 ] (0.766MB)

Save the data directly

You can save the generated data file directly and then open it with other software.

[2]:
data.to_file('testfile.cdf')
Data written to testfile.cdf

Convert to a pandas DataFrame

pandas offers many useful features for working with 2D labelled time series data.

[4]:
df = data.as_dataframe()
df.head()
[4]:
Spacecraft Latitude Longitude Radius F F_MCO_SHA_2C F_MMA_SHA_2C-Primary F_MMA_SHA_2C-Secondary B_NEC B_NEC_MCO_SHA_2C B_NEC_MMA_SHA_2C-Primary B_NEC_MMA_SHA_2C-Secondary QDLat QDLon
Timestamp
2016-01-01 00:28:00 A 0.197610 -102.681841 6828390.60 24630.6312 24795.742996 127.134287 33.517576 [23432.3208, 2928.6274000000003, 7000.6948] [23604.765693069177, 2930.2688321265728, 7004.... [-125.78890509740265, -5.302191425378982, 17.6... [-32.5031562033331, -1.386212193226803, -8.065... 8.425159 -30.812750
2016-01-01 00:28:10 A 0.838158 -102.693946 6828288.87 24808.1073 24971.853392 127.197315 33.718966 [23451.1801, 2899.534, 7554.4035] [23623.327008268625, 2901.392633804825, 7557.3... [-125.60166979171441, -5.252627485291866, 19.3... [-32.494418615582255, -1.3566476193699097, -8.... 9.068963 -30.902256
2016-01-01 00:28:20 A 1.478724 -102.706041 6828186.32 24993.3709 25155.643730 127.257710 33.932493 [23465.812100000003, 2871.8025000000002, 8109.... [23637.6657342565, 2873.1207904780367, 8112.48... [-125.38835978948227, -5.203014637815361, 21.1... [-32.47777846302375, -1.3273711242288924, -9.7... 9.713511 -30.990784
2016-01-01 00:28:30 A 2.119308 -102.718118 6828082.97 25186.2207 25346.943757 127.315462 34.157878 [23476.3596, 2844.1066, 8666.6296] [23647.710917671076, 2845.447078036548, 8669.2... [-125.14900320901313, -5.153357531132542, 22.8... [-32.45318157347463, -1.2983807863969918, -10.... 10.358768 -31.078363
2016-01-01 00:28:40 A 2.759911 -102.730169 6827978.83 25386.4755 25545.573072 127.370566 34.394821 [23482.5073, 2817.0387, 9225.1501] [23653.3949273992, 2818.3663458182045, 9227.67... [-124.88363654458195, -5.103660714780139, 24.5... [-32.42057710492788, -1.2696744882122784, -11.... 11.004701 -31.165012
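
As a rough illustration of the kind of time series features meant here, the sketch below builds a small synthetic DataFrame with the same 10 s cadence and a `Timestamp` index (the name `demo` and all values are made up, not downloaded data) and applies label-based time slicing, boolean filtering, and resampling.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data: 30 samples at a 10 s cadence.
# All values are invented for illustration only.
index = pd.date_range(
    "2016-01-01 00:28:00", periods=30, freq=pd.Timedelta(seconds=10), name="Timestamp"
)
demo = pd.DataFrame(
    {"Latitude": np.linspace(0.2, 18.8, 30), "F": np.linspace(24600.0, 30400.0, 30)},
    index=index,
)

# Label-based slicing: partial date strings select whole minutes, end inclusive.
two_minutes = demo.loc["2016-01-01 00:28":"2016-01-01 00:29"]

# Boolean masks filter rows on column values.
low_latitude = demo[demo["Latitude"] < 10]

# Resampling aggregates onto a coarser time grid (here: 1-minute means of F).
minute_means = demo["F"].resample("1min").mean()

print(len(two_minutes), len(low_latitude), len(minute_means))
```

The same calls should apply unchanged to the scalar columns of `df` above, since it is also indexed by `Timestamp`.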

Extract numpy arrays

Individual entries (rows) of multi-dimensional variables (e.g. B_NEC) are currently stored as lists, so they need to be extracted as arrays before doing array operations. This behaviour will likely change in the future to provide “flat” dataframes instead (i.e. the B_NEC column expanded into three columns: B_N, B_E, B_C, etc.).

[5]:
import numpy as np

B_NEC = np.stack(df['B_NEC'].values)
X,Y,Z = (B_NEC[:,i] for i in (0,1,2))
X,Y,Z
[5]:
(array([23432.3208, 23451.1801, 23465.8121, ..., 19454.4269, 19271.3803,
        19084.199 ]),
 array([2928.6274, 2899.534 , 2871.8025, ...,  824.8021,  808.9134,
         792.9848]),
 array([ 7000.6948,  7554.4035,  8109.7535, ..., 32624.9088, 33094.2485,
        33559.0953]))
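
Until such “flat” dataframes are available, a similar layout can be produced by hand. The following is a minimal sketch on a three-row stand-in frame (`demo`, using the first three B_NEC rows shown above); the column names B_N, B_E, B_C follow the naming suggested in the text and are otherwise an assumption.

```python
import numpy as np
import pandas as pd

# Stand-in frame mimicking the layout above: each B_NEC cell holds a 3-element list.
demo = pd.DataFrame(
    {
        "B_NEC": [
            [23432.3208, 2928.6274, 7000.6948],
            [23451.1801, 2899.534, 7554.4035],
            [23465.8121, 2871.8025, 8109.7535],
        ]
    },
    index=pd.to_datetime(
        ["2016-01-01 00:28:00", "2016-01-01 00:28:10", "2016-01-01 00:28:20"]
    ).rename("Timestamp"),
)

# Stack the per-row lists into an (n, 3) array, then spread it over three columns.
components = pd.DataFrame(
    np.stack(demo["B_NEC"].values),
    index=demo.index,
    columns=["B_N", "B_E", "B_C"],  # assumed names for the flattened columns
)

# Replace the list-valued column with the three scalar columns.
flat = demo.drop(columns="B_NEC").join(components)
print(flat)
```

With scalar columns, ordinary vectorised operations such as `flat["B_N"] - flat["B_N"].mean()` work without any further unpacking.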

pandas DataFrames are poorly suited to higher-dimensional data, which is the motivation for using xarray: http://xarray.pydata.org/en/stable/faq.html#why-is-pandas-not-enough

Convert to an xarray Dataset

xarray extends the power of pandas to N-dimensional data but is more complex to work with.

[6]:
ds = data.as_xarray()
ds
[6]:
<xarray.Dataset>
Dimensions:                     (Timestamp: 4256, dim: 3)
Coordinates:
  * Timestamp                   (Timestamp) datetime64[ns] 2016-01-01T00:28:00 ... 2016-01-01T23:59:50
Dimensions without coordinates: dim
Data variables:
    Spacecraft                  (Timestamp) <U1 'A' 'A' 'A' 'A' ... 'A' 'A' 'A'
    Latitude                    (Timestamp) float64 0.1976 0.8382 ... 30.46 31.1
    Longitude                   (Timestamp) float64 -102.7 -102.7 ... -95.37
    Radius                      (Timestamp) float64 6.828e+06 ... 6.823e+06
    F                           (Timestamp) float64 2.463e+04 ... 3.861e+04
    F_MCO_SHA_2C                (Timestamp) float64 2.48e+04 ... 3.861e+04
    F_MMA_SHA_2C-Primary        (Timestamp) float64 127.1 127.2 ... 43.23 43.24
    F_MMA_SHA_2C-Secondary      (Timestamp) float64 33.52 33.72 ... 4.576 4.604
    B_NEC                       (Timestamp, dim) float64 2.343e+04 ... 3.356e+04
    B_NEC_MCO_SHA_2C            (Timestamp, dim) float64 2.36e+04 ... 3.354e+04
    B_NEC_MMA_SHA_2C-Primary    (Timestamp, dim) float64 -125.8 -5.302 ... 28.31
    B_NEC_MMA_SHA_2C-Secondary  (Timestamp, dim) float64 -32.5 -1.386 ... -4.19
    QDLat                       (Timestamp) float64 8.425 9.069 ... 40.18 40.81
    QDLon                       (Timestamp) float64 -30.81 -30.9 ... -24.81
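
The length of the Timestamp dimension can be sanity-checked against the request parameters. The sketch below uses only the standard library: it counts how many 10 s steps fit into the requested day and compares that with the 4256 records reported above. The interpretation that the remainder was removed by the latitude filter is an assumption; data gaps, if any, would also reduce the count.

```python
import datetime as dt

start = dt.datetime(2016, 1, 1)
end = dt.datetime(2016, 1, 2)
step = dt.timedelta(seconds=10)   # "PT10S" is an ISO 8601 duration of 10 seconds

# Upper bound on the record count if no filter were applied
max_samples = (end - start) // step

returned = 4256                   # Timestamp dimension reported above
fraction = returned / max_samples

print(max_samples, round(fraction, 3))
```

Roughly half of the possible samples are returned, which is about what one would expect from keeping only latitudes between 0 and 90 degrees.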

Extracting numpy arrays from xarray:

[7]:
X,Y,Z = (ds["B_NEC"][:,i].values for i in (0,1,2))
X,Y,Z
[7]:
(array([23432.3208, 23451.1801, 23465.8121, ..., 19454.4269, 19271.3803,
        19084.199 ]),
 array([2928.6274, 2899.534 , 2871.8025, ...,  824.8021,  808.9134,
         792.9848]),
 array([ 7000.6948,  7554.4035,  8109.7535, ..., 32624.9088, 33094.2485,
        33559.0953]))
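
Instead of indexing by axis position, components can also be selected by dimension name. The sketch below uses a small stand-in DataArray with the same (Timestamp, dim) layout as `ds["B_NEC"]`; the labels "N", "E", "C" attached to `dim` are an illustrative choice and are not part of the dataset returned above.

```python
import numpy as np
import xarray as xr

# Stand-in DataArray with the same layout as ds["B_NEC"]: (Timestamp, dim).
times = np.array(
    ["2016-01-01T00:28:00", "2016-01-01T00:28:10", "2016-01-01T00:28:20"],
    dtype="datetime64[ns]",
)
B = xr.DataArray(
    [
        [23432.3208, 2928.6274, 7000.6948],
        [23451.1801, 2899.534, 7554.4035],
        [23465.8121, 2871.8025, 8109.7535],
    ],
    dims=("Timestamp", "dim"),
    coords={"Timestamp": times},
)

# Positional selection, but addressed by dimension name rather than axis order.
Z_by_position = B.isel(dim=2)

# Attach labels to the component dimension, then select by label.
B_labelled = B.assign_coords(dim=["N", "E", "C"])
Z_by_label = B_labelled.sel(dim="C")

print(Z_by_label.values)
```

Selecting by name keeps working if the dimension order of the array ever changes, which positional indexing such as `[:, 2]` does not.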

Work on xarray objects directly

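Calculate the custom residual B_{res} = B_{obs} - B_{MCO} - B_{MMA}, where B_{MMA} is the sum of the MMA primary and secondary fields, and plot the Z component (the third component of B_{res}) against time.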

[8]:
B_res = (ds["B_NEC"]
         - ds["B_NEC_MCO_SHA_2C"]
         - ds["B_NEC_MMA_SHA_2C-Primary"]
         - ds["B_NEC_MMA_SHA_2C-Secondary"])
B_res
[8]:
<xarray.DataArray (Timestamp: 4256, dim: 3)>
array([[-14.152832,   5.046971, -13.00904 ],
       [-14.05082 ,   4.750641, -13.44751 ],
       [-13.987496,   5.212095, -14.088606],
       ...,
       [ -0.065038,  -2.56282 ,  -3.695835],
       [  0.583051,  -3.247   ,  -4.246418],
       [  1.325767,  -4.054801,  -4.315355]])
Coordinates:
  * Timestamp  (Timestamp) datetime64[ns] 2016-01-01T00:28:00 ... 2016-01-01T23:59:50
Dimensions without coordinates: dim
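
The subtraction is element-wise, so it can be checked by hand on a single record. The sketch below rebuilds a one-row Dataset (`demo`, a stand-in name) from the N and E values of the first row of the DataFrame preview; the C component is truncated in that preview, so it is left out rather than guessed.

```python
import numpy as np
import xarray as xr

# First-row N and E components copied from the DataFrame preview above.
# The C component is truncated there, so it is omitted from this check.
dims = ("Timestamp", "dim")
demo = xr.Dataset(
    {
        "B_NEC": (dims, [[23432.3208, 2928.6274000000003]]),
        "B_NEC_MCO_SHA_2C": (dims, [[23604.765693069177, 2930.2688321265728]]),
        "B_NEC_MMA_SHA_2C-Primary": (dims, [[-125.78890509740265, -5.302191425378982]]),
        "B_NEC_MMA_SHA_2C-Secondary": (dims, [[-32.5031562033331, -1.386212193226803]]),
    },
    coords={"Timestamp": np.array(["2016-01-01T00:28:00"], dtype="datetime64[ns]")},
)

# Total MMA model field: primary plus secondary parts
B_MMA = demo["B_NEC_MMA_SHA_2C-Primary"] + demo["B_NEC_MMA_SHA_2C-Secondary"]

# Residual: observation minus MCO model minus MMA model
B_res_demo = demo["B_NEC"] - demo["B_NEC_MCO_SHA_2C"] - B_MMA

print(B_res_demo.values.round(6))
```

The two values agree with the first two entries of the first row of `B_res` shown above (-14.152832 and 5.046971).
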
[ ]:
%matplotlib inline
B_res[:,2].plot(x='Timestamp');