Writing and testing a python implementation of the updated tests described in Dall’Olmo et al. 2023
In this post I will examine the updated backscatter testing described in Dall’Olmo et al. 2023 implementation in medsrtqc, a python implementation of Argo Real Time Quality Control (RTQC). In its current state, the package contains the RTQC tests for chlorophyll (CHLA), backscatter (BBP) and pH, as well as many of standard tests listed in the official vocabulary.
We will use the same floats and cycles shown as examples in Dall’Olmo et al., discuss the expected results, and then examine the results given by the package.
For each test, code for that test is shown in python. This code is incomplete, as it is just snippets of the tests written in medsrtqc, however the code structure and variable names should be instructive should anyone want to re-purpose these functions.
A custom plotting function to show the flags of each profile is used throughout the post. The definition of that function can be found at the bottom of the post.
The floats we will be analyzing will have already been QC’ed (in fact, most have been DMQC’ed), so most data will already be flagged appropriately, or in some cases the QC flags may have been elevated by the DMQC operator. For this reason, we will fully reset all flags to 0 to simulate the actual RTQC result.
The missing data test bins the profile and checks how many of those bins are populated. Good data will have all bins populated, probably bad data will have multiple bins with no data, and bad data will have only one bin with data.
The test is written as follows in python:
# missing data test
self.log('Performing missing data test')
# find amount of data in each bin
= [0, 50, 156, 261, 367, 472, 578, 683, 789, 894, 1000]
bins = np.histogram(bbp.pres, bins=bins)
hist, bins # 1 or more bin empty, flag as probably bad (3)
= Flag.PROBABLY_BAD if sum(hist == 0) > 1 else Flag.GOOD
new_flag # all but 1 empty, flag as bad (4)
= Flag.BAD if sum(hist != 0) == 1 else new_flag
new_flag # all empty, flag as missing (9)
= Flag.MISSING if all(hist == 0) else new_flag
new_flag # update flags and log
Flag.update_safely(bbp.qc, new_flag)self.log(f'Missing data test results: flags set to {new_flag}')
The examples shown in the paper are on floats 1901339 cycle 1, 4900561 cycle 8, and 6901004 cycle 41. We expect to see the full profile flagged as 3 (probably bad), 3, and 4 (bad) respectively.
from medsrtqc.qc.util import ResetQCOperation
from medsrtqc.qc.bbp import bbpTest
from medsrtqc.nc import read_nc_profile
from medsrtqc.qc.check import preTestCheck
# example files
= ["BD1901339_001.nc", "BR7900561_008.nc", "BD6901004_041.nc"]
files # fig/axes to plot results
= plt.subplots(1, 3, sharey=True, constrained_layout=True)
fig, axes
= preTestCheck()
check = bbpTest()
bbp
# loop through each profile
for fn, ax in zip(files, axes):
= read_nc_profile("data/" + fn, mode="r+")
nc
ResetQCOperation().run(nc)= check.run(nc)
tests
print("before: ", nc["BBP700"])
nc.prepare(tests)
bbp.run(nc)print("after: ", nc["BBP700"])
= plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.015))
g nc.close()
before: Trace(
value=[0.0025575757, 0.0022508001, 0.0028643121, [25 values], 0.00084325275, 0.00070981274, 0.00081648194],
qc=[b'0', b'0', b'0', [25 values], b'0', b'0', b'0'],
adjusted=[masked, masked, masked, [25 values], masked, masked, masked],
adjusted_qc=[masked, masked, masked, [25 values], masked, masked, masked],
adjusted_error=[masked, masked, masked, [25 values], masked, masked, masked],
pres=[7.61, 11.83, 21.68, [25 values], 281.02, 291.23, 300.71],
mtime=[masked, masked, masked, [25 values], masked, masked, masked]
)
after: Trace(
value=[0.0025575757, 0.0022508001, 0.0028643121, [25 values], 0.00084325275, 0.00070981274, 0.00081648194],
qc=[b'3', b'3', b'3', [25 values], b'3', b'3', b'3'],
adjusted=[masked, masked, masked, [25 values], masked, masked, masked],
adjusted_qc=[masked, masked, masked, [25 values], masked, masked, masked],
adjusted_error=[masked, masked, masked, [25 values], masked, masked, masked],
pres=[7.61, 11.83, 21.68, [25 values], 281.02, 291.23, 300.71],
mtime=[masked, masked, masked, [25 values], masked, masked, masked]
)
before: Trace(
value=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
qc=[b'0', b'0', b'0', [137 values], b'0', b'0', b'0'],
adjusted=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
adjusted_qc=[b'4', b'4', b'4', [137 values], b'4', b'4', b'4'],
adjusted_error=[masked, masked, masked, [137 values], masked, masked, masked],
pres=[267.06625, 272.18, 277.255, [137 values], 982.7982, 987.80884, 991.61237],
mtime=[masked, masked, masked, [137 values], masked, masked, masked]
)
after: Trace(
value=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
qc=[b'3', b'3', b'3', [137 values], b'3', b'3', b'3'],
adjusted=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
adjusted_qc=[b'4', b'4', b'4', [137 values], b'4', b'4', b'4'],
adjusted_error=[masked, masked, masked, [137 values], masked, masked, masked],
pres=[267.06625, 272.18, 277.255, [137 values], 982.7982, 987.80884, 991.61237],
mtime=[masked, masked, masked, [137 values], masked, masked, masked]
)
before: Trace(
value=[0.0004893391, -0.00025918605, 0.0006870628, [1101 values], 0.00033398485, 0.001548573, 0.00043284672],
qc=[b'0', b'0', b'0', [1101 values], b'0', b'0', b'0'],
adjusted=[masked, masked, masked, [1101 values], masked, masked, masked],
adjusted_qc=[b'4', b'4', b'4', [1101 values], b'4', b'4', b'4'],
adjusted_error=[masked, masked, masked, [1101 values], masked, masked, masked],
pres=[107.5, 107.5, 107.5, [1101 values], 107.5, 107.5, 107.5],
mtime=[masked, masked, masked, [1101 values], masked, masked, masked]
)
after: Trace(
value=[0.0004893391, -0.00025918605, 0.0006870628, [1101 values], 0.00033398485, 0.001548573, 0.00043284672],
qc=[b'4', b'4', b'4', [1101 values], b'4', b'4', b'4'],
adjusted=[masked, masked, masked, [1101 values], masked, masked, masked],
adjusted_qc=[b'4', b'4', b'4', [1101 values], b'4', b'4', b'4'],
adjusted_error=[masked, masked, masked, [1101 values], masked, masked, masked],
pres=[107.5, 107.5, 107.5, [1101 values], 107.5, 107.5, 107.5],
mtime=[masked, masked, masked, [1101 values], masked, masked, masked]
)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 0 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 4
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
plt.show()
✅ BD1901339_001: Flags are set to 3 as expected.
✅ BD7900561_008: Flags are set to 3 as expected.
✅ BD6901004_041: Flags are set to 4 as expected.
The high deep value test aims to flag profiles with anomalously high BBP values at depth. This could be a symptom of biofouling, poor calibration, or sensor error. These could also be valid data, in which case the flags would be returned to a value of 1 or 2 by the D-mode operator. The test checks if median-filtered BBP is higher than a threshold value of 5e-4.
The test is written as follows in python:
# high deep value test
self.log('Performing high deep value')
# compute median, check which are above threshold value
= self.running_median(5)
median_bbp = (sum(bbp.pres > 700) > 5) & (np.nanmedian(median_bbp[bbp.pres > 700]) > 5e-4)
high_deep_value # update to 3 if there are high deep values
= Flag.PROBABLY_BAD if high_deep_value else Flag.GOOD
new_flag
Flag.update_safely(bbp.qc, new_flag)self.log(f'High deep value test results: flags set to {new_flag}')
The example shown in the paper is on float 3901531 cycle 125. We expect the flags to be set to 3.
= "BD3901531_125.nc"
fn = plt.subplots(constrained_layout=True)
fig, ax *2/5, fig.get_figheight())
fig.set_size_inches(fig.get_figwidth()
= read_nc_profile("data/" + fn, mode="r+")
nc
ResetQCOperation().run(nc)= check.run(nc)
tests
print("before: ", nc["BBP700"])
before: Trace(
value=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
qc=[b'0', b'0', b'0', [1082 values], b'0', b'0', b'0'],
adjusted=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
adjusted_qc=[b'3', b'3', b'3', [1082 values], b'3', b'4', b'4'],
adjusted_error=[masked, masked, masked, [1082 values], masked, masked, masked],
pres=[0.0, 0.0, 0.0, [1082 values], 1017.7, 1017.7, 1017.5],
mtime=[masked, masked, masked, [1082 values], masked, masked, masked]
)
nc.prepare(tests) bbp.run(nc)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 12 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
print("after: ", nc["BBP700"])
after: Trace(
value=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
qc=[b'3', b'3', b'3', [1082 values], b'3', b'4', b'4'],
adjusted=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
adjusted_qc=[b'3', b'3', b'3', [1082 values], b'3', b'4', b'4'],
adjusted_error=[masked, masked, masked, [1082 values], masked, masked, masked],
pres=[0.0, 0.0, 0.0, [1082 values], 1017.7, 1017.7, 1017.5],
mtime=[masked, masked, masked, [1082 values], masked, masked, masked]
)
= plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.015))
g
nc.close() plt.show()
✅ BD3901531_125: Flags are set to 3 as expected. Some points at the bottom also fail the Parking Hook Test.
Flag profiles that are affected by noisy data by checking the portion of residuals to the median profile above a threshold value.
The test is written as follows in python:
self.log('Performing noisy profile test')
# below surface
= bbp.pres > 100
deep_ix # compute residuals below surface
= bbp.value - median_bbp
residual = residual > 0.0005
high_residuals = high_residuals[deep_ix]
high_residuals # portion of residuals
= 100*sum(high_residuals)/len(high_residuals)
pct_residuals = pct_residuals > 10
many_high_residuals # if there are a lot of high residuals, flag as 3
= Flag.PROBABLY_BAD if many_high_residuals else Flag.GOOD
new_flag
Flag.update_safely(bbp.qc, new_flag)self.log(f'Noisy profile test results: flags set to {new_flag}')
On float 6901151 cycle 79 we expect to flag to profile as 3:
= "BD6901151_079.nc"
fn = plt.subplots(constrained_layout=True)
fig, ax *2/5, fig.get_figheight())
fig.set_size_inches(fig.get_figwidth()
= read_nc_profile("data/" + fn, mode="r+")
nc
ResetQCOperation().run(nc)= check.run(nc)
tests
print("before: ", nc["BBP700"])
before: Trace(
value=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
qc=[b'0', b'0', b'0', [328 values], b'0', b'0', b'0'],
adjusted=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
adjusted_qc=[b'3', b'3', b'3', [328 values], b'3', b'3', b'3'],
adjusted_error=[masked, masked, masked, [328 values], masked, masked, masked],
pres=[0.7, 1.1, 1.2, [328 values], 977.6, 989.2, 993.8],
mtime=[masked, masked, masked, [328 values], masked, masked, masked]
)
nc.prepare(tests) bbp.run(nc)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 1 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
print("after: ", nc["BBP700"])
after: Trace(
value=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
qc=[b'3', b'3', b'3', [328 values], b'3', b'3', b'3'],
adjusted=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
adjusted_qc=[b'3', b'3', b'3', [328 values], b'3', b'3', b'3'],
adjusted_error=[masked, masked, masked, [328 values], masked, masked, masked],
pres=[0.7, 1.1, 1.2, [328 values], 977.6, 989.2, 993.8],
mtime=[masked, masked, masked, [328 values], masked, masked, masked]
)
= plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.015))
g
nc.close() plt.show()
✅ BD6901151_079: Flags are set to 3 as expected. One point also fails the Parking Hook Test.
The objective of this test is to flag negative backscatter values. Negative values in the top 5dbar will be flagged as 4 as they most likely represent in-air backscatter samples. If there are other negative values, the profile will be flagged as 3, or if the profile consists of more than 10% negative values, the profile is flagged as 4.
The test is written as follows in python:
self.log('Performing negative bbp test')
# negative points at the very surface
= (bbp.pres < 5) & (bbp.value < 0)
shallow_and_negative if any(shallow_and_negative):
self.log(f'Negative bbp test results: shallow negative flags set to {Flag.BAD}')
# update surface flags
=shallow_and_negative)
Flag.update_safely(bbp.qc, Flag.BAD, where# negative values in rest of profile
= (bbp.pres > 5) & (bbp.value < 0)
deep_and_negative = 100*sum(deep_and_negative)/len(deep_and_negative)
pct_negative # if more than 10 pct are negative
= pct_negative > 10
many_negative # flag as 3 if any negative
= Flag.PROBABLY_BAD if any(deep_and_negative) else Flag.GOOD
new_flag # flag as 4 if > 10% negative
= Flag.BAD if many_negative else new_flag
new_flag
Flag.update_safely(bbp.qc, new_flag)self.log(f'Negative bbp test result: flags set to {new_flag}')
For this test we will check 2 floats: 6901654 cycle 56 and 6901151 cycle 7. For the first profile, there is only a surface negative value, which should be flagged as 4 but the rest of the profile should be fine. For the second profile, there are negative deep values so the whole profile should be flagged 3 or 4 depending on the portion of negative data.
# example files
= ["BD6901654_056.nc", "BD6901151_007.nc"]
files # fig/axes to plot results
= plt.subplots(1, 2, sharey=True, constrained_layout=True)
fig, axes *4/5, fig.get_figheight())
fig.set_size_inches(fig.get_figwidth()
# loop through each profile
for fn, ax in zip(files, axes):
= read_nc_profile("data/" + fn, mode="r+")
nc
ResetQCOperation().run(nc)= check.run(nc)
tests
print("before: ", nc["BBP700"])
nc.prepare(tests)
bbp.run(nc)print("after: ", nc["BBP700"])
= plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.005))
g nc.close()
before: Trace(
value=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
qc=[b'0', b'0', b'0', [1190 values], b'0', b'0', b'0'],
adjusted=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
adjusted_qc=[b'4', b'1', b'1', [1190 values], b'1', b'1', b'1'],
adjusted_error=[masked, masked, masked, [1190 values], masked, masked, masked],
pres=[0.6, 0.7, 1.1, [1190 values], 986.7, 986.7, 986.7],
mtime=[masked, masked, masked, [1190 values], masked, masked, masked]
)
after: Trace(
value=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
qc=[b'4', b'1', b'1', [1190 values], b'1', b'1', b'1'],
adjusted=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
adjusted_qc=[b'4', b'1', b'1', [1190 values], b'1', b'1', b'1'],
adjusted_error=[masked, masked, masked, [1190 values], masked, masked, masked],
pres=[0.6, 0.7, 1.1, [1190 values], 986.7, 986.7, 986.7],
mtime=[masked, masked, masked, [1190 values], masked, masked, masked]
)
before: Trace(
value=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
qc=[b'0', b'0', b'0', [354 values], b'0', b'0', b'0'],
adjusted=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
adjusted_qc=[b'4', b'4', b'4', [354 values], b'4', b'4', b'4'],
adjusted_error=[masked, masked, masked, [354 values], masked, masked, masked],
pres=[0.6, 1.1, 1.2, [354 values], 1840.2, 1903.7, 1970.2],
mtime=[masked, masked, masked, [354 values], masked, masked, masked]
)
after: Trace(
value=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
qc=[b'4', b'4', b'4', [354 values], b'4', b'4', b'4'],
adjusted=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
adjusted_qc=[b'4', b'4', b'4', [354 values], b'4', b'4', b'4'],
adjusted_error=[masked, masked, masked, [354 values], masked, masked, masked],
pres=[0.6, 1.1, 1.2, [354 values], 1840.2, 1903.7, 1970.2],
mtime=[masked, masked, masked, [354 values], masked, masked, masked]
)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test results: shallow negative flags set to 4
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 1 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
plt.show()
✅ BD6901654_056: Top negative point marked as bad, rest of profile ok. One point also fails the Parking Hook Test.
✅ BD6901151_007: Whole profile marked as bad as expected.
The parking hook test applies to profiles that have the same parking depth as their profile depth. In this configuration, there is no descent phase before the beginning of the profile, but instead the float immediately starts sampling and ascending toward the surface. Accumulated particles during the parking stage cause a distinctive hook at the bottom of the profile before they are released off the sensor back into the water. Although these data can be useful to expert users, they are not the expected data a user would want to find in an Argo profile and so the goal of this test is to flag these anomalous points.
The test is written as follows in python:
if ascending:
= bbp.pres
pres abs(pres) > 6000] = np.nan # remove fill values of 99999
pres[np.= np.sort(pres)
pres # calculate deepest difference
= pres[-1] - pres[-2]
deepest_diff # deep points closer than 20m
if deepest_diff < 20:
= 1000 # placeholder, assumed for now
parking_pres # difference between parking pressure and
= np.abs(pres[-1] - parking_pres)
parking_diff if parking_diff < 100:
self.log('Performing parking hook test')
# check for high points in deepest 50m of the profile
= (bbp.pres < (pres[-1] - 20)) & (bbp.pres > (pres[-1] - 50))
ix = np.median(bbp.value[ix]) + 0.0002
baseline = (bbp.pres > (pres[-1] - 50)) & (bbp.value > baseline)
deep_above_baseline = all_passed and not any(deep_above_baseline)
all_passed # update only the bottom flags
=deep_above_baseline)
Flag.update_safely(bbp.qc, Flag.BAD, whereself.log(f'Parking hook test results: flags set to {new_flag}')
The example profile for this test is float 6903197 cycle 26.
= "BD6903197_026.nc"
fn = plt.subplots(constrained_layout=True)
fig, ax *2/5, fig.get_figheight())
fig.set_size_inches(fig.get_figwidth()
= read_nc_profile("data/" + fn, mode="r+")
nc
ResetQCOperation().run(nc)= check.run(nc)
tests
print("before: ", nc["BBP700"])
before: Trace(
value=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
qc=[b'0', b'0', b'0', [436 values], b'0', b'0', b'0'],
adjusted=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
adjusted_qc=[b'1', b'1', b'1', [436 values], b'1', b'1', b'1'],
adjusted_error=[0.00027849164, 0.00032804598, 0.00030763133, [436 values], 0.00014784608, 0.0001886537, 0.00036820726],
pres=[0.3, 1.0, 1.2, [436 values], 932.8, 932.8, 932.8],
mtime=[masked, masked, masked, [436 values], masked, masked, masked]
)
nc.prepare(tests) bbp.run(nc)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 9 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
print("after: ", nc["BBP700"])
after: Trace(
value=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
qc=[b'1', b'1', b'1', [436 values], b'4', b'4', b'4'],
adjusted=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
adjusted_qc=[b'1', b'1', b'1', [436 values], b'1', b'1', b'1'],
adjusted_error=[0.00027849164, 0.00032804598, 0.00030763133, [436 values], 0.00014784608, 0.0001886537, 0.00036820726],
pres=[0.3, 1.0, 1.2, [436 values], 932.8, 932.8, 932.8],
mtime=[masked, masked, masked, [436 values], masked, masked, masked]
)
= plot_profile_flags(nc, ax=ax, ylim=(1070, 800), xlim=(-0.0005, 0.003))
g
nc.close() plt.show()
✅ BD6903197_027: 9 high value points at depth flagged as bad.
def plot_profile_flags(nc, ax=None, ylim=(2000, 0), xlim="auto"):
if ax is None:
= plt.subplots()
fig, ax
# put netcdf Profile into dataframe
= pd.DataFrame(dict(
df =nc["BBP700"].pres, BBP700=nc["BBP700"].value,
PRES=[s.decode() for s in nc["BBP700"].qc]
QC
))
# plot results
sns.lineplot(=df, x="BBP700", y="PRES",
data="k", ax=ax, sort=False,
color=False, estimator=None, zorder=0
legend
)= sns.scatterplot(
g =df, x="BBP700", y="PRES", hue="QC",
data=("0", "1", "2", "3", "4"), ax=ax, zorder=1
hue_order
)
ax.set_xlim(xlim)
ax.set_ylim(ylim)
return g
For attribution, please cite this work as
Gordon (2023, Dec. 5). Argo Canada Development Blog: Implementing updated backscatter RTQC. Retrieved from https://argocanada.github.io/blog/posts/2023-12-05-implementing-updated-backscatter-rtqc/
BibTeX citation
@misc{gordon2023implementing, author = {Gordon, Christopher}, title = {Argo Canada Development Blog: Implementing updated backscatter RTQC}, url = {https://argocanada.github.io/blog/posts/2023-12-05-implementing-updated-backscatter-rtqc/}, year = {2023} }