Implementing updated backscatter RTQC

Writing and testing a python implementation of the updated tests described in Dall’Olmo et al. 2023

Christopher Gordon https://github.com/cgrdn
2023-12-05

In this post I will examine the updated backscatter testing described in Dall’Olmo et al. 2023 implementation in medsrtqc, a python implementation of Argo Real Time Quality Control (RTQC). In its current state, the package contains the RTQC tests for chlorophyll (CHLA), backscatter (BBP) and pH, as well as many of standard tests listed in the official vocabulary.

Analysis Structure

We will use the same floats and cycles shown as examples in Dall’Olmo et al., discuss the expected results, and then examine the results given by the package.

For each test, code for that test is shown in python. This code is incomplete, as it is just snippets of the tests written in medsrtqc, however the code structure and variable names should be instructive should anyone want to re-purpose these functions.

A custom plotting function to show the flags of each profile is used throughout the post. The definition of that function can be found at the bottom of the post.

The floats we will be analyzing will have already been QC’ed (in fact, most have been DMQC’ed), so most data will already be flagged appropriately, or in some cases the QC flags may have been elevated by the DMQC operator. For this reason, we will fully reset all flags to 0 to simulate the actual RTQC result.

Missing Data Test

The missing data test bins the profile and checks how many of those bins are populated. Good data will have all bins populated, probably bad data will have multiple bins with no data, and bad data will have only one bin with data.

The test is written as follows in python:

# missing data test
self.log('Performing missing data test')
# find amount of data in each bin
bins = [0, 50, 156, 261, 367, 472, 578, 683, 789, 894, 1000]
hist, bins = np.histogram(bbp.pres, bins=bins)
# 1 or more bin empty, flag as probably bad (3)
new_flag = Flag.PROBABLY_BAD if sum(hist == 0) > 1 else Flag.GOOD
# all but 1 empty, flag as bad (4)
new_flag = Flag.BAD if sum(hist != 0) == 1 else new_flag
# all empty, flag as missing (9)
new_flag = Flag.MISSING if all(hist == 0) else new_flag
# update flags and log
Flag.update_safely(bbp.qc, new_flag)
self.log(f'Missing data test results: flags set to {new_flag}')

The examples shown in the paper are on floats 1901339 cycle 1, 4900561 cycle 8, and 6901004 cycle 41. We expect to see the full profile flagged as 3 (probably bad), 3, and 4 (bad) respectively.

from medsrtqc.qc.util import ResetQCOperation
from medsrtqc.qc.bbp import bbpTest
from medsrtqc.nc import read_nc_profile
from medsrtqc.qc.check import preTestCheck

# example files
files = ["BD1901339_001.nc", "BR7900561_008.nc", "BD6901004_041.nc"]
# fig/axes to plot results
fig, axes = plt.subplots(1, 3, sharey=True, constrained_layout=True)

check = preTestCheck()
bbp = bbpTest()

# loop through each profile
for fn, ax in zip(files, axes):
    nc = read_nc_profile("data/" + fn, mode="r+")
    ResetQCOperation().run(nc)
    tests = check.run(nc)
    
    print("before: ", nc["BBP700"])
    nc.prepare(tests)
    
    bbp.run(nc)
    print("after: ", nc["BBP700"])
    
    g = plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.015))
    nc.close()
before:  Trace(
    value=[0.0025575757, 0.0022508001, 0.0028643121, [25 values], 0.00084325275, 0.00070981274, 0.00081648194],
    qc=[b'0', b'0', b'0', [25 values], b'0', b'0', b'0'],
    adjusted=[masked, masked, masked, [25 values], masked, masked, masked],
    adjusted_qc=[masked, masked, masked, [25 values], masked, masked, masked],
    adjusted_error=[masked, masked, masked, [25 values], masked, masked, masked],
    pres=[7.61, 11.83, 21.68, [25 values], 281.02, 291.23, 300.71],
    mtime=[masked, masked, masked, [25 values], masked, masked, masked]
)
after:  Trace(
    value=[0.0025575757, 0.0022508001, 0.0028643121, [25 values], 0.00084325275, 0.00070981274, 0.00081648194],
    qc=[b'3', b'3', b'3', [25 values], b'3', b'3', b'3'],
    adjusted=[masked, masked, masked, [25 values], masked, masked, masked],
    adjusted_qc=[masked, masked, masked, [25 values], masked, masked, masked],
    adjusted_error=[masked, masked, masked, [25 values], masked, masked, masked],
    pres=[7.61, 11.83, 21.68, [25 values], 281.02, 291.23, 300.71],
    mtime=[masked, masked, masked, [25 values], masked, masked, masked]
)
before:  Trace(
    value=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
    qc=[b'0', b'0', b'0', [137 values], b'0', b'0', b'0'],
    adjusted=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
    adjusted_qc=[b'4', b'4', b'4', [137 values], b'4', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [137 values], masked, masked, masked],
    pres=[267.06625, 272.18, 277.255, [137 values], 982.7982, 987.80884, 991.61237],
    mtime=[masked, masked, masked, [137 values], masked, masked, masked]
)
after:  Trace(
    value=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
    qc=[b'3', b'3', b'3', [137 values], b'3', b'3', b'3'],
    adjusted=[0.0004414144, 0.00048095727, 0.00046771928, [137 values], 0.00033030732, 0.00036985305, 0.0003302439],
    adjusted_qc=[b'4', b'4', b'4', [137 values], b'4', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [137 values], masked, masked, masked],
    pres=[267.06625, 272.18, 277.255, [137 values], 982.7982, 987.80884, 991.61237],
    mtime=[masked, masked, masked, [137 values], masked, masked, masked]
)
before:  Trace(
    value=[0.0004893391, -0.00025918605, 0.0006870628, [1101 values], 0.00033398485, 0.001548573, 0.00043284672],
    qc=[b'0', b'0', b'0', [1101 values], b'0', b'0', b'0'],
    adjusted=[masked, masked, masked, [1101 values], masked, masked, masked],
    adjusted_qc=[b'4', b'4', b'4', [1101 values], b'4', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [1101 values], masked, masked, masked],
    pres=[107.5, 107.5, 107.5, [1101 values], 107.5, 107.5, 107.5],
    mtime=[masked, masked, masked, [1101 values], masked, masked, masked]
)
after:  Trace(
    value=[0.0004893391, -0.00025918605, 0.0006870628, [1101 values], 0.00033398485, 0.001548573, 0.00043284672],
    qc=[b'4', b'4', b'4', [1101 values], b'4', b'4', b'4'],
    adjusted=[masked, masked, masked, [1101 values], masked, masked, masked],
    adjusted_qc=[b'4', b'4', b'4', [1101 values], b'4', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [1101 values], masked, masked, masked],
    pres=[107.5, 107.5, 107.5, [1101 values], 107.5, 107.5, 107.5],
    mtime=[masked, masked, masked, [1101 values], masked, masked, masked]
)

[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 0 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 4
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
plt.show()

Summary

✅ BD1901339_001: Flags are set to 3 as expected.

✅ BD7900561_008: Flags are set to 3 as expected.

✅ BD6901004_041: Flags are set to 4 as expected.

High Deep Value Test

The high deep value test aims to flag profiles with anomalously high BBP values at depth. This could be a symptom of biofouling, poor calibration, or sensor error. These could also be valid data, in which case the flags would be returned to a value of 1 or 2 by the D-mode operator. The test checks if median-filtered BBP is higher than a threshold value of 5e-4.

The test is written as follows in python:

# high deep value test
self.log('Performing high deep value')
# compute median, check which are above threshold value
median_bbp = self.running_median(5)
high_deep_value = (sum(bbp.pres > 700) > 5) & (np.nanmedian(median_bbp[bbp.pres > 700]) > 5e-4)
# update to 3 if there are high deep values
new_flag = Flag.PROBABLY_BAD if high_deep_value else Flag.GOOD
Flag.update_safely(bbp.qc, new_flag)
self.log(f'High deep value test results: flags set to {new_flag}')

The example shown in the paper is on float 3901531 cycle 125. We expect the flags to be set to 3.

fn = "BD3901531_125.nc"
fig, ax = plt.subplots(constrained_layout=True)
fig.set_size_inches(fig.get_figwidth()*2/5, fig.get_figheight())

nc = read_nc_profile("data/" + fn, mode="r+")
ResetQCOperation().run(nc)
tests = check.run(nc)

print("before: ", nc["BBP700"])
before:  Trace(
    value=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
    qc=[b'0', b'0', b'0', [1082 values], b'0', b'0', b'0'],
    adjusted=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
    adjusted_qc=[b'3', b'3', b'3', [1082 values], b'3', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [1082 values], masked, masked, masked],
    pres=[0.0, 0.0, 0.0, [1082 values], 1017.7, 1017.7, 1017.5],
    mtime=[masked, masked, masked, [1082 values], masked, masked, masked]
)
nc.prepare(tests)
bbp.run(nc)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 12 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
print("after: ", nc["BBP700"])
after:  Trace(
    value=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
    qc=[b'3', b'3', b'3', [1082 values], b'3', b'4', b'4'],
    adjusted=[0.009043367, 0.027663978, 0.021865454, [1082 values], 0.0023098632, 0.0025140366, 0.002350698],
    adjusted_qc=[b'3', b'3', b'3', [1082 values], b'3', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [1082 values], masked, masked, masked],
    pres=[0.0, 0.0, 0.0, [1082 values], 1017.7, 1017.7, 1017.5],
    mtime=[masked, masked, masked, [1082 values], masked, masked, masked]
)
g = plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.015))
nc.close()
plt.show()

Summary

✅ BD3901531_125: Flags are set to 3 as expected. Some points at the bottom also fail the Parking Hook Test.

Noisy Profile Test

Flag profiles that are affected by noisy data by checking the portion of residuals to the median profile above a threshold value.

The test is written as follows in python:

self.log('Performing noisy profile test')
# below surface
deep_ix = bbp.pres > 100
# compute residuals below surface
residual = bbp.value - median_bbp
high_residuals = residual > 0.0005
high_residuals = high_residuals[deep_ix]
# portion of residuals
pct_residuals = 100*sum(high_residuals)/len(high_residuals)
many_high_residuals = pct_residuals > 10
# if there are a lot of high residuals, flag as 3
new_flag = Flag.PROBABLY_BAD if many_high_residuals else Flag.GOOD
Flag.update_safely(bbp.qc, new_flag)
self.log(f'Noisy profile test results: flags set to {new_flag}')

On float 6901151 cycle 79 we expect to flag to profile as 3:

fn = "BD6901151_079.nc"
fig, ax = plt.subplots(constrained_layout=True)
fig.set_size_inches(fig.get_figwidth()*2/5, fig.get_figheight())

nc = read_nc_profile("data/" + fn, mode="r+")
ResetQCOperation().run(nc)
tests = check.run(nc)

print("before: ", nc["BBP700"])
before:  Trace(
    value=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
    qc=[b'0', b'0', b'0', [328 values], b'0', b'0', b'0'],
    adjusted=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
    adjusted_qc=[b'3', b'3', b'3', [328 values], b'3', b'3', b'3'],
    adjusted_error=[masked, masked, masked, [328 values], masked, masked, masked],
    pres=[0.7, 1.1, 1.2, [328 values], 977.6, 989.2, 993.8],
    mtime=[masked, masked, masked, [328 values], masked, masked, masked]
)
nc.prepare(tests)
bbp.run(nc)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 3
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 1 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
print("after: ", nc["BBP700"])
after:  Trace(
    value=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
    qc=[b'3', b'3', b'3', [328 values], b'3', b'3', b'3'],
    adjusted=[0.0074264565, 0.0022428727, 0.0032238269, [328 values], 0.0036023555, 0.0008325885, 0.0008325483],
    adjusted_qc=[b'3', b'3', b'3', [328 values], b'3', b'3', b'3'],
    adjusted_error=[masked, masked, masked, [328 values], masked, masked, masked],
    pres=[0.7, 1.1, 1.2, [328 values], 977.6, 989.2, 993.8],
    mtime=[masked, masked, masked, [328 values], masked, masked, masked]
)
g = plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.015))
nc.close()
plt.show()

Summary

✅ BD6901151_079: Flags are set to 3 as expected. One point also fails the Parking Hook Test.

Negative BBP Test

The objective of this test is to flag negative backscatter values. Negative values in the top 5dbar will be flagged as 4 as they most likely represent in-air backscatter samples. If there are other negative values, the profile will be flagged as 3, or if the profile consists of more than 10% negative values, the profile is flagged as 4.

The test is written as follows in python:

self.log('Performing negative bbp test')
# negative points at the very surface
shallow_and_negative = (bbp.pres < 5) & (bbp.value < 0)
if any(shallow_and_negative):
    self.log(f'Negative bbp test results: shallow negative flags set to {Flag.BAD}')
# update surface flags
Flag.update_safely(bbp.qc, Flag.BAD, where=shallow_and_negative)
# negative values in rest of profile
deep_and_negative = (bbp.pres > 5) & (bbp.value < 0)
pct_negative = 100*sum(deep_and_negative)/len(deep_and_negative)
# if more than 10 pct are negative
many_negative = pct_negative > 10
# flag as 3 if any negative
new_flag = Flag.PROBABLY_BAD if any(deep_and_negative) else Flag.GOOD
# flag as 4 if > 10% negative
new_flag = Flag.BAD if many_negative else new_flag
Flag.update_safely(bbp.qc, new_flag)
self.log(f'Negative bbp test result: flags set to {new_flag}')

For this test we will check 2 floats: 6901654 cycle 56 and 6901151 cycle 7. For the first profile, there is only a surface negative value, which should be flagged as 4 but the rest of the profile should be fine. For the second profile, there are negative deep values so the whole profile should be flagged 3 or 4 depending on the portion of negative data.

# example files
files = ["BD6901654_056.nc", "BD6901151_007.nc"]
# fig/axes to plot results
fig, axes = plt.subplots(1, 2, sharey=True, constrained_layout=True)
fig.set_size_inches(fig.get_figwidth()*4/5, fig.get_figheight())

# loop through each profile
for fn, ax in zip(files, axes):
    nc = read_nc_profile("data/" + fn, mode="r+")
    ResetQCOperation().run(nc)
    tests = check.run(nc)
    
    print("before: ", nc["BBP700"])
    nc.prepare(tests)
    
    bbp.run(nc)
    print("after: ", nc["BBP700"])
    
    g = plot_profile_flags(nc, ax=ax, ylim=(1200, 0), xlim=(-0.002, 0.005))
    nc.close()
before:  Trace(
    value=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
    qc=[b'0', b'0', b'0', [1190 values], b'0', b'0', b'0'],
    adjusted=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
    adjusted_qc=[b'4', b'1', b'1', [1190 values], b'1', b'1', b'1'],
    adjusted_error=[masked, masked, masked, [1190 values], masked, masked, masked],
    pres=[0.6, 0.7, 1.1, [1190 values], 986.7, 986.7, 986.7],
    mtime=[masked, masked, masked, [1190 values], masked, masked, masked]
)
after:  Trace(
    value=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
    qc=[b'4', b'1', b'1', [1190 values], b'1', b'1', b'1'],
    adjusted=[-0.0004011667, 0.00046792105, 0.00046791686, [1190 values], 0.00025597008, 0.00024438222, 0.00025597008],
    adjusted_qc=[b'4', b'1', b'1', [1190 values], b'1', b'1', b'1'],
    adjusted_error=[masked, masked, masked, [1190 values], masked, masked, masked],
    pres=[0.6, 0.7, 1.1, [1190 values], 986.7, 986.7, 986.7],
    mtime=[masked, masked, masked, [1190 values], masked, masked, masked]
)
before:  Trace(
    value=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
    qc=[b'0', b'0', b'0', [354 values], b'0', b'0', b'0'],
    adjusted=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
    adjusted_qc=[b'4', b'4', b'4', [354 values], b'4', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [354 values], masked, masked, masked],
    pres=[0.6, 1.1, 1.2, [354 values], 1840.2, 1903.7, 1970.2],
    mtime=[masked, masked, masked, [354 values], masked, masked, masked]
)
after:  Trace(
    value=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
    qc=[b'4', b'4', b'4', [354 values], b'4', b'4', b'4'],
    adjusted=[0.00080094114, 0.00082017767, 0.00083941227, [354 values], -0.0003456371, -0.0002303965, -0.00021136053],
    adjusted_qc=[b'4', b'4', b'4', [354 values], b'4', b'4', b'4'],
    adjusted_error=[masked, masked, masked, [354 values], masked, masked, masked],
    pres=[0.6, 1.1, 1.2, [354 values], 1840.2, 1903.7, 1970.2],
    mtime=[masked, masked, masked, [354 values], masked, masked, masked]
)

[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test results: shallow negative flags set to 4
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 1 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
plt.show()

Summary

✅ BD6901654_056: Top negative point marked as bad, rest of profile ok. One point also fails the Parking Hook Test.

✅ BD6901151_007: Whole profile marked as bad as expected.

Parking Hook Test

The parking hook test applies to profiles that have the same parking depth as their profile depth. In this configuration, there is no descent phase before the beginning of the profile, but instead the float immediately starts sampling and ascending toward the surface. Accumulated particles during the parking stage cause a distinctive hook at the bottom of the profile before they are released off the sensor back into the water. Although these data can be useful to expert users, they are not the expected data a user would want to find in an Argo profile and so the goal of this test is to flag these anomalous points.

The test is written as follows in python:

if ascending:
    pres = bbp.pres
    pres[np.abs(pres) > 6000] = np.nan # remove fill values of 99999
    pres = np.sort(pres)
    # calculate deepest difference
    deepest_diff = pres[-1] - pres[-2]
    # deep points closer than 20m
    if deepest_diff < 20:
        parking_pres = 1000 # placeholder, assumed for now
        # difference between parking pressure and 
        parking_diff = np.abs(pres[-1] - parking_pres)
        if parking_diff < 100:
            self.log('Performing parking hook test')
            # check for high points in deepest 50m of the profile
            ix = (bbp.pres < (pres[-1] - 20)) & (bbp.pres > (pres[-1] - 50))
            baseline = np.median(bbp.value[ix]) + 0.0002
            deep_above_baseline = (bbp.pres > (pres[-1] - 50)) & (bbp.value > baseline)
            all_passed = all_passed and not any(deep_above_baseline)
            # update only the bottom flags
            Flag.update_safely(bbp.qc, Flag.BAD, where=deep_above_baseline)
            self.log(f'Parking hook test results: flags set to {new_flag}')

The example profile for this test is float 6903197 cycle 26.

fn = "BD6903197_026.nc"
fig, ax = plt.subplots(constrained_layout=True)
fig.set_size_inches(fig.get_figwidth()*2/5, fig.get_figheight())

nc = read_nc_profile("data/" + fn, mode="r+")
ResetQCOperation().run(nc)
tests = check.run(nc)

print("before: ", nc["BBP700"])
before:  Trace(
    value=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
    qc=[b'0', b'0', b'0', [436 values], b'0', b'0', b'0'],
    adjusted=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
    adjusted_qc=[b'1', b'1', b'1', [436 values], b'1', b'1', b'1'],
    adjusted_error=[0.00027849164, 0.00032804598, 0.00030763133, [436 values], 0.00014784608, 0.0001886537, 0.00036820726],
    pres=[0.3, 1.0, 1.2, [436 values], 932.8, 932.8, 932.8],
    mtime=[masked, masked, masked, [436 values], masked, masked, masked]
)
nc.prepare(tests)
bbp.run(nc)
[medsrtqc.nc.NetCDFProfile log] Setting previously unset flags for BBP to GOOD
[medsrtqc.nc.NetCDFProfile log] Performing missing data test
[medsrtqc.nc.NetCDFProfile log] Missing data test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing high deep value
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] High deep value test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing noisy profile test
[medsrtqc.nc.NetCDFProfile log] Noisy profile test results: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing negative bbp test
[medsrtqc.nc.NetCDFProfile log] Negative bbp test result: flags set to 1
[medsrtqc.nc.NetCDFProfile log] Performing parking hook test
[medsrtqc.nc.NetCDFProfile log] Parking hook test results: 9 points set to 4
[medsrtqc.nc.NetCDFProfile log] Applying global range test to BBP
[medsrtqc.nc.NetCDFProfile log] Performing negative spike test
[medsrtqc.nc.NetCDFProfile log] Calculating running median over window size 5
[medsrtqc.nc.NetCDFProfile log] Performing stuck value test on bbp
print("after: ", nc["BBP700"])
after:  Trace(
    value=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
    qc=[b'1', b'1', b'1', [436 values], b'4', b'4', b'4'],
    adjusted=[0.0013924582, 0.00164023, 0.0015381566, [436 values], 0.00073923037, 0.00094326853, 0.0018410363],
    adjusted_qc=[b'1', b'1', b'1', [436 values], b'1', b'1', b'1'],
    adjusted_error=[0.00027849164, 0.00032804598, 0.00030763133, [436 values], 0.00014784608, 0.0001886537, 0.00036820726],
    pres=[0.3, 1.0, 1.2, [436 values], 932.8, 932.8, 932.8],
    mtime=[masked, masked, masked, [436 values], masked, masked, masked]
)
g = plot_profile_flags(nc, ax=ax, ylim=(1070, 800), xlim=(-0.0005, 0.003))
nc.close()
plt.show()

Summary

✅ BD6903197_027: 9 high value points at depth flagged as bad.

Appendix

Plotting function

def plot_profile_flags(nc, ax=None, ylim=(2000, 0), xlim="auto"):
    if ax is None:
        fig, ax = plt.subplots()

    # put netcdf Profile into dataframe
    df = pd.DataFrame(dict(
      PRES=nc["BBP700"].pres, BBP700=nc["BBP700"].value, 
      QC=[s.decode() for s in nc["BBP700"].qc]
    ))

    # plot results
    sns.lineplot(
      data=df, x="BBP700", y="PRES", 
      color="k", ax=ax, sort=False, 
      legend=False, estimator=None, zorder=0
    )
    g = sns.scatterplot(
      data=df, x="BBP700", y="PRES", hue="QC", 
      hue_order=("0", "1", "2", "3", "4"), ax=ax, zorder=1
    )
    
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

    return g

Citation

For attribution, please cite this work as

Gordon (2023, Dec. 5). Argo Canada Development Blog: Implementing updated backscatter RTQC. Retrieved from https://argocanada.github.io/blog/posts/2023-12-05-implementing-updated-backscatter-rtqc/

BibTeX citation

@misc{gordon2023implementing,
  author = {Gordon, Christopher},
  title = {Argo Canada Development Blog: Implementing updated backscatter RTQC},
  url = {https://argocanada.github.io/blog/posts/2023-12-05-implementing-updated-backscatter-rtqc/},
  year = {2023}
}