| Anomaly | Location of a system response deemed to warrant further investigation by the demonstrator for consideration as an emplaced munitions item. |
| Detection | An anomaly location that is within Rhalo of an emplaced munitions item. |
| Military Munitions (MM) | Specific categories of military munitions that may pose unique explosive safety risks, including UXO as defined in 10 USC 101(e)(5), DMM as defined in 10 USC 2710(e)(2), and/or munitions constituents (e.g., TNT, RDX) as defined in 10 USC 2710(e)(3) that are present in high enough concentrations to pose an explosive hazard. |
| Emplaced Munitions | A munitions item buried by the government at a specified location in the test site. |
| Emplaced Clutter | A clutter item (i.e., non-munitions item) buried by the government at a specified location in the test site. |
| Rhalo | A pre-determined radius about an emplaced item (clutter or munitions) within which an anomaly identified by the demonstrator as being of interest is considered a detection of that item. For the purpose of this program, a circular halo 0.5 meters in radius is placed around the center of the object for all clutter and munitions items less than 0.6 meters in length. When munitions items are longer than 0.6 meters, the halo becomes an ellipse whose minor axis is 1 meter and whose major axis is equal to the length of the munitions item plus 1 meter. |
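The Rhalo rule can be illustrated with a minimal sketch. The function name `in_halo`, the offset convention (anomaly position minus item center, in meters), and the `heading_rad` parameter for the orientation of the item's long axis are assumptions made for illustration; the definition does not say how orientation is recorded, nor how an item exactly 0.6 meters long is treated (the sketch uses the circular halo for that case).

```python
import math

def in_halo(dx, dy, item_length_m, heading_rad=0.0):
    """Return True if an anomaly offset (dx, dy) from the item center,
    in meters, falls inside the item's halo.

    heading_rad is the assumed direction of the item's long axis,
    measured from the x axis (hypothetical convention).
    """
    if item_length_m <= 0.6:
        # Circular halo, 0.5 m radius (assumed to include exactly 0.6 m).
        return math.hypot(dx, dy) <= 0.5

    # Elliptical halo: minor axis 1 m, major axis = item length + 1 m.
    semi_major = (item_length_m + 1.0) / 2.0
    semi_minor = 0.5

    # Rotate the offset into the item's own frame.
    c, s = math.cos(heading_rad), math.sin(heading_rad)
    along = dx * c + dy * s
    across = -dx * s + dy * c
    return (along / semi_major) ** 2 + (across / semi_minor) ** 2 <= 1.0
```

For a 1-meter item lying along the x axis, an anomaly 0.9 meters away along the axis is inside the halo, while one 0.9 meters away across the axis is not.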
| Small Munitions | Caliber of munitions less than or equal to 40mm (includes 20mm projectile, 40mm projectile, submunitions BLU-26, BLU-63, and M42). |
| Medium Munitions | Caliber of munitions greater than 40mm and less than or equal to 81mm (includes 57mm projectile, 60mm mortar, 2.75-inch rocket, and 81mm mortar). |
| Large Munitions | Caliber of munitions greater than 81mm (includes 105mm HEAT, 105mm projectile, and 155mm projectile). |
| Shallow | Items buried less than 0.3 meters below ground surface. |
| Medium | Items buried greater than or equal to 0.3 meters and less than 1 meter below ground surface. |
| Deep | Items buried greater than or equal to 1 meter below ground surface. |
| Response Stage Noise Level | The level that represents the signal level below which anomalies are not considered detectable. Demonstrators are required to provide the recommended noise level for the Blind Grid Test Area. |
| Discrimination Stage Threshold | The demonstrator-selected threshold level that is expected to provide optimum performance of the system by retaining all detectable munitions and rejecting the maximum amount of clutter. This level defines the subset of anomalies the demonstrator would recommend digging based on discrimination. |
| Binomially Distributed Random Variable | Consider a trial that has only two possible outcomes, say success and failure, repeated for n independent trials, with the probability p of success and the probability 1-p of failure being the same for each trial. The number of successes x observed in the n trials is a binomially distributed random variable, and the proportion x/n is an estimate of p. |
| Response and Discrimination Stage Data |
| The scoring of the demonstrator's performance is conducted in two stages. These two stages are termed the RESPONSE STAGE and DISCRIMINATION STAGE. For both stages, the probability of detection (Pd) and the false alarms are reported as receiver operating characteristic (ROC) curves. False alarms are divided into those anomalies that correspond to emplaced clutter items, measuring the probability of clutter detection (Pcd) or probability of false positive (Pfp), and those that do not correspond to any known item, termed background alarms.
The RESPONSE STAGE is a measure of whether the sensor can detect an object of interest. For a single-channel instrument, this value should be closely related to the amplitude of the signal. The demonstrator must report the response level (threshold) below which target responses are deemed insufficient to warrant further investigation. At this stage, minimal processing can be performed. This includes filtering long- and short-scale variations, bias removal, and scaling. This processing should be detailed in the data submission.
For a multi-channel instrument, the demonstrator must construct a quantity analogous to amplitude. The demonstrator should consider what combination of channels provides the best test for detecting any object that the sensor can detect. The average amplitude across a set of channels is an example of an acceptable Response Stage quantity. Other methods may be more appropriate for a given sensor. Again, minimal processing can be performed, and the demonstrator should explain how this quantity was constructed in the data submission.
The DISCRIMINATION STAGE evaluates the demonstrator's ability to correctly identify munitions as such and to reject clutter. For the same locations as in the RESPONSE STAGE anomaly list, the DISCRIMINATION STAGE list contains the output of the algorithms applied in the discrimination-stage processing. This list is prioritized based on the demonstrator's determination that an anomaly location is likely to contain munitions. Thus, higher output values indicate higher confidence that a munitions item is present at the specified location. For electronic signal processing, priority ranking is based on algorithm output. For other systems, priority ranking is based on human judgment. The demonstrator also selects the threshold that the demonstrator believes will provide "optimum" system performance (i.e., that retains all the detected munitions and rejects the maximum amount of clutter).
Note: The two lists provided by the demonstrator contain identical numbers of potential target locations. They differ only in the priority ranking of the declarations. |
| Group Scoring Factors | Based on the configuration of the GT at the standardized sites and the defined scoring methodology, there exist munitions groups defined as having overlapping halos. In these cases, the following scoring logic is implemented (see figs. A-1 through A-9):
|
| Response Stage Definitions | |
| Response Stage Probability of Detection (Pdres) | Pdres = (No. of response-stage detections)/(No. of emplaced munitions in the test site). |
| Response Stage Clutter Detection (cdres) | An anomaly location that is within Rhalo of an emplaced clutter item. |
| Response Stage Probability of Clutter Detection (Pcdres) | Pcdres = (No. of response-stage clutter detections)/(No. of emplaced clutter items). |
| Response Stage Background Alarm (bares) | Blind grid: an anomaly in a cell that contains neither emplaced munitions nor an emplaced clutter item. Open field or scenarios: an anomaly location that is outside Rhalo of any emplaced munitions or emplaced clutter item. |
| Response Stage Probability of Background Alarm (Pbares) | Blind Grid only: Pbares = (No. of response-stage background alarms)/(No. of empty grid locations). |
| Response Stage Background Alarm Rate (BARres) | Open Field and any Challenge Areas (including the direct and indirect firing sub areas) only: BARres = (No. of response-stage background alarms)/(arbitrary constant). |
| Note that the quantities Pdres, Pcdres, Pbares, and BARres are functions of tres, the threshold applied to the response-stage signal strength. These quantities can therefore be written as Pdres(tres), Pcdres(tres), Pbares(tres), and BARres(tres). | |
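As a rough illustration of how the response-stage quantities defined above depend on the threshold, the sketch below computes Pdres, Pcdres, and Pbares for a blind grid at a given tres. The input format (a list of `(signal, label)` pairs already scored against ground truth, with labels `"munition"`, `"clutter"`, and `"background"`), the function name, and the sample numbers are all hypothetical. It also assumes that each emplaced item appears at most once in the list and that an anomaly is retained when its signal is greater than or equal to the threshold.

```python
def response_stage_metrics(anomalies, t_res, n_munitions, n_clutter, n_empty_cells):
    """Blind-grid response-stage metrics at threshold t_res.

    anomalies: list of (signal, label) pairs; hypothetical format.
    Returns (Pd, Pcd, Pba) as fractions of the emplaced/empty counts.
    """
    # Keep only anomalies whose signal reaches the threshold (assumed >=).
    kept = [label for signal, label in anomalies if signal >= t_res]
    pd = kept.count("munition") / n_munitions
    pcd = kept.count("clutter") / n_clutter
    pba = kept.count("background") / n_empty_cells
    return pd, pcd, pba


# Made-up example data: 4 emplaced munitions, 4 emplaced clutter, 5 empty cells.
sample = [
    (9.0, "munition"), (7.5, "munition"), (6.0, "clutter"),
    (3.0, "background"), (2.0, "munition"), (1.0, "clutter"),
]
print(response_stage_metrics(sample, 2.5, 4, 4, 5))
```

Raising the threshold can only remove anomalies from the retained set, so each of the three quantities is non-increasing in tres.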
| Discrimination Stage Definitions | |
| Discrimination | The application of a signal processing algorithm or human judgment to sensor data that discriminates munitions from clutter. Discrimination should identify anomalies that the demonstrator has high confidence correspond to munitions, as well as those that the demonstrator has high confidence correspond to non-munitions or background returns. The former should be ranked with highest priority and the latter with lowest. |
| Discrimination Stage Probability of Detection (Pddisc) | Pddisc = (No. of discrimination-stage detections)/(No. of emplaced munitions in the test site). |
| Discrimination Stage False Positive (fpdisc) | An anomaly location that is within Rhalo of an emplaced clutter item. |
| Discrimination Stage Probability of False Positive (Pfpdisc) | Pfpdisc = (No. of discrimination-stage false positives)/(No. of emplaced clutter items). |
| Discrimination Stage Background Alarm (badisc) | Blind grid: an anomaly in a cell that contains neither emplaced munitions nor an emplaced clutter item. Open field or scenarios: an anomaly location that is outside Rhalo of any emplaced munitions or emplaced clutter item. |
| Discrimination Stage Probability of Background Alarm (Pbadisc) | Pbadisc = (No. of discrimination-stage background alarms)/(No. of empty grid locations). |
| Discrimination Stage Background Alarm Rate (BARdisc) | BARdisc = (No. of discrimination-stage background alarms)/(arbitrary constant). |
| Note that the quantities Pddisc, Pfpdisc, Pbadisc, and BARdisc are functions of tdisc, the threshold applied to the discrimination-stage signal strength. These quantities can therefore be written as Pddisc(tdisc), Pfpdisc(tdisc), Pbadisc(tdisc), and BARdisc(tdisc). | |
| Receiver-Operating Characteristic (ROC) Curves |
| ROC curves at both the response and discrimination stages can be constructed based on the above definitions. The ROC curves plot the relationship between Pd vs. Pcd or Pfp and Pd vs. BAR or Pba as the threshold applied to the signal strength is varied from its minimum (tmin) to its maximum (tmax) value. Figure 1 shows how Pd vs. Pcd and Pd vs. BAR are combined into ROC curves. Note that the "res" and "disc" superscripts have been suppressed from all the variables for clarity.
|
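A minimal sketch of the threshold sweep behind a Pd vs. Pcd curve follows. The `(signal, label)` input format, the function name `roc_points`, and the choice to sweep over the distinct observed signal values (from tmin to tmax) are assumptions for illustration, not the program's scoring software.

```python
def roc_points(anomalies, n_munitions, n_clutter):
    """Return a list of (threshold, Pcd, Pd) tuples, one per distinct
    signal value, ordered from the lowest threshold to the highest.

    anomalies: list of (signal, label) pairs with labels "munition",
    "clutter", or "background" (hypothetical format).
    """
    thresholds = sorted({signal for signal, _ in anomalies})
    points = []
    for t in thresholds:
        # Anomalies at or above the threshold are retained (assumed >=).
        kept = [label for signal, label in anomalies if signal >= t]
        pd = kept.count("munition") / n_munitions
        pcd = kept.count("clutter") / n_clutter
        points.append((t, pcd, pd))
    return points


# Made-up example: 4 emplaced munitions and 4 emplaced clutter items.
sample_roc = [
    (9.0, "munition"), (7.5, "munition"), (6.0, "clutter"),
    (3.0, "background"), (2.0, "munition"), (1.0, "clutter"),
]
for t, pcd, pd in roc_points(sample_roc, 4, 4):
    print(f"t={t:4.1f}  Pcd={pcd:.2f}  Pd={pd:.2f}")
```

The same sweep with background alarms in place of clutter detections gives the Pd vs. Pba (or BAR) curve.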
| Metrics to Characterize the Discrimination Stage |
| The demonstrator is also scored on efficiency and rejection ratio, which measure the effectiveness of the discrimination stage processing. The goal of discrimination is to retain the greatest number of munitions detections from the anomaly list, while rejecting the maximum number of anomalies arising from non-munitions items. The efficiency measures the fraction of detected munitions retained by the discrimination, while the rejection ratio measures the fraction of false alarms rejected. Both measures are defined relative to the entire response list, i.e., the maximum munitions detectable by the sensor and its accompanying clutter detection/false positive rate or background alarm rate. |
| Efficiency (E) | E = Pddisc(tdisc)/Pdres(tminres); Measures (at a threshold of interest) the degree to which the maximum theoretical detection performance of the sensor system (as determined by the response stage tmin) is preserved after application of discrimination techniques. Efficiency is a number between 0 and 1. An efficiency of 1 implies that all of the munitions initially detected in the response stage were retained at the specified threshold in the discrimination stage, tdisc. |
| False Positive Rejection Rate (Rfp) | Rfp = 1 - [Pfpdisc(tdisc)/Pcdres(tminres)]; Measures (at a threshold of interest) the degree to which the sensor system's false positive performance is improved over the maximum false positive performance (as determined by the response stage tmin). The rejection rate is a number between 0 and 1. A rejection rate of 1 implies that all emplaced clutter initially detected in the response stage were correctly rejected at the specified threshold in the discrimination stage. |
| Background Alarm Rejection Rate (Rba) | Blind Grid: Rba = 1 - [Pbadisc(tdisc)/Pbares(tminres)]. Open Field: Rba = 1 - [BARdisc(tdisc)/BARres(tminres)]. |
| Measures the degree to which the discrimination stage correctly rejects background alarms initially detected in the response stage. The rejection rate is a number between 0 and 1. A rejection rate of 1 implies that all background alarms initially detected in the response stage were rejected at the specified threshold in the discrimination stage. | |
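The efficiency and rejection-rate formulas above reduce to simple ratios, sketched here with made-up input values. The function names are hypothetical, and raising an error when the response-stage denominator is zero is an assumption; the definitions do not say how that case is scored.

```python
def efficiency(pd_disc, pd_res_min):
    """E = Pddisc(tdisc) / Pdres(tminres)."""
    if pd_res_min == 0:
        # Assumed handling: nothing was detected in the response stage.
        raise ValueError("response-stage Pd at tmin is zero")
    return pd_disc / pd_res_min


def rejection_rate(disc_value, res_min_value):
    """R = 1 - disc/res, used for both Rfp and Rba.

    For Rfp pass (Pfpdisc, Pcdres); for Rba pass (Pbadisc, Pbares)
    on the blind grid or (BARdisc, BARres) in the open field.
    """
    if res_min_value == 0:
        raise ValueError("response-stage value at tmin is zero")
    return 1.0 - disc_value / res_min_value


# Made-up values: 80% detected at the response stage, 72% retained after
# discrimination; clutter detections drop from 40% to 10%.
print(efficiency(0.72, 0.80))       # about 0.9
print(rejection_rate(0.10, 0.40))   # about 0.75
```

Because the arbitrary constant in BAR appears in both numerator and denominator, it cancels in the open-field Rba.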
| Chi-square Comparison |
The Chi-square test for differences in probabilities (or 2 x 2 contingency table) is used to analyze two samples drawn from two different populations to see if both populations have the same or different proportions of elements in a certain category. More specifically, two random samples are drawn, one from each population, to test the null hypothesis that the probability of event A (some specified event) is the same for both populations (ref 3). The test statistic of the 2 x 2 contingency table follows the Chi-square distribution with one degree of freedom.
When an association between a more challenging terrain feature and relatively degraded performance is sought, a one-sided test is performed. A two-sided 2 x 2 contingency table is used in the Standardized UXO Technology Demonstration Site Program to compare performance between any two areas or sub-areas when the direction of degradation cannot be predetermined.
For a one-sided test, a significance level of 0.05 is used to set the critical decision limit. It is a critical decision limit because if the test statistic calculated from the data exceeds this value, the lower proportion tested will be considered significantly less than the greater one (degraded). If the test statistic calculated from the data is less than this value, then no degradation can be said to exist due to the terrain feature introduced.
For a two-sided test, a significance level of 0.10 is used to allow 0.05 on either side of the decision. It is a critical decision limit because if the test statistic calculated from the data exceeds this value, the two proportions tested will be considered significantly different. If the test statistic calculated from the data is less than this value, the two proportions tested will be considered not significantly different.
An exception must be applied when either a 0 or 100 percent success rate occurs in the sample data. The Chi-square test cannot be used in these instances. Instead, Fisher's test is used and the critical decision limit for one-sided tests is the chosen significance level, which in this case is 0.05. With Fisher's test, if the test statistic is less than the critical value, the proportions are considered to be significantly different.
An example follows that illustrates Standardized UXO Technology Demonstration Site blind grid results compared to those from the open field legacy. It should be noted that a significant result does not prove a cause and effect relationship exists between the two populations of interest; however, it does serve as a tool to indicate that one data set has experienced a degradation or change in system performance at a level larger than can be accounted for merely by chance or random variation. Note also that a result that is not significant indicates that there is not enough evidence to declare that anything more than chance or random variation within the same population is at work between the two data sets being compared.
Pdres: BLIND GRID versus OPEN FIELD (legacy). Using the example data above to compare probabilities of detection in the response stage, all 100 munitions out of 100 emplaced munitions items were detected in the blind grid while 8 munitions out of 10 emplaced were detected in the open field. Fisher's test must be used since a 100 percent success rate occurs in the data. Fisher's test uses the four input values to calculate a test statistic of 0.0075 that is compared against the critical value of 0.05. Since the test statistic is less than the critical value, the smaller response stage detection rate (0.80) is considered to be significantly less at the 0.05 level of significance. While a significant result does not prove a cause and effect relationship exists between the change in survey area and degradation in performance, it does indicate that the detection ability of demonstrator X's system seems to have been degraded in the open field relative to results from the blind grid using the same system. This is an example of a one-sided test in which Fisher's test is used in place of the Chi-square test.
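The example above can be reproduced with a short sketch. The function names are hypothetical. `fisher_one_sided` sums hypergeometric probabilities with all table margins fixed, which is the standard one-sided form of Fisher's exact test; `chi_square_2x2` is the usual 2 x 2 statistic without a continuity correction, which is an assumption since the text does not say whether one is applied. The input of 90/100 versus 70/100 for the Chi-square call is made up for illustration.

```python
from math import comb

def fisher_one_sided(succ_a, fail_a, succ_b, fail_b):
    """One-sided Fisher's exact test for a 2 x 2 table.

    Returns the probability, with all margins fixed, that group A shows
    at least succ_a successes (i.e., a result at least as favorable to A
    as the one observed).
    """
    n_a = succ_a + fail_a
    total = n_a + succ_b + fail_b
    total_succ = succ_a + succ_b
    total_fail = total - total_succ
    denom = comb(total, n_a)
    p = 0.0
    for x in range(succ_a, min(n_a, total_succ) + 1):
        # Hypergeometric probability of x successes landing in group A.
        p += comb(total_succ, x) * comb(total_fail, n_a - x) / denom
    return p


def chi_square_2x2(succ_a, fail_a, succ_b, fail_b):
    """Chi-square statistic (1 degree of freedom), no continuity correction."""
    n = succ_a + fail_a + succ_b + fail_b
    den = ((succ_a + fail_a) * (succ_b + fail_b)
           * (succ_a + succ_b) * (fail_a + fail_b))
    if den == 0:
        raise ValueError("a row or column total is zero")
    return n * (succ_a * fail_b - fail_a * succ_b) ** 2 / den


# Blind grid 100 of 100 detected vs. open field 8 of 10 detected.
p_value = fisher_one_sided(100, 0, 8, 2)
print(round(p_value, 4))                    # 0.0075, below the 0.05 limit

# Made-up data with no 0 or 100 percent rate: 90/100 vs. 70/100.
print(chi_square_2x2(90, 10, 70, 30))       # 12.5
```

For the Chi-square case, the statistic would be compared against the tabulated critical value for one degree of freedom; the upper 10 percent point, 2.706, corresponds to 0.05 in each tail.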