Analysis, Chisquare Calculation
The Statistical Approach
The primary focus for GCP analyses is anomalous shifts of the mean during periods of time specified in formal predictions. The standard test of such departures from expectation compares the Chisquare of the composite deviation across all eggs during a specified event against theoretical expectation. This composite Chisquare is based on the Stouffer Z, which is a normalized sum of the Z-scores for all predefined segments (see below). The segments may be defined as the whole period of the prediction, or they may be broken into sub-segments (e.g., seconds or 15-minute blocks).
Preparing the Data
A prediction specifies a moment or a period of time during which a deviation is expected in the data, corresponding to a global event. This provides most of the information needed for analysis, and leads to the algorithm for processing the data and calculating the statistics that may provide evidence for the hypothesis. The following is a description of the stable, standard procedures as of early 2000. The exact algorithmic procedures for the analysis must be specified as part of the prediction, before the data are examined. This is done most often by indicating that the standard analysis will be used. This and other defined analyses that we have used over the course of the experiment are detailed in recipes that, if followed, will duplicate the original GCP analysis. (In some cases, extra data will have been accumulated from dial and drop eggs.)
The earliest analyses used a summation of Z² across eggs, a different algorithm that is now used only when explicitly pre-specified (or in contextual explorations). This page continued to describe that superseded procedure, which simply measures the variance among the eggs, until October 2001, when the outdated description was noted. The standard or default analysis for the record is based on the composite, signed meanshift across eggs (the Stouffer Z), which properly represents an underlying hypothesis that the behavior of the eggs will tend to be correlated if there is a global consciousness effect. The Stouffer Z is defined as Zs = Sum(Zi)/Sqrt(N), where N is the number of Z-scores in the set. In words, the Stouffer Z is the algebraic sum of the individual Z-scores in a set, divided by the square root of their number. York Dobyns has provided a rigorous description of the relationship between the Stouffer Z-based measure and the variance measure.
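As a minimal sketch of the two measures described above, the following Python fragment computes the Stouffer Z and the older sum of squared Z-scores for the same set of values. The function names and the example numbers are illustrative assumptions, not part of the GCP software.

```python
import math

def stouffer_z(z_scores):
    """Algebraic sum of the Z-scores divided by the square root of their number."""
    n = len(z_scores)
    if n == 0:
        raise ValueError("need at least one Z-score")
    return sum(z_scores) / math.sqrt(n)

def sum_z_squared(z_scores):
    """Earlier variance-type measure: sum of squared Z-scores (sign is discarded)."""
    return sum(z * z for z in z_scores)

# Hypothetical Z-scores for four eggs in one second.
correlated = [1.0, 1.0, 1.0, 1.0]      # all eggs shift the same way
uncorrelated = [1.0, -1.0, 1.0, -1.0]  # same magnitudes, opposite signs

print(stouffer_z(correlated), sum_z_squared(correlated))      # 2.0 4.0
print(stouffer_z(uncorrelated), sum_z_squared(uncorrelated))  # 0.0 4.0
```

The two example sets give the same sum of squares but different Stouffer Z values, which is the sense in which the signed composite responds to correlated behavior among the eggs while the variance measure does not.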
We begin with the assumption that the eggs are synchronized (even though this isn’t 100% true). We calculate the mean, variance, and Z across eggs for each second, properly treating missing values. This yields a single time-series representing the composite egg behavior, which can then be used in various analytical explorations like those done for the Y2K event. For short periods, we do not need to block the data, but in some cases, given a pre-specified reason for doing so, such as creating a manageable dataset for 6 days' worth of seconds, we block the data by a standard unit, typically minutes. For some analyses, like the inter-egg correlations, it is always necessary to choose some blocking period (Doug Mast also uses 1 minute, so that the correlations are calculated for 60 pairs of egg-trials).
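The per-second composite might be sketched as follows. The data layout (one row per second, one column per egg, None for a missing trial) and both function names are assumptions made for illustration. The blocking rule shown, a Z-score of each egg's mean over the block, is one plausible reading of the procedure rather than a confirmed recipe.

```python
import math

TRIAL_MEAN = 100.0          # expected sum of 200 fair bits
TRIAL_SD = math.sqrt(50.0)  # sqrt(200 * 0.5 * 0.5), about 7.071

def composite_z(row):
    """Stouffer Z across eggs for one second; None marks a missing trial."""
    zs = [(x - TRIAL_MEAN) / TRIAL_SD for x in row if x is not None]
    if not zs:
        return None  # no egg reported during this second
    return sum(zs) / math.sqrt(len(zs))

def block_z(rows, egg):
    """Z-score of one egg's mean over a block of seconds (assumed blocking rule)."""
    xs = [row[egg] for row in rows if row[egg] is not None]
    if not xs:
        return None
    mean = sum(xs) / len(xs)
    # The standard error of the mean shrinks with the number of trials present.
    return (mean - TRIAL_MEAN) / (TRIAL_SD / math.sqrt(len(xs)))

# Three hypothetical seconds of data from three eggs.
seconds = [
    [103, 98, None],
    [100, 107, 95],
    [None, None, None],
]
series = [composite_z(row) for row in seconds]
print(series)
print(block_z(seconds, 0))
```

Missing values are simply left out of both the sum and the count, so a second with fewer reporting eggs is still normalized correctly.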
We still have much to learn, and in particular, lots to learn about the dependence of results on these seemingly arbitrary factors: the order in which composites are constructed, the size of blocks, etc. This means we still need to balance the desirable features of specificity and flexibility. There is nothing new here, but it is especially notable because there are so many questions and so much apparently relevant data.
The Chisquare Test
The actual calculation for statistical tests involves a sequence of steps.
- The REG or RNG units produce random bits at high speed, for a computer serial port
- Each Egg-site records data as trials at one per second, summing 200 bits for one trial
- The 200-bit trial sums have expected mean = 100 and standard deviation = 7.071
- The deviation of a trial, or of the mean of a specified set of trials, is normalized as a Z-score
- A composite (Stouffer) Z-score across Eggs is computed for each second or block of time
- This is squared, yielding a Chisquare-distributed quantity with one degree of freedom
- Since Chisquares are additive, we may sum across seconds or across blocks of time
- The total Chisquare represents the deviation for the predicted period of time
- This is compared with the appropriate Chisquare distribution to yield a chance probability
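The last four steps of the list can be sketched in Python using only the standard library. The function names are hypothetical, and the input is assumed to be a series of composite (Stouffer) Z-scores, one per second or block, with None where no data were available. The tail probability is computed with the standard recurrence for the upper incomplete gamma function, which is exact for integer degrees of freedom.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of a Chisquare with integer dof >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if dof % 2:  # odd dof: start from the 1-degree-of-freedom tail
        q, a = math.erfc(math.sqrt(z)), 0.5
    else:        # even dof: start from the 2-degree-of-freedom tail
        q, a = math.exp(-z), 1.0
    while 2 * a < dof:
        # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), computed in logs
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def event_chisquare(composite_zs):
    """Square each composite Z, sum the squares, and compare with chance."""
    zs = [z for z in composite_zs if z is not None]
    if not zs:
        raise ValueError("no usable seconds or blocks in the event period")
    chisq = sum(z * z for z in zs)  # each z*z has one degree of freedom
    dof = len(zs)                   # additive, so dof is the number of terms
    return chisq, dof, chi2_sf(chisq, dof)

# Hypothetical composite Z-scores for a four-second event.
chisq, dof, p = event_chisquare([1.0, -2.0, None, 0.5, 1.5])
print(chisq, dof, p)
```

For real event periods with thousands of seconds, a statistics library would normally supply the tail probability. The hand-written version is used here only to keep the sketch self-contained.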
Control Data
Control data are needed to establish the viability of the statistical results from active data generated during events specified via the prediction protocol. The control data are expected to produce chance results because by hypothesis no engaging event is specified. The complex nature of the data in the Global Consciousness Project and the situation-dependent nature of the predictions require specially designed procedures for ensuring that the statistical characterizations of the data are valid. The main components of statistical control are quality-controlled equipment design, thorough device calibration, and a procedure called resampling. In addition, a clone database of Algorithmic Pseudo-random data is automatically generated, and these data may be assessed as control data. The combined force of these efforts ensures that the GCP data meet appropriate standards, and that the active subsets subjected to hypothesis-testing are evaluated against chance expectation as well as a large body of surrounding control and calibration data. See also Appendix, Nelson et al., FieldREG II.
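The text does not spell out the resampling procedure, so the following is only a generic illustration of the idea, under the assumption that resampling means drawing windows of the same length as the event from surrounding non-event data and recomputing the same statistic. The function name, its parameters, and the simulated control series are all hypothetical.

```python
import random

def resampled_p(control_zs, event_chisq, length, n_draws=1000, seed=0):
    """Fraction of random control windows whose Chisquare reaches the event value."""
    last_start = len(control_zs) - length
    if last_start < 0:
        raise ValueError("control series is shorter than the event period")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(n_draws):
        start = rng.randint(0, last_start)  # inclusive on both ends
        window = control_zs[start:start + length]
        if sum(z * z for z in window) >= event_chisq:
            hits += 1
    return hits / n_draws

# Simulated stand-in for composite Z-scores from non-event periods.
sim = random.Random(1)
control = [sim.gauss(0.0, 1.0) for _ in range(5000)]
print(resampled_p(control, event_chisq=75.0, length=60))
```

If the data behave as expected under chance, the empirical fraction should be close to the theoretical Chisquare tail probability for the same number of degrees of freedom.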