The `stats` command calculates basic summary statistics for a data set, displays them in human-readable form and (optionally) makes them available as gnuplot variables.

Syntax:
      stats {<ranges>} {"<datafile>" {datafile-modifiers}}
            {[no]output} {variables[=prefix]}

Permissible data file modifiers are `index`, `every`, and `using`, all of which behave exactly as for the `plot` command. Up to two columns can be specified with `using`, and inline transformations are available (same as for `plot`).

The `stats` command only considers data points that fall into the plot range, as defined inline or using `set xrange` and `set yrange`. If a value in either column falls outside its corresponding range, the entire record is skipped and does not contribute to any of the summary statistics.

By default, the results are printed to the screen, or to the destination specified using the `set print` option. Output can be suppressed using the `nooutput` directive. (This can be useful when only assignment to variables is desired; see below.) If gnuplot detects output to a non-interactive terminal, the output is formatted as name/value pairs, which is intended to be easy for another program to parse.

The results of the calculation can be assigned to user-defined variables in the current gnuplot session using the `variables` directive. The `variables` directive can take an optional prefix after an equals sign. If such a prefix is found, it is prepended to the names of the variables in the current session. Unless the `variables` keyword is found (with or without a prefix specification), no assignment to variables is made.

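The range rule described above can be illustrated outside gnuplot. The following Python sketch is not gnuplot code: the function name `filter_records` and the sample values are hypothetical, and treating the range limits as inclusive is an assumption. It shows that a record is dropped as soon as either of its two values lies outside the corresponding range.

```python
def filter_records(records, xrange=(None, None), yrange=(None, None)):
    """Keep only the records whose x and y values both lie inside their ranges.

    A limit of None means "unbounded" on that side.
    Inclusive limits are an assumption made for this sketch.
    """
    def inside(value, bounds):
        lo, hi = bounds
        return (lo is None or value >= lo) and (hi is None or value <= hi)

    # One out-of-range value is enough to skip the whole record.
    return [(x, y) for x, y in records if inside(x, xrange) and inside(y, yrange)]


records = [(1, 10), (2, 50), (12, 20), (3, 30)]
kept = filter_records(records, xrange=(0, 10), yrange=(0, 40))
print(kept)  # [(1, 10), (3, 30)]
```

Here `(2, 50)` is skipped because its y value exceeds the y range, and `(12, 20)` is skipped because its x value exceeds the x range, even though the other value of each record is in range.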
Quantities calculated (and their variable names, without prefix):

      records        : number of valid records found
      invalid        : number of invalid records found
      blank          : number of blank lines found
      blocks         : number of data blocks in the file (separated by double blank lines)
      mean_*         : mean
      stddev_*       : standard deviation
      sumx_*         : sum of all values
      sumx2_*        : sum of the squares of all values
      min_*          : minimum value
      min_pos_*      : the position of the minimum value
      lo_quartile_*  : lower quartile
      median_*       : median
      up_quartile_*  : upper quartile
      max_*          : maximum value
      max_pos_*      : the position of the maximum value

In the variable names, the `*` is replaced by `x` (for the first or sole column) or `y` (for the second column).

For min, max, median, and quartiles, the `stats` command also reports the position in the file at which the value was found. In the corresponding variables, the `*` is replaced by `pos_x` or `pos_y`.

Note: the value reported in this way is the number of the record in the data set. This is not necessarily the same as the line number in the data file if the file contains comment lines, blank lines, or invalid or unreadable records!

Furthermore, gnuplot silently skips invalid records unless an explicit `using` directive with parenthesized columns has been issued, like this: `using ($1):($2)`. With a `using` directive such as `using 1:2`, the number of invalid records reported by the `stats` command will always be zero. (See the section on `plot using` for more details.)

If two columns have been specified with the `using` directive, then the following additional quantities are calculated:

      slope          : slope in a linear regression model
      intercept      : intercept in a linear regression model
      correlation    : linear correlation coefficient
      sumxy          : sum of the products x*y ('dot product')

All variables and their values can be seen using the `show variables all` command.

The `stats` command is not available in polar or parametric mode, or when logarithmic axes are in effect.

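To make the two-column quantities concrete, the following Python sketch derives them from plain running sums using the textbook least-squares formulas. It is not taken from the gnuplot source: the function name `summary` is hypothetical, and dividing by the number of records (the population form) for the standard deviation is an assumption. The optional `prefix` argument mimics the way `variables=prefix` prepends a string to each variable name.

```python
import math


def summary(xs, ys, prefix=""):
    """Compute two-column summary quantities from running sums (illustrative only)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # sum of products x*y

    mean_x, mean_y = sum_x / n, sum_y / n
    # Assumption: population variance (divide by n, not n - 1).
    var_x = sum_x2 / n - mean_x ** 2
    var_y = sum_y2 / n - mean_y ** 2
    cov_xy = sum_xy / n - mean_x * mean_y

    slope = cov_xy / var_x                 # least-squares slope
    intercept = mean_y - slope * mean_x    # line passes through the means
    correlation = cov_xy / math.sqrt(var_x * var_y)

    values = {
        "records": n,
        "mean_x": mean_x, "mean_y": mean_y,
        "stddev_x": math.sqrt(var_x), "stddev_y": math.sqrt(var_y),
        "sumxy": sum_xy,
        "slope": slope, "intercept": intercept, "correlation": correlation,
    }
    # Prepend the prefix to every name, as `variables=prefix` does.
    return {prefix + name: value for name, value in values.items()}


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # exactly y = 2*x + 1
result = summary(xs, ys, prefix="d_")
print(result["d_slope"], result["d_intercept"], result["d_correlation"])
```

Because the sample data lie exactly on the line y = 2*x + 1, the sketch yields a slope of 2, an intercept of 1, and a correlation coefficient of 1.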
Examples:

      stats 'data' out
      stats 'data' noout var
      stats 'data' index 0 using 1:2 out
      stats [1:10] 'data' using 1 every ::1::12
      stats [0:10] 'data1' using ($1*$1) noout var=dat1

If the results have been assigned to variables, then these variables can be used in subsequent `plot` or other commands:

      stats 'data' using 1:2 noout var
      plot 'data' using 1:( ($2-mean_y)/stddev_y ) w lp

or (showing the original data together with its linear regression):

      stats 'data' using 1:2 noout var=d_
      plot 'data' using 1:2, d_slope*x + d_intercept

See `plot` for details on the `index`, `every`, and `using` directives.
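The first plotting example standardizes a column by subtracting a mean and dividing by a standard deviation. As a rough illustration of what that transformation does to the values, here is a Python sketch; the function name `standardize` and the sample data are hypothetical, and the population form of the standard deviation is an assumption.

```python
def standardize(values, mean, stddev):
    """Shift values to zero mean and scale them to unit standard deviation."""
    return [(v - mean) / stddev for v in values]


ys = [3.0, 5.0, 7.0, 9.0]
mean_y = sum(ys) / len(ys)
# Assumption: population standard deviation (divide by the number of records).
stddev_y = (sum((y - mean_y) ** 2 for y in ys) / len(ys)) ** 0.5

z = standardize(ys, mean_y, stddev_y)
print(z)
```

After the transformation the values are centred on zero and have unit spread, which makes columns with different units comparable on one plot.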