Datasets for `Pattern Recognition and Neural Networks' by B.D. Ripley
=====================================================================

Cambridge University Press (1996)  ISBN  0-521-46086-7

The background to the datasets is described in section 1.4; this file
relates the computer-readable files to that description.


Synthetic two-class problem
---------------------------

Data from Ripley (1994a).

This has two real-valued co-ordinates (xs and ys) and a class (yc)
which is 0 or 1.

Data file  synth.tr   has 250 rows of the training set
           synth.te   has 1000 rows of the test set  (not used here)
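
As a rough illustration of the layout described above, the following Python
sketch reads rows of two real-valued co-ordinates followed by a 0/1 class and
groups the points by class.  It assumes the file is whitespace-separated with
one case per line, and that any line whose fields are not numeric (such as a
row of column names) can be skipped; neither assumption is stated in this
file.  The function names parse_synth and split_by_class are made up for the
example.

```python
from collections import defaultdict


def parse_synth(lines):
    """Parse rows of the form: x-coordinate, y-coordinate, class (0 or 1).

    Assumption: fields are separated by whitespace. Blank lines and lines
    that do not hold three numeric fields (e.g. a header) are skipped.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # blank or malformed line
        try:
            x = float(fields[0])
            y = float(fields[1])
            cls = int(fields[2])
        except ValueError:
            continue  # assumed to be a header row of column names
        rows.append((x, y, cls))
    return rows


def split_by_class(rows):
    """Group (x, y) points by their class label."""
    groups = defaultdict(list)
    for x, y, cls in rows:
        groups[cls].append((x, y))
    return dict(groups)
```

Under these assumptions, calling parse_synth on the open file synth.tr
should give 250 rows, and split_by_class gives one list of points per class.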



Diabetes in Pima Indians
------------------------

A population of women who were at least 21 years old, of Pima Indian heritage
and living near Phoenix, Arizona, was tested for diabetes
according to World Health Organization criteria.  The data
were collected by the US National Institute of Diabetes and Digestive and
Kidney Diseases (Smith et al., 1988). This example is also contained in the
UCI machine-learning database collection (Murphy & Aha, 1995).

The data files have rows containing

npreg   number of pregnancies
glu     plasma glucose concentration in an oral glucose tolerance test
bp      diastolic blood pressure (mm Hg)
skin    triceps skin fold thickness (mm)
ins     serum insulin (micro U/ml)
bmi     body mass index (weight in kg/(height in m)^2)
ped     diabetes pedigree function
age     age in years
type    diabetic according to WHO criteria (No / Yes)
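
The body mass index definition in the list above can be written out as a
short Python sketch.  The function name and the input check are illustrative
only and are not part of the datasets.

```python
def body_mass_index(weight_kg, height_m):
    """Return weight in kg divided by the square of height in m."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / (height_m ** 2)
```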

Data file pima.tr   has 200 rows of complete training data.
          pima.te   has 332 rows of complete test data.
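
A minimal Python sketch for reading these files is given below.  It assumes
that each file is whitespace-separated and that its first non-blank line
holds the column names; this file does not state either point, so check the
data files before relying on it.  The column names are taken from whatever
header is found rather than hard-coded, and the helper names parse_table and
convert_record are made up for the example.

```python
def parse_table(lines):
    """Parse a whitespace-separated table into a list of dicts.

    Assumption: the first non-blank line is a header of column names.
    """
    header = None
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields
            continue
        if len(fields) != len(header):
            raise ValueError("row has %d fields, expected %d"
                             % (len(fields), len(header)))
        records.append(dict(zip(header, fields)))
    return records


def convert_record(record):
    """Convert measurements to float and the 'type' column to a boolean."""
    converted = {}
    for name, value in record.items():
        if name == "type":
            converted[name] = (value == "Yes")  # True for Yes, False for No
        else:
            converted[name] = float(value)
    return converted
```

Under these assumptions, applying parse_table to the open file pima.tr
should give 200 records, and pima.te should give 332.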

