The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each line in the dataset constitutes the record of a single male individual.

liver

Format

A data frame with 345 observations on the following 7 variables.

  1. mcv: mean corpuscular volume

  2. alkphos: alkaline phosphatase

  3. sgpt: alanine aminotransferase

  4. sgot: aspartate aminotransferase

  5. gammagt: gamma-glutamyl transpeptidase

  6. drinks: number of half-pint equivalents of alcoholic beverages drunk per day

  7. selector: field used to split data into two sets
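As a rough illustration of the seven-column layout listed above, the following Python sketch maps comma-separated records onto the variable names. It assumes the raw file is plain comma-separated text with no header row, which this page does not state; the function name `read_records` is hypothetical, and the two sample rows are made up for illustration, not taken from the dataset.

```python
import csv
import io

# Column names in the order given in the Format section.
COLUMNS = ("mcv", "alkphos", "sgpt", "sgot", "gammagt", "drinks", "selector")


def read_records(text):
    """Parse comma-separated text into a list of dicts keyed by column name.

    Assumption: one record per line, seven comma-separated numeric fields,
    no header row.
    """
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        record = {name: float(value) for name, value in zip(COLUMNS, row)}
        # The selector is a split marker, so keep it as an integer.
        record["selector"] = int(record["selector"])
        records.append(record)
    return records


# Made-up rows for illustration only.
sample = "90,70,30,25,40,0.5,1\n88,65,20,22,15,6.0,2\n"
records = read_records(sample)
```

Keeping `selector` as an integer and the remaining fields as floats reflects the role each column plays in the description above; other representations would work equally well.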

Source

  1. Creators: BUPA Medical Research Ltd.

  2. Donor: Richard S. Forsyth, 8 Grosvenor Avenue, Mapperley Park, Nottingham NG3 5DX, 0602-621676

  3. Date: 5/15/1990

Details

BUPA Medical Research Ltd. database donated by Richard S. Forsyth.

Important note: The 7th field (selector) has been widely misinterpreted in the past as a dependent variable representing presence or absence of a liver disorder. This is incorrect [1]. The 7th field was created by BUPA researchers as a train/test selector. It is not suitable as a dependent variable for classification. The dataset does not contain any variable representing presence or absence of a liver disorder. Researchers who wish to use this dataset as a classification benchmark should follow the method used in experiments by the donor (Forsyth & Rada, 1986, Machine learning: applications in expert systems and information retrieval) and others (e.g. Turney, 1995, Cost-sensitive classification: Empirical evaluation of a hybrid genetic decision tree induction algorithm), who used the 6th field (drinks), after dichotomising, as a dependent variable for classification. Because of widespread misinterpretation in the past, researchers should take care to state their method clearly.
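The recommended usage described in the note can be sketched in Python as follows: the five blood tests serve as features, drinks is dichotomised into the class label, and selector is used only to separate training from test records. The cut-off of 3.0 drinks, the convention that selector value 1 marks the training set, and all function names are assumptions made for illustration; this page specifies none of them, and the rows are made up rather than taken from the dataset.

```python
from typing import List, NamedTuple, Tuple


class Record(NamedTuple):
    mcv: float
    alkphos: float
    sgpt: float
    sgot: float
    gammagt: float
    drinks: float
    selector: int


def dichotomise_drinks(drinks: float, threshold: float) -> int:
    """Return 1 if drinks exceeds the threshold, else 0."""
    return 1 if drinks > threshold else 0


def split_by_selector(records: List[Record], train_value: int = 1):
    """Use selector only as a train/test marker, never as a label.

    Assumption: records whose selector equals train_value form the
    training set; all others form the test set.
    """
    train = [r for r in records if r.selector == train_value]
    test = [r for r in records if r.selector != train_value]
    return train, test


def make_xy(records: List[Record], threshold: float) -> Tuple[list, list]:
    """Features are the five blood tests; the label is dichotomised drinks."""
    X = [[r.mcv, r.alkphos, r.sgpt, r.sgot, r.gammagt] for r in records]
    y = [dichotomise_drinks(r.drinks, threshold) for r in records]
    return X, y


# Made-up rows for illustration only.
toy = [
    Record(90.0, 70.0, 30.0, 25.0, 40.0, 0.5, 1),
    Record(88.0, 65.0, 20.0, 22.0, 15.0, 6.0, 2),
    Record(92.0, 80.0, 45.0, 30.0, 60.0, 4.0, 1),
]
train, test = split_by_selector(toy)
X_train, y_train = make_xy(train, threshold=3.0)  # threshold is an assumption
```

Whatever threshold is chosen should be reported alongside any results, in line with the note's request that researchers state their method clearly.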

References

[1] McDermott & Forsyth (2016), Diagnosing a disorder in a classification benchmark, Pattern Recognition Letters, Volume 73.

https://archive.ics.uci.edu/ml/machine-learning-databases/liver-disorders/

https://archive.ics.uci.edu/ml/datasets/Liver+Disorders