28. Epidemiology and Statistics


Chapter Editors:  Franco Merletti, Colin L. Soskolne and Paolo Vineis

Table of Contents

Tables and Figures

Epidemiological Method Applied to Occupational Health and Safety
Franco Merletti, Colin L. Soskolne and Paolo Vineis

Exposure Assessment
M. Gerald Ott

Summary Worklife Exposure Measures
Colin L. Soskolne

Measuring Effects of Exposures
Shelia Hoar Zahm

     Case Study: Measures
     Franco Merletti, Colin L. Soskolne and Paolo Vineis

Options in Study Design
Sven Hernberg

Validity Issues in Study Design
Annie J. Sasco

Impact of Random Measurement Error
Paolo Vineis and Colin L. Soskolne

Statistical Methods
Annibale Biggeri and Mario Braga

Causality Assessment and Ethics in Epidemiological Research
Paolo Vineis

Case Studies Illustrating Methodological Issues in the Surveillance of Occupational Diseases
Jung-Der Wang

Questionnaires in Epidemiological Research
Steven D. Stellman and Colin L. Soskolne

Asbestos Historical Perspective
Lawrence Garfinkel


Tables

1. Five selected summary measures of worklife exposure

2. Measures of disease occurrence

3. Measures of association for a cohort study

4. Measures of association for case-control studies

5. General frequency table layout for cohort data

6. Sample layout of case-control data

7. Layout case-control data - one control per case

8. Hypothetical cohort of 1950 individuals to T2

9. Indices of central tendency & dispersion

10. A binomial experiment & probabilities

11. Possible outcomes of a binomial experiment

12. Binomial distribution, 15 successes/30 trials

13. Binomial distribution, p = 0.25; 30 trials

14. Type II error & power; x = 12, n = 30, α = 0.05

15. Type II error & power; x = 12, n = 40, α = 0.05

16. 632 workers exposed to asbestos 20 years or longer

17. O/E number of deaths among 632 asbestos workers


29. Ergonomics


Chapter Editors:  Wolfgang Laurig and Joachim Vedder



Table of Contents 

Tables and Figures

Wolfgang Laurig and Joachim Vedder

Goals, Principles and Methods

The Nature and Aims of Ergonomics
William T. Singleton

Analysis of Activities, Tasks and Work Systems
Véronique De Keyser

Ergonomics and Standardization
Friedhelm Nachreiner

Pranab Kumar Nag

Physical and Physiological Aspects

Melchiorre Masali

Muscular Work
Juhani Smolander and Veikko Louhevaara

Postures at Work
Ilkka Kuorinka

Frank Darby

General Fatigue
Étienne Grandjean

Fatigue and Recovery
Rolf Helbig and Walter Rohmert

Psychological Aspects

Mental Workload
Winfried Hacker

Herbert Heuer

Mental Fatigue
Peter Richter

Organizational Aspects of Work

Work Organization
Eberhard Ulich and Gudela Grote

Sleep Deprivation
Kazutaka Kogi

Work Systems Design

Roland Kadefors

T.M. Fraser

Controls, Indicators and Panels
Karl H. E. Kroemer

Information Processing and Design
Andries F. Sanders

Designing for Everyone

Designing for Specific Groups
Joke H. Grady-van den Nieuwboer

     Case Study: The International Classification of Functional Limitation in People

Cultural Differences
Houshang Shahnavaz

Elderly Workers
Antoine Laville and Serge Volkoff

Workers with Special Needs
Joke H. Grady-van den Nieuwboer

Diversity and Importance of Ergonomics--Two Examples

System Design in Diamond Manufacturing
Issachar Gilad

Disregarding Ergonomic Design Principles: Chernobyl
Vladimir M. Munipov 


Tables

1. Basic anthropometric core list

2. Fatigue & recovery dependent on activity levels

3. Rules of combination effects of two stress factors on strain

4. Differentiating among several negative consequences of mental strain

5. Work-oriented principles for production structuring

6. Participation in organizational context

7. User participation in the technology process

8. Irregular working hours & sleep deprivation

9. Aspects of advance, anchor & retard sleeps

10. Control movements & expected effects

11. Control-effect relations of common hand controls

12. Rules for arrangement of controls

13. Guidelines for labels


32. Record Systems and Surveillance


Chapter Editor:  Steven D. Stellman



Table of Contents 

Tables and Figures

Occupational Disease Surveillance and Reporting Systems
Steven B. Markowitz

Occupational Hazard Surveillance
David H. Wegman and Steven D. Stellman

Surveillance in Developing Countries
David Koh and Kee-Seng Chia

Development and Application of an Occupational Injury and Illness Classification System
Elyce Biddle

Risk Analysis of Nonfatal Workplace Injuries and Illnesses
John W. Ruser

Case Study: Worker Protection and Statistics on Accidents and Occupational Diseases - HVBG, Germany
Martin Butz and Burkhard Hoffmann

Case Study: Wismut - A Uranium Exposure Revisited
Heinz Otten and Horst Schulz

Measurement Strategies and Techniques for Occupational Exposure Assessment in Epidemiology
Frank Bochmann and Helmut Blome

Case Study: Occupational Health Surveys in China


Tables

1. Angiosarcoma of the liver - world register

2. Occupational illness, US, 1986 versus 1992

3. US Deaths from pneumoconiosis & pleural mesothelioma

4. Sample list of notifiable occupational diseases

5. Illness & injury reporting code structure, US

6. Nonfatal occupational injuries & illnesses, US 1993

7. Risk of occupational injuries & illnesses

8. Relative risk for repetitive motion conditions

9. Workplace accidents, Germany, 1981-93

10. Grinders in metalworking accidents, Germany, 1984-93

11. Occupational disease, Germany, 1980-93

12. Infectious diseases, Germany, 1980-93

13. Radiation exposure in the Wismut mines

14. Occupational diseases in Wismut uranium mines 1952-90


33. Toxicology


Chapter Editor: Ellen K. Silbergeld

Table of Contents

Tables and Figures

Ellen K. Silbergeld, Chapter Editor

General Principles of Toxicology

Definitions and Concepts
Bo Holmberg, Johan Hogberg and Gunnar Johanson

Dušan Djuríc

Target Organ And Critical Effects
Marek Jakubowski

Effects Of Age, Sex And Other Factors
Spomenka Telišman

Genetic Determinants Of Toxic Response
Daniel W. Nebert and Ross A. McKinnon

Mechanisms of Toxicity

Introduction And Concepts
Philip G. Watanabe

Cellular Injury And Cellular Death
Benjamin F. Trump and Irene K. Berezesky

Genetic Toxicology
R. Rita Misra and Michael P. Waalkes

Joseph G. Vos and Henk van Loveren

Target Organ Toxicology
Ellen K. Silbergeld

Toxicology Test Methods

Philippe Grandjean

Genetic Toxicity Assessment
David M. DeMarini and James Huff

In Vitro Toxicity Testing
Joanne Zurlo

Structure Activity Relationships
Ellen K. Silbergeld

Regulatory Toxicology

Toxicology In Health And Safety Regulation
Ellen K. Silbergeld

Principles Of Hazard Identification - The Japanese Approach
Masayuki Ikeda

The United States Approach to Risk Assessment Of Reproductive Toxicants and Neurotoxic Agents
Ellen K. Silbergeld

Approaches To Hazard Identification - IARC
Harri Vainio and Julian Wilbourn

Appendix - Overall Evaluations of Carcinogenicity to Humans: IARC Monographs Volumes 1-69 (836)

Carcinogen Risk Assessment: Other Approaches
Cees A. van der Heijden


Tables

  1. Examples of critical organs & critical effects
  2. Basic effects of possible multiple interactions of metals
  3. Haemoglobin adducts in workers exposed to aniline & acetanilide
  4. Hereditary, cancer-prone disorders & defects in DNA repair
  5. Examples of chemicals that exhibit genotoxicity in human cells
  6. Classification of tests for immune markers
  7. Examples of biomarkers of exposure
  8. Pros & cons of methods for identifying human cancer risks
  9. Comparison of in vitro systems for hepatotoxicity studies
  10. Comparison of SAR & test data: OECD/NTP analyses
  11. Regulation of chemical substances by laws, Japan
  12. Test items under the Chemical Substance Control Law, Japan
  13. Chemical substances & the Chemical Substances Control Law
  14. Selected major neurotoxicity incidents
  15. Examples of specialized tests to measure neurotoxicity
  16. Endpoints in reproductive toxicology
  17. Comparison of low-dose extrapolation procedures
  18. Frequently cited models in carcinogen risk characterization


Tuesday, 08 March 2011 21:29

General Fatigue

This article is adapted from the 3rd edition of the Encyclopaedia of Occupational Health and Safety.

The two concepts of fatigue and rest are familiar to all from personal experience. The word “fatigue” is used to denote very different conditions, all of which cause a reduction in work capacity and resistance. This varied use of the concept has resulted in considerable confusion, and some clarification of current ideas is necessary. For a long time, physiology has distinguished between muscle fatigue and general fatigue. The former is an acute, painful phenomenon localized in the muscles; the latter is characterized by a sense of diminishing willingness to work. This article is concerned only with general fatigue, which may also be called “psychic fatigue” or “nervous fatigue”, and with the rest that it necessitates.

General fatigue may be due to quite different causes, the most important of which are shown in figure 1. The effect is as if, during the course of the day, all the various stresses experienced accumulate within the organism, gradually producing a feeling of increasing fatigue. This feeling prompts the decision to stop work; its effect is that of a physiological prelude to sleep.

Figure 1. Diagrammatic presentation of the cumulative effect of the everyday causes of fatigue


Fatigue is a salutary sensation if one can lie down and rest. However, if one disregards this feeling and forces oneself to continue working, the feeling of fatigue increases until it becomes distressing and finally overwhelming. This daily experience clearly demonstrates the biological significance of fatigue, which plays a part in sustaining life similar to that played by other sensations such as thirst, hunger and fear.

Rest is represented in figure 1 as the emptying of a barrel. The phenomenon of rest can take place normally if the organism remains undisturbed or if at least one essential part of the body is not subjected to stress. This explains the decisive part played on working days by all work breaks, from the short pause during work to the nightly sleep. The simile of the barrel illustrates how necessary it is for normal living to reach a certain equilibrium between the total load borne by the organism and the sum of the possibilities for rest.

Neurophysiological interpretation of fatigue

The progress of neurophysiology during the last few decades has greatly contributed to a better understanding of the phenomena triggered off by fatigue in the central nervous system.

The physiologist Hess was the first to observe that electrical stimulation of certain diencephalic structures, and more especially of certain structures of the medial nucleus of the thalamus, gradually produced an inhibiting effect which showed itself in a deterioration in the capacity for reaction and in a tendency to sleep. If the stimulation was continued for a certain time, general relaxation was followed by sleepiness and finally by sleep. It was later proved that, starting from these structures, an active inhibition may extend to the cerebral cortex, where all conscious phenomena are centred. This is reflected not only in behaviour, but also in the electrical activity of the cerebral cortex. Other experiments have also succeeded in initiating inhibitions from other subcortical regions.

The conclusion which can be drawn from all these studies is that there are structures located in the diencephalon and mesencephalon which represent an effective inhibiting system and which trigger off fatigue with all its accompanying phenomena.

Inhibition and activation

Numerous experiments performed on animals and humans have shown that the general disposition of both to react depends not only on this system of inhibition but also, essentially, on a system functioning in an antagonistic manner, known as the ascending reticular activating system. We know from experiments that the reticular formation contains structures that control the degree of wakefulness, and consequently the general disposition to react. Nervous links exist between these structures and the cerebral cortex, where the activating influences are exerted on consciousness. Moreover, the activating system receives stimulation from the sensory organs. Other nervous connections transmit impulses from the cerebral cortex (the area of perception and thought) to the activating system. On the basis of these neurophysiological concepts, it can be established that external stimuli, as well as influences originating in the areas of consciousness, may, in passing through the activating system, stimulate a disposition to react.

In addition, many other investigations make it possible to conclude that stimulation of the activating system frequently also spreads from the vegetative centres, and causes the organism to orient towards the expenditure of energy, towards work, struggle, flight and so on (ergotropic conversion of the internal organs). Conversely, it appears that stimulation of the inhibiting system within the sphere of the vegetative nervous system causes the organism to tend towards rest, reconstitution of its reserves of energy and phenomena of assimilation (trophotropic conversion).

By synthesis of all these neurophysiological findings, the following conception of fatigue can be established: the state and feeling of fatigue are conditioned by the functional reaction of the consciousness in the cerebral cortex, which is, in turn, governed by two mutually antagonistic systems—the inhibiting system and the activating system. Thus, the disposition of humans to work depends at each moment on the degree of activation of the two systems: if the inhibiting system is dominant, the organism will be in a state of fatigue; when the activating system is dominant, it will exhibit an increased disposition to work.

This psychophysiological conception of fatigue makes it possible to understand certain of its symptoms which are sometimes difficult to explain. Thus, for example, a feeling of fatigue may disappear suddenly when some unexpected outside event occurs or when emotional tension develops. It is clear in both these cases that the activating system has been stimulated. Conversely, if the surroundings are monotonous or work seems boring, the functioning of the activating system is diminished and the inhibiting system becomes dominant. This explains why fatigue appears in a monotonous situation without the organism being subjected to any workload.

Figure 2 depicts diagrammatically the notion of the mutually antagonistic systems of inhibition and activation.

Figure 2. Diagrammatic presentation of the control of disposition to work by means of inhibiting and activating systems


Clinical fatigue

It is a matter of common experience that pronounced fatigue occurring day after day will gradually produce a state of chronic fatigue. The feeling of fatigue is then intensified and comes on not only in the evening after work but already during the day, sometimes even before the start of work. A feeling of malaise, frequently of an emotive nature, accompanies this state. The following symptoms are often observed in persons suffering from fatigue: heightened psychic emotivity (antisocial behaviour, incompatibility), tendency to depression (unmotivated anxiety), and lack of energy with loss of initiative. These psychic effects are often accompanied by an unspecific malaise and manifest themselves by psychosomatic symptoms: headaches, vertigo, cardiac and respiratory functional disturbances, loss of appetite, digestive disorders, insomnia, etc.

In view of the tendency towards morbid symptoms that accompany chronic fatigue, it may justly be called clinical fatigue. There is a tendency towards increased absenteeism, and particularly to more absences for short periods. This would appear to be caused both by the need for rest and by increased morbidity. The state of chronic fatigue occurs particularly among persons exposed to psychic conflicts or difficulties. It is sometimes very difficult to distinguish the external and internal causes. In fact, it is almost impossible to distinguish cause and effect in clinical fatigue: a negative attitude towards work, superiors or workplace may just as well be the cause of clinical fatigue as the result.

Research has shown that the switchboard operators and supervisory personnel employed in telecommunications services exhibited a significant increase in physiological symptoms of fatigue after their work (visual reaction time, flicker fusion frequency, dexterity tests). Medical investigations revealed that in these two groups of workers there was a significant increase in neurotic conditions, irritability, difficulty in sleeping and in the chronic feeling of lassitude, by comparison with a similar group of women employed in the technical branches of the postal, telephone and telegraphic services. The accumulation of symptoms was not always due to a negative attitude on the part of the women affected towards their job or their working conditions.

Preventive measures

There is no panacea for fatigue, but much can be done to alleviate the problem by attention to general working conditions and the physical environment at the workplace. For example, much can be achieved by the correct arrangement of hours of work, provision of adequate rest periods and suitable canteens and restrooms; adequate paid holidays should also be given to workers. The ergonomic study of the workplace can also help in the reduction of fatigue by ensuring that seats, tables and workbenches are of suitable dimensions and that the workflow is correctly organized. In addition, noise control, air-conditioning, heating, ventilation and lighting may all have a beneficial effect on delaying the onset of fatigue in workers.

Monotony and tension may also be alleviated by controlled use of colour and decoration in the surroundings, intervals of music and sometimes breaks for physical exercises for sedentary workers. Training of workers, and in particular of supervisory and management staff, also plays an important part.



Sunday, 16 January 2011 18:43

Target Organ Toxicology

The study and characterization of chemicals and other agents for toxic properties is often undertaken on the basis of specific organs and organ systems. In this chapter, two targets have been selected for in-depth discussion: the immune system and the gene. These examples were chosen to represent a complex target organ system and a molecular target within cells. For more comprehensive discussion of the toxicology of target organs, the reader is referred to standard toxicology texts such as Casarett and Doull, and Hayes. The International Programme on Chemical Safety (IPCS) has also published several criteria documents on target organ toxicology, by organ system.

Target organ toxicology studies are usually undertaken on the basis of information indicating the potential for specific toxic effects of a substance, either from epidemiological data or from general acute or chronic toxicity studies, or on the basis of special concerns to protect certain organ functions, such as reproduction or foetal development. In some cases, specific target organ toxicity tests are expressly mandated by statutory authorities, such as neurotoxicity testing under the US pesticides law (see “The United States approach to risk assessment of reproductive toxicants and neurotoxic agents”), and mutagenicity testing under the Japanese Chemical Substance Control Law (see “Principles of hazard identification: The Japanese approach”).

As discussed in “Target organ and critical effects,” the identification of a critical organ is based upon the detection of the organ or organ system which first responds adversely, or which responds to the lowest doses or exposures. This information is then used to design specific toxicology investigations or more defined toxicity tests intended to elicit more sensitive indications of intoxication in the target organ. Target organ toxicology studies may also be used to determine mechanisms of action, which are of use in risk assessment (see “The United States approach to risk assessment of reproductive toxicants and neurotoxic agents”).

Methods of Target Organ Toxicity Studies

Target organs may be studied by exposure of intact organisms and detailed analysis of function and histopathology in the target organ, or by in vitro exposure of cells, tissue slices, or whole organs maintained for short- or long-term periods in culture (see “Mechanisms of toxicity: Introduction and concepts”). In some cases, tissues from human subjects may also be available for target organ toxicity studies, and these may provide opportunities to validate assumptions of cross-species extrapolation. However, it must be kept in mind that such studies do not provide information on relative toxicokinetics.

In general, target organ toxicity studies share the following common characteristics: detailed histopathological examination of the target organ, including post mortem examination, tissue weight, and examination of fixed tissues; biochemical studies of critical pathways in the target organ, such as important enzyme systems; functional studies of the ability of the organ and cellular constituents to perform expected metabolic and other functions; and analysis of biomarkers of exposure and early effects in target organ cells.

Detailed knowledge of target organ physiology, biochemistry and molecular biology may be incorporated in target organ studies. For instance, because the synthesis and secretion of small-molecular-weight proteins is an important aspect of renal function, nephrotoxicity studies often include special attention to these parameters (IPCS 1991). Because cell-to-cell communication is a fundamental process of nervous system function, target organ studies in neurotoxicity may include detailed neurochemical and biophysical measurements of neurotransmitter synthesis, uptake, storage, release and receptor binding, as well as electrophysiological measurement of changes in membrane potential associated with these events.

A high degree of emphasis is being placed upon the development of in vitro methods for target organ toxicity, to replace or reduce the use of whole animals. Substantial advances in these methods have been achieved for reproductive toxicants (Heindel and Chapin 1993).

In summary, target organ toxicity studies are generally undertaken as a higher order test for determining toxicity. The selection of specific target organs for further evaluation depends upon the results of screening level tests, such as the acute or subchronic tests used by OECD and the European Union; some target organs and organ systems may be a priori candidates for special investigation because of concerns to prevent certain types of adverse health effects.



Tuesday, 01 March 2011 02:17

Validity Issues in Study Design

The Need for Validity

Epidemiology aims at providing an understanding of the disease experience in populations. In particular, it can be used to obtain insight into the occupational causes of ill health. This knowledge comes from studies conducted on groups of people having a disease by comparing them to people without that disease. Another approach is to examine what diseases people who work in certain jobs with particular exposures acquire and to compare these disease patterns to those of people not similarly exposed. These studies provide estimates of risk of disease for specific exposures. For information from such studies to be used for establishing prevention programmes, for the recognition of occupational diseases, and for those workers affected by exposures to be appropriately compensated, these studies must be valid.

Validity can be defined as the ability of a study to reflect the true state of affairs. A valid study is therefore one which measures correctly the association (either positive, negative or absent) between an exposure and a disease. It describes the direction and magnitude of a true risk. Two types of validity are distinguished: internal and external validity. Internal validity is a study’s ability to reflect what really happened among the study subjects; external validity reflects what could occur in the population.

Validity relates to the truthfulness of a measurement. Validity must be distinguished from precision of the measurement, which is a function of the size of the study and the efficiency of the study design.
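
The distinction between validity and precision can be illustrated with a small simulation. The sketch below is not part of the original article: the true risk, the study sizes, the number of repetitions and the bias factor are all hypothetical values chosen only to make the contrast visible. It compares many small unbiased studies (valid but imprecise) with many large studies affected by a systematic error (precise but not valid).

```python
import random
import statistics

def simulate_estimates(true_risk, n_subjects, n_studies, bias_factor, rng):
    """Return one risk estimate per simulated study.

    bias_factor multiplies the true risk to mimic a systematic error
    (1.0 means no bias). Random error comes only from finite study size.
    """
    observed_risk = true_risk * bias_factor
    estimates = []
    for _ in range(n_studies):
        cases = sum(rng.random() < observed_risk for _ in range(n_subjects))
        estimates.append(cases / n_subjects)
    return estimates

rng = random.Random(0)   # fixed seed so the sketch is repeatable
TRUE_RISK = 0.10         # hypothetical true risk of disease

# Small studies without bias: right on average, but widely scattered.
small_valid = simulate_estimates(TRUE_RISK, 50, 200, 1.0, rng)
# Large studies with a 50% systematic overestimate: tightly clustered, but wrong.
large_biased = simulate_estimates(TRUE_RISK, 2000, 200, 1.5, rng)

for label, est in (("small, unbiased", small_valid), ("large, biased", large_biased)):
    print(f"{label}: mean={statistics.mean(est):.3f}, sd={statistics.stdev(est):.3f}")
```

Under these assumptions, increasing the study size narrows the scatter of the estimates but does nothing to remove the systematic error.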

Internal Validity

A study is said to be internally valid when it is free from biases and therefore truly reflects the association between exposure and disease which exists among the study participants. An observed risk of disease in association with an exposure may indeed result from a real association and therefore be valid, but it may also reflect the influence of biases. A bias will give a distorted image of reality.

Three major types of biases, also called systematic errors, are usually distinguished:

  • selection bias
  • information or observation bias
  • confounding


They will be presented briefly below, using examples from the occupational health setting.

Selection bias

Selection bias will occur when entry into the study is influenced by knowledge of the exposure status of the potential study participant. This problem is therefore encountered only when the disease has already occurred by the time the person enters the study. Typically, in the epidemiological setting, this will happen in case-control studies or in retrospective cohort studies. This means that a person will be more likely to be considered a case if it is known that he or she has been exposed. Three sets of circumstances may lead to such an event, which will also depend on the severity of the disease.

Self-selection bias

This can occur when people who know they have been exposed in the past to products known or believed to be harmful, and who are convinced their disease is the result of the exposure, consult a physician for symptoms which other people, not so exposed, might have ignored. This is particularly likely to happen for diseases which have few noticeable symptoms. An example may be early pregnancy loss or spontaneous abortion among female nurses handling drugs used for cancer treatment. These women are more aware than most of reproductive physiology and, by being concerned about their ability to have children, may be more likely to recognize or label as a spontaneous abortion what other women would only consider as a delay in the onset of menstruation. Another example from a retrospective cohort study, cited by Rothman (1986), involves a Centers for Disease Control study of leukaemia among troops who had been present during a US atomic test in Nevada. Of the troops present on the test site, 76% were traced and constituted the cohort. Of these, 82% were found by the investigators, but an additional 18% contacted the investigators themselves after hearing publicity about the study. Four cases of leukaemia were present among the 82% traced by CDC and four cases were present among the self-referred 18%. This strongly suggests that the investigators’ ability to identify exposed persons was linked to leukaemia risk.
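
The arithmetic behind the leukaemia example can be made explicit. The following sketch uses only the shares and case counts quoted above; the cohort size is not given in the text, so it is treated as an unknown that cancels out of the ratio. The function name is illustrative only.

```python
# Worked arithmetic for the Nevada atomic-test cohort example (Rothman 1986).
# Only the shares (82% and 18%) and the case counts (4 and 4) come from the
# text; the cohort size N is unknown and cancels in the ratio.

def relative_case_frequency(cases_a, share_a, cases_b, share_b):
    """Ratio of case frequency in group A to that in group B.

    Case frequency in a group is cases / (share * N), so N cancels.
    """
    return (cases_a / share_a) / (cases_b / share_b)

ratio = relative_case_frequency(cases_a=4, share_a=0.18, cases_b=4, share_b=0.82)
print(f"Self-referred vs investigator-traced case frequency: {ratio:.2f}x")
```

With equal case counts in groups of such unequal size, leukaemia was roughly four and a half times as frequent among the self-referred members of the cohort as among those traced by the investigators.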

Diagnostic bias

This will occur when the doctors are more likely to diagnose a given disease once they know to what the patient has been previously exposed. For example, when most paints were lead-based, a symptom of disease of the peripheral nerves called peripheral neuritis with paralysis was also known as painters’ “wrist drop”. Knowing the occupation of the patient made it easier to diagnose the disease even in its early stages, whereas the identification of the causal agent would be much more difficult in research participants not known to be occupationally exposed to lead.

Bias resulting from refusal to participate in a study

When people, either healthy or sick, are asked to participate in a study, several factors play a role in determining whether or not they will agree. Willingness to answer variably lengthy questionnaires, which at times inquire about sensitive issues, and even more so to give blood or other biological samples, may be determined by the degree of self-interest held by the person. Someone who is aware of past potential exposure may be ready to comply with this inquiry in the hope that it will help to find the cause of the disease, whereas someone who considers that they have not been exposed to anything dangerous, or who is not interested in knowing, may decline the invitation to participate in the study. This can lead to a selection of those people who will finally be the study participants as compared to all those who might have been.

Information bias

This is also called observation bias and concerns disease outcome in follow-up studies and exposure assessment in case-control studies.

Differential outcome assessment in prospective follow-up (cohort) studies

Two groups are defined at the start of the study: an exposed group and an unexposed group. Problems of diagnostic bias will arise if the search for cases differs between these two groups. For example, consider a cohort of people exposed to an accidental release of dioxin in a given industry. For the highly exposed group, an active follow-up system is set up with medical examinations and biological monitoring at regular intervals, whereas the rest of the working population receives only routine care. It is highly likely that more disease will be identified in the group under close surveillance, which would lead to a potential over-estimation of risk.
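
A minimal numerical sketch may help to show how unequal case-finding alone can produce an apparent excess risk. All figures below are hypothetical and are not taken from the dioxin example: the two groups are given the same true risk, and only the assumed probability that a case is detected differs between them.

```python
# Hypothetical figures, not taken from the article: both groups share the
# same true risk, and only the chance of a case being detected differs.
TRUE_RISK = 0.02            # assumed identical in both groups
GROUP_SIZE = 1000           # assumed size of each group
DETECTION_ACTIVE = 0.95     # assumed: active follow-up finds most cases
DETECTION_ROUTINE = 0.60    # assumed: routine care misses more cases

def detected_risk(true_risk, detection_probability):
    """Expected proportion of the group recorded as diseased."""
    return true_risk * detection_probability

risk_exposed = detected_risk(TRUE_RISK, DETECTION_ACTIVE)
risk_unexposed = detected_risk(TRUE_RISK, DETECTION_ROUTINE)

true_rr = TRUE_RISK / TRUE_RISK
apparent_rr = risk_exposed / risk_unexposed

print(f"Expected detected cases: exposed {risk_exposed * GROUP_SIZE:.0f}, "
      f"unexposed {risk_unexposed * GROUP_SIZE:.0f}")
print(f"True risk ratio {true_rr:.2f}, apparent risk ratio {apparent_rr:.2f}")
```

In this sketch the apparent risk ratio equals the ratio of the two detection probabilities, even though the exposure has no effect at all.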

Differential losses in retrospective cohort studies

The reverse mechanism to that described in the preceding paragraph may occur in retrospective cohort studies. In these studies, the usual way of proceeding is to start with the files of all the people who have been employed in a given industry in the past, and to assess disease or mortality subsequent to employment. Unfortunately, in almost all studies files are incomplete, and the fact that a person is missing may be related either to exposure status or to disease status or to both. For example, in a recent study conducted in the chemical industry in workers exposed to aromatic amines, eight tumours were found in a group of 777 workers who had undergone cytological screening for urinary tumours. Altogether, only 34 records were found missing, corresponding to a 4.4% loss from the exposure assessment file, but for bladder cancer cases, exposure data were missing for two cases out of eight, or 25%. This shows that the files of people who became cases were more likely to become lost than the files of other workers. This may occur because of more frequent job changes within the company (which may be linked to exposure effects), resignation, dismissal or mere chance.
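
The percentages quoted in this example can be checked directly from the counts. The sketch below reproduces them; the comparison with non-cases rests on one assumption that the text does not state, namely that the two cases with missing exposure data are counted among the 34 missing records.

```python
# Counts quoted in the text for the aromatic amines study.
TOTAL_WORKERS = 777
MISSING_RECORDS = 34
BLADDER_CANCER_CASES = 8
CASES_MISSING_EXPOSURE = 2

overall_loss = MISSING_RECORDS / TOTAL_WORKERS
case_loss = CASES_MISSING_EXPOSURE / BLADDER_CANCER_CASES

# Assumption (not stated in the text): the two cases with missing exposure
# data are counted among the 34 missing records.
non_case_loss = (MISSING_RECORDS - CASES_MISSING_EXPOSURE) / (
    TOTAL_WORKERS - BLADDER_CANCER_CASES
)

print(f"Overall loss:   {overall_loss:.1%}")
print(f"Loss in cases:  {case_loss:.1%}")
print(f"Loss in others: {non_case_loss:.1%}")
print(f"Cases lost {case_loss / non_case_loss:.1f} times as often as non-cases")
```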

Differential assessment of exposure in case-control studies

In case-control studies, the disease has already occurred at the start of the study, and information will be sought on exposures in the past. Bias may result from the attitude of either the interviewer or the study participant to the investigation. Information is usually collected by trained interviewers who may or may not be aware of the hypothesis underlying the research. For example, in a population-based case-control study of bladder cancer conducted in a highly industrialized region, study staff may well be aware of the fact that certain chemicals, such as aromatic amines, are risk factors for bladder cancer. If they also know who has developed the disease and who has not, they may be likely to conduct more in-depth interviews with the participants who have bladder cancer than with the controls. They may insist on more detailed information on past occupations, searching systematically for exposure to aromatic amines, whereas for controls they may record occupations in a more routine way. The resulting bias is known as exposure suspicion bias.

The participants themselves may also be responsible for such bias. This is called recall bias to distinguish it from interviewer bias. Both have exposure suspicion as the mechanism for the bias. Persons who are sick may suspect an occupational origin to their disease and therefore will try to remember as accurately as possible all the dangerous agents to which they may have been exposed. In the case of handling undefined products, they may be inclined to recall the names of precise chemicals, particularly if a list of suspected products is made available to them. By contrast, controls may be less likely to go through the same thought process.


Confounding exists when the association observed between exposure and disease is in part the result of a mixing of the effect of the exposure under study and another factor. Let us say, for example, that we are finding an increased risk of lung cancer among welders. We are tempted to conclude immediately that there is a causal association between exposure to welding fumes and lung cancer. However, we also know that smoking is by far the main risk factor for lung cancer. Therefore, if information is available, we begin checking the smoking status of welders and other study participants. We may find that welders are more likely to smoke than non-welders. In that situation, smoking is known to be associated with lung cancer and, at the same time, in our study smoking is also found to be associated with being a welder. In epidemiological terms, this means that smoking, linked both to lung cancer and to welding, is confounding the association between welding and lung cancer.

Interaction or effect modification

In contrast to all the issues listed above, namely selection, information and confounding, which are biases, interaction is not a bias due to problems in study design or analysis, but reflects reality and its complexity. An example of this phenomenon is the following: exposure to radon is a risk factor for lung cancer, as is smoking. In addition, smoking and radon exposure have different effects on lung cancer risk depending on whether they act together or in isolation. Most of the occupational studies on this topic have been conducted among underground miners and at times have provided conflicting results. Overall, there seem to be arguments in favour of an interaction of smoking and radon exposure in producing lung cancer. This means that lung cancer risk is increased by exposure to radon, even in non-smokers, but that the size of the risk increase from radon is much greater among smokers than among non-smokers. In epidemiological terms, we say that the effect is multiplicative. In contrast to confounding, described above, interaction needs to be carefully analysed and described in the analysis rather than simply controlled, as it reflects what is happening at the biological level and is not merely a consequence of poor study design. Its explanation leads to a more valid interpretation of the findings from a study.
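A small numerical sketch may help to separate the two scales on which a joint effect is usually judged. The relative risks used here are hypothetical, chosen only for illustration, and are not estimates from the miner studies mentioned above; the function name is likewise an assumption.

```python
def expected_joint_rr(rr_a, rr_b):
    """Expected joint relative risk for two exposures under each model.

    Returns (additive, multiplicative):
      additive       = 1 + (rr_a - 1) + (rr_b - 1)   # excess risks add up
      multiplicative = rr_a * rr_b                   # relative risks multiply
    """
    additive = 1 + (rr_a - 1) + (rr_b - 1)
    multiplicative = rr_a * rr_b
    return additive, multiplicative

# Hypothetical relative risks (illustrative only): radon alone 2.0, smoking alone 10.0.
additive, multiplicative = expected_joint_rr(2.0, 10.0)
print(f"Expected joint RR, additive model:       {additive}")
print(f"Expected joint RR, multiplicative model: {multiplicative}")
```

An observed joint relative risk close to the second figure would be consistent with the multiplicative pattern described above, in which the increase due to radon is larger in absolute terms among smokers than among non-smokers.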

External Validity

This issue can be addressed only after ensuring that internal validity is secured. If we are convinced that the results observed in the study reflect associations which are real, we can ask ourselves whether or not we can extrapolate these results to the larger population from which the study participants themselves were drawn, or even to other populations which are identical or at least very similar. The most common question is whether results obtained for men also apply to women. For years, studies and, in particular, occupational epidemiological investigations have been conducted exclusively among men. Studies among chemists carried out in the 1960s and 1970s in the United States, United Kingdom and Sweden all found increased risks of specific cancers—namely leukaemia, lymphoma and pancreatic cancer. Based on what we knew of the effects of exposure to solvents and some other chemicals, we could already have deduced at the time that laboratory work also entailed carcinogenic risk for women. This in fact was shown to be the case when the first study among women chemists was finally published in the mid-1980s, which found results similar to those among men. It is worth noting that other excess cancers found were tumours of the breast and ovary, traditionally considered as being related only to endogenous factors or reproduction, but for which newly suspected environmental factors such as pesticides may play a role. Much more work needs to be done on occupational determinants of female cancers.

Strategies for a Valid Study

A perfectly valid study can never exist, but it is incumbent upon the researcher to try to avoid, or at least to minimize, as many biases as possible. This can often best be done at the study design stage, but can also be carried out during analysis.

Study design

Selection and information bias can be avoided only through the careful design of an epidemiological study and the scrupulous implementation of all the ensuing day-to-day guidelines, including meticulous attention to quality assurance, for the conduct of the study in field conditions. Confounding may be dealt with either at the design or analysis stage.


Criteria for considering a participant as a case must be explicitly defined. One cannot, or at least should not, attempt to study ill-defined clinical conditions. A way of minimizing the impact that knowledge of the exposure may have on disease assessment is to include only severe cases which would have been diagnosed irrespective of any information on the history of the patient. In the field of cancer, studies often will be limited to cases with histological proof of the disease to avoid the inclusion of borderline lesions. This also will mean that groups under study are well defined. For example, it is well-known in cancer epidemiology that cancers of different histological types within a given organ may have dissimilar risk factors. If the number of cases is sufficient, it is better to separate adenocarcinoma of the lung from squamous cell carcinoma of the lung. Whatever the final criteria for entry into the study, they should always be clearly defined and described. For example, the exact code of the disease should be indicated using the International Classification of Diseases (ICD) and also, for cancer, the International Classification of Diseases-Oncology (ICD-O).

Efforts should be made once the criteria are specified to maximize participation in the study. The decision to refuse to participate is hardly ever made at random and therefore leads to bias. Studies should first of all be presented to the clinicians who are seeing the patients. Their approval is needed to approach patients, and therefore they will have to be convinced to support the study. One argument that is often persuasive is that the study is in the interest of the public health. However, at this stage it is better not to discuss the exact hypothesis being evaluated in order to avoid unduly influencing the clinicians involved. Physicians should not be asked to take on supplementary duties; it is easier to convince health personnel to lend their support to a study if means are provided by the study investigators to carry out any additional tasks, over and above routine care, necessitated by the study. Interviewers and data abstractors ought to be unaware of the disease status of their patients.

Similar attention should be paid to the information provided to participants. The goal of the study must be described in broad, neutral terms, but must also be convincing and persuasive. It is important that issues of confidentiality and interest for public health be fully understood while avoiding medical jargon. In most settings, use of financial or other incentives is not considered appropriate, although compensation should be provided for any expense a participant may incur. Last, but not least, the general population should be sufficiently scientifically literate to understand the importance of such research. Both the benefits and the risks of participation must be explained to each prospective participant where they need to complete questionnaires and/or to provide biological samples for storage and/or analysis. No coercion should be applied in obtaining prior and fully informed consent. Where studies are exclusively records-based, prior approval of the agencies responsible for ensuring the confidentiality of such records must be secured. In these instances, individual participant consent usually can be waived. Instead, approval of union and government officers will suffice. Epidemiological investigations are not a threat to an individual’s private life, but are a potential aid to improve the health of the population. The approval of an institutional review board (or ethics review committee) will be needed prior to the conduct of a study, and much of what is stated above will be expected by them for their review.


In prospective follow-up studies, means for assessment of the disease or mortality status must be identical for exposed and non-exposed participants. In particular, different sources should not be used, such as only checking in a central mortality register for non-exposed participants and using intensive active surveillance for exposed participants. Similarly, the cause of death must be obtained in strictly comparable ways. This means that if a system is used to gain access to official documents for the unexposed population, which is often the general population, one should never plan to get even more precise information through medical records or interviews on the participants themselves or on their families for the exposed subgroup.

In retrospective cohort studies, efforts should be made to determine how closely the population under study corresponds to the population of interest. One should guard against potential differential losses in exposed and non-exposed groups by using various sources concerning the composition of the population. For example, it may be useful to compare payroll lists with union membership lists or other professional listings. Discrepancies must be reconciled and the protocol adopted for the study must be closely followed.

In case-control studies, other options exist to avoid biases. Interviewers, study staff and study participants need not be aware of the precise hypothesis under study. If they do not know the association being tested, they are less likely to try to provide the expected answer. Keeping study personnel in the dark as to the research hypothesis is in fact often very impractical. The interviewer will almost always know the exposures of greatest potential interest as well as who is a case and who is a control. We therefore have to rely on their honesty and also on their training in basic research methodology, which should be a part of their professional background; objectivity is the hallmark at all stages in science.

It is easier not to inform the study participants of the exact object of the research. Good, basic explanations on the need to collect data in order to have a better understanding of health and disease are usually sufficient and will satisfy the needs of ethics review.


Confounding is the only bias which can be dealt with either at the study design stage or, provided adequate information is available, at the analysis stage. If, for example, age is considered to be a potential confounder of the association of interest because age is associated with the risk of disease (e.g., cancer becomes more frequent in older age) and also with exposure (conditions of exposure vary with age or with factors related to age such as qualification, job position and duration of employment), several solutions exist. The simplest is to limit the study to a specified age range, for example, to enrol only Caucasian men aged 40 to 50. This will provide elements for a simple analysis, but will also have the drawback of limiting the application of the results to a single sex/age/racial group. Another solution is matching on age. This means that for each case, a referent of the same age is needed. This is an attractive idea, but one has to keep in mind the possible difficulty of fulfilling this requirement as the number of matching factors increases. In addition, once a factor has been matched on, it becomes impossible to evaluate its role in the occurrence of disease. The last solution is to have sufficient information on potential confounders in the study database in order to check for them in the analysis. This can be done either through a simple stratified analysis, or with more sophisticated tools such as multivariate analysis. However, it should be remembered that analysis will never be able to compensate for a poorly designed or conducted study.
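As an illustration of a simple stratified analysis, the sketch below compares a crude odds ratio with a Mantel-Haenszel summary odds ratio across two age strata. The counts are entirely hypothetical and were chosen so that the exposure has no effect within either stratum; the function names are assumptions made for this example.

```python
# Hypothetical case-control counts per age stratum (illustrative only):
# (exposed cases, exposed controls, unexposed cases, unexposed controls)
strata = {
    "younger": (10, 90, 20, 180),
    "older": (80, 40, 40, 20),
}

def odds_ratio(a, b, c, d):
    """Odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across a collection of 2x2 tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Crude table: ignore age by summing each cell over the strata.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata.values())

for name, table in strata.items():
    print(f"{name:8s} stratum OR = {odds_ratio(*table):.2f}")
print(f"crude OR (age ignored)   = {crude_or:.2f}")
print(f"Mantel-Haenszel OR       = {adjusted_or:.2f}")
```

In these invented data the crude odds ratio is well above 1 although the odds ratio is 1 in each stratum, because the older stratum has both more exposure and more disease. The stratified summary removes that distortion, which is what controlling for a confounder in the analysis means.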


The potential for biases to occur in epidemiological research is long established. This was not too much of a concern when the associations being studied were strong (as is the case for smoking and lung cancer) and therefore some inaccuracy did not cause too severe a problem. However, now that the time has come to evaluate weaker risk factors, the need for better tools becomes paramount. This includes the need for excellent study designs and the possibility of combining the advantages of various traditional designs such as the case-control or cohort studies with more innovative approaches such as case-control studies nested within a cohort. Also, the use of biomarkers may provide the means of obtaining more accurate assessments of current and possibly past exposures, as well as for the early stages of disease.



Tuesday, 08 March 2011 21:40

Fatigue and Recovery

Fatigue and recovery are periodic processes in every living organism. Fatigue can be described as a state which is characterized by a feeling of tiredness combined with a reduction or undesired variation in the performance of the activity (Rohmert 1973).

Not all the functions of the human organism become tired as a result of use. Even when asleep, for example, we breathe and our heart is pumping without pause. Obviously, the basic functions of breathing and heart activity are possible throughout life without fatigue and without pauses for recovery.

On the other hand, we find after fairly prolonged heavy work that there is a reduction in capacity—which we call fatigue. This does not apply to muscular activity alone. The sensory organs or the nerve centres also become tired. It is, however, the aim of every cell to balance out the capacity lost by its activity, a process which we call recovery.

Stress, Strain, Fatigue and Recovery

The concepts of fatigue and recovery in human work are closely related to the ergonomic concepts of stress and strain (Rohmert 1984) (figure 1).

Figure 1. Stress, strain and fatigue


Stress means the sum of all parameters of work in the working system influencing people at work, which are perceived or sensed mainly over the receptor system or which put demands on the effector system. The parameters of stress result from the work task (muscular work, non-muscular work—task-oriented dimensions and factors) and from the physical, chemical and social conditions under which the work has to be done (noise, climate, illumination, vibration, shift work, etc.—situation-oriented dimensions and factors).

The intensity/difficulty, the duration and the composition (i.e., the simultaneous and successive distribution of these specific demands) of the stress factors result in combined stress, which all the exogenous effects of a working system exert on the working person. This combined stress can be actively coped with or passively put up with, specifically depending on the behaviour of the working person. The active case will involve activities directed towards the efficiency of the working system, while the passive case will induce reactions (voluntary or involuntary), which are mainly concerned with minimizing stress. The relation between the stress and activity is decisively influenced by the individual characteristics and needs of the working person. The main factors of influence are those that determine performance and are related to motivation and concentration and those related to disposition, which can be referred to as abilities and skills.

The stresses relevant to behaviour, which are manifest in certain activities, cause individually different strains. The strains can be indicated by the reaction of physiological or biochemical indicators (e.g., a rise in heart rate) or they can be perceived. Thus, the strains are susceptible to “psycho-physical scaling”, which estimates the strain as experienced by the working person. In a behavioural approach, the existence of strain can also be derived from an activity analysis. The intensity with which indicators of strain (physiological-biochemical, behaviouristic or psycho-physical) react depends on the intensity, duration, and combination of stress factors as well as on the individual characteristics, abilities, skills, and needs of the working person.

Despite constant stresses the indicators derived from the fields of activity, performance and strain may vary over time (temporal effect). Such temporal variations are to be interpreted as processes of adaptation by the organic systems. The positive effects cause a reduction of strain/improvement of activity or performance (e.g., through training). In the negative case, however, they will result in increased strain/reduced activity or performance (e.g., fatigue, monotony).

The positive effects may come into action if the available abilities and skills are improved in the working process itself, e.g., when the threshold of training stimulation is slightly exceeded. The negative effects are likely to appear if so-called endurance limits (Rohmert 1984) are exceeded in the course of the working process. This fatigue leads to a reduction of physiological and psychological functions, which can be compensated by recovery.

To restore the original performance, rest allowances or at least periods with less stress are necessary (Luczak 1993).

When the process of adaptation is carried beyond defined thresholds, the employed organic system may be damaged so as to cause a partial or total deficiency of its functions. An irreversible reduction of functions may appear when stress is far too high (acute damage) or when recovery is impossible for a longer time (chronic damage). A typical example of such damage is noise-induced hearing loss.

Models of Fatigue

Fatigue can be many-sided, depending on the form and combination of strain, and a general definition of it is not yet possible. The biological processes of fatigue are in general not directly measurable, so that the definitions are mainly oriented towards the fatigue symptoms. These fatigue symptoms can be divided, for example, into the following three categories.

    1. Physiological symptoms: fatigue is interpreted as a decrease of functions of organs or of the whole organism. It results in physiological reactions, e.g., an increase in heart rate or electrical muscle activity (Laurig 1970).
    2. Behavioural symptoms: fatigue is interpreted mainly as a decrease of performance parameters. Examples are increasing errors when solving certain tasks, or an increasing variability of performance.
    3. Psycho-physical symptoms: fatigue is interpreted as an increase of the feeling of exertion and deterioration of sensation, depending on the intensity, duration and composition of stress factors.


        In the process of fatigue all three of these symptoms may play a role, but they may appear at different points in time.

        Physiological reactions in organic systems, particularly those involved in the work, may appear first. Later on, the feelings of exertion may be affected. Changes in performance are manifested generally in a decreasing regularity of work or in an increasing number of errors, although the mean of the performance may not yet be affected. On the contrary, with appropriate motivation, the working person may even try to maintain performance through will-power. The next step may be a clear reduction of performance ending with a breakdown of performance. The physiological symptoms may lead to a breakdown of the organism, including changes in the structure of personality, and to exhaustion. The process of fatigue is explained in the theory of successive destabilization (Luczak 1983).

        The principal trend of fatigue and recovery is shown in figure 2.

        Figure 2. Principal trend of fatigue and recovery


        Prognosis of Fatigue and Recovery

        In the field of ergonomics there is a special interest in predicting fatigue as a function of the intensity, duration and composition of stress factors, and in determining the necessary recovery time. Table 1 shows the different activity levels and periods considered, together with possible causes of fatigue and the corresponding means of recovery.

        Table 1. Fatigue and recovery dependent on activity levels

        Level of activity          Period     Fatigue from                     Recovery by
        Work life                             Overexertion for
        Phases of work life                   Overexertion for
        Sequences of work shifts              Unfavourable shift               Weekend, free
        One work shift             One day    Stress above endurance limits    Free time, rest
                                              Stress above endurance limits    Rest period
        Part of a task                        Stress above endurance limits    Change of stress

        In ergonomic analysis of stress and fatigue for determining the necessary recovery time, considering the period of one working day is the most important. The methods of such analyses start with the determination of the different stress factors as a function of time (Laurig 1992) (figure 3).

        Figure 3. Stress as a function of time


        The stress factors are determined from the specific work content and from the conditions of work. Work content could be the production of force (e.g., when handling loads), the coordination of motor and sensory functions (e.g., when assembling or crane operating), the conversion of information into reaction (e.g., when controlling), the transformations from input to output information (e.g., when programming, translating) and the production of information (e.g., when designing, problem solving). The conditions of work include physical (e.g., noise, vibration, heat), chemical (chemical agents) and social (e.g., colleagues, shift work) aspects.

        In the simplest case there will be a single important stress factor while the others can be neglected. In those cases, especially when the stress factor results from muscular work, it is often possible to calculate the necessary rest allowances, because the basic concepts are known.

        For example, the sufficient rest allowance in static muscle work depends on the force and duration of muscular contraction as in an exponential function linked by multiplication according to the formula:


        R.A. = Rest allowance in percentage of t

        t = duration of contraction (working period) in minutes

        T = maximal possible duration of contraction in minutes

        f = the force needed for the static work and

        F = maximal force.
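As a hedged illustration of how such a rest allowance might be computed, the sketch below assumes the form commonly attributed to Rohmert for static muscular work, R.A. = 18 × (t/T)^1.4 × (f/F − 0.15)^0.5 × 100%. The constants 18, 1.4, 0.15 and 0.5 are not stated in the text above; they are an assumption drawn from the general ergonomics literature and should be checked against the original source before any practical use. The function name and the input checks are likewise illustrative.

```python
def rest_allowance_percent(t, T, f, F):
    """Rest allowance (percent of t) for static muscle work.

    ASSUMED form (constants not given in the text above):
        R.A. = 18 * (t/T)**1.4 * (f/F - 0.15)**0.5 * 100

    t: duration of contraction (working period), minutes
    T: maximal possible duration of contraction, minutes
    f: force needed for the static work
    F: maximal force (same unit as f)
    """
    if not (0 < t <= T):
        raise ValueError("t must satisfy 0 < t <= T")
    if F <= 0 or not (0 <= f <= F):
        raise ValueError("f must satisfy 0 <= f <= F, with F > 0")
    relative_force = f / F
    # Under the assumed formula, forces at or below 15% of the maximum
    # require no rest allowance.
    if relative_force <= 0.15:
        return 0.0
    return 18.0 * (t / T) ** 1.4 * (relative_force - 0.15) ** 0.5 * 100.0

# Example: holding 40% of maximal force for half of the maximal holding time.
print(round(rest_allowance_percent(t=1.0, T=2.0, f=40.0, F=100.0), 1))
```

Because the two ratios enter as multiplied power terms, the allowance rises steeply both as the holding time approaches its maximum and as the relative force increases.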

        The connection between force, holding time and rest allowances is shown in figure 4.

        Figure 4. Percentage rest allowances for various combinations of holding forces and time


        Similar laws exist for heavy dynamic muscular work (Rohmert 1962), active light muscular work (Laurig 1974) or different industrial muscular work (Schmidtke 1971). More rarely you find comparable laws for non-physical work, e.g., for computing (Schmidtke 1965). An overview of existing methods for determining rest allowances for mainly isolated muscle and non-muscle work is given by Laurig (1981) and Luczak (1982).






        More difficult is the situation where a combination of different stress factors exists, as shown in figure 5, which affect the working person simultaneously (Laurig 1992).

        Figure 5. The combination of two stress factors    


        The combination of two stress factors, for example, can lead to different strain reactions depending on the laws of combination. The combined effect of different stress factors can be indifferent, compensatory or cumulative.

        In the case of indifferent combination laws, the different stress factors have an effect on different subsystems of the organism. Each of these subsystems can compensate for the strain without the strain being fed into a common subsystem. The overall strain depends on the highest stress factor, and thus laws of superposition are not needed.

        A compensatory effect is given when the combination of different stress factors leads to a lower strain than does each stress factor alone. The combination of muscular work and low temperatures can reduce the overall strain, because low temperatures allow the body to lose heat which is produced by the muscular work.

        A cumulative effect arises if several stress factors are superimposed, that is, they must pass through one physiological “bottleneck”. An example is the combination of muscular work and heat stress. Both stress factors affect the circulatory system as a common bottleneck with resultant cumulative strain.

        Possible combination effects between muscle work and physical conditions are described in Bruder (1993) (see table 2).

        Table 2. Rules of combination effects of two stress factors on strain






        Heavy dynamic work
        Active light muscle work
        Static muscle work





        0 indifferent effect; + cumulative effect; – compensatory effect.

        Source: Adapted from Bruder 1993.

        For the case of the combination of more than two stress factors, which is the normal situation in practice, only limited scientific knowledge is available. The same applies for the successive combination of stress factors (i.e., the strain effect of different stress factors which affect the worker successively). For such cases, in practice, the necessary recovery time is determined by measuring physiological or psychological parameters and using them as integrating values.



        Tuesday, 01 March 2011 02:20

        Impact of Random Measurement Error

        Errors in exposure measurement may have different impacts on the exposure-disease relationship being studied, depending on how the errors are distributed. If an epidemiological study has been conducted blindly (i.e., measurements have been taken with no knowledge of the disease or health status of the study participants), we expect that measurement error will be evenly distributed across the strata of disease or health status.

        Table 1 provides an example: suppose we recruit a cohort of people exposed at work to a toxicant, in order to investigate a frequent disease. We determine the exposure status only at recruitment (T0), and not at any further points in time during follow-up. However, let us say that a number of individuals do, in fact, change their exposure status in the following year: at time T1, 250 of the original 1,200 exposed people have ceased being exposed, while 150 of the original 750 non-exposed people have started to be exposed to the toxicant. Therefore, at time T1, 1,100 individuals are exposed and 850 are not exposed. As a consequence, we have “misclassification” of exposure, based on our initial measurement of exposure status at time T0. These individuals are then traced after 20 years (at time T2) and the cumulative risk of disease is evaluated. (The assumption being made in the example is that only exposure of more than one year is a concern.)

        Table 1. Hypothetical cohort of 1950 individuals (exposed and unexposed at work), recruited at time T0 and whose disease status is ascertained at time T2

        Group                  At T0    Change by T1          At T1
        Exposed workers        1200     250 quit exposure     1100 (1200 - 250 + 150)
        Non-exposed workers     750     150 start exposure     850 (750 - 150 + 250)

        Cases of disease at time T2 = 220 among exposed workers
        Cases of disease at time T2 = 85 among non-exposed workers

        The true risk of disease at time T2 is 20% among exposed workers (220/1100) and 10% among non-exposed workers (85/850) (risk ratio = 2.0).

        Estimated risk of disease at T2 among those classified as exposed at T0:
        [20% (true risk in those exposed) × 950 (i.e., 1200 - 250) + 10% (true risk in non-exposed) × 250] / 1200 = (190 + 25)/1200 = 17.9%

        Estimated risk of disease at T2 among those classified as non-exposed at T0:
        [20% (true risk in those exposed) × 150 + 10% (true risk in non-exposed) × 600 (i.e., 750 - 150)] / 750 = (30 + 60)/750 = 12%

        Estimated risk ratio = 17.9% / 12% = 1.49

        Misclassification depends, in this example, on the study design and the characteristics of the population, rather than on technical limitations of the exposure measurement. The effect of misclassification is such that the “true” ratio of 2.0 between the cumulative risk among exposed people and non-exposed people becomes an “observed” ratio of 1.49 (table 1). This underestimation of the risk ratio arises from a “blurring” of the relationship between exposure and disease, which occurs when the misclassification of exposure, as in this case, is evenly distributed according to the disease or health status (i.e., the exposure measurement is not influenced by whether or not the person suffered from the disease that we are studying).
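The calculation in table 1 can be reproduced with a short script. The sketch below uses only the numbers given in the example; the variable and function names are illustrative.

```python
# True risks of disease at T2 by actual exposure status (from the example).
true_risk_exposed = 0.20
true_risk_unexposed = 0.10

# Classification at T0 and changes of exposure status by T1 (from the example).
classified_exposed = 1200
stopped_exposure = 250       # classified as exposed, actually non-exposed
classified_unexposed = 750
started_exposure = 150       # classified as non-exposed, actually exposed

def observed_risk(n_truly_exposed, n_truly_unexposed):
    """Risk seen in a group that mixes truly exposed and truly non-exposed people."""
    expected_cases = (true_risk_exposed * n_truly_exposed
                      + true_risk_unexposed * n_truly_unexposed)
    return expected_cases / (n_truly_exposed + n_truly_unexposed)

risk_classified_exposed = observed_risk(
    classified_exposed - stopped_exposure, stopped_exposure)
risk_classified_unexposed = observed_risk(
    started_exposure, classified_unexposed - started_exposure)

true_rr = true_risk_exposed / true_risk_unexposed
observed_rr = risk_classified_exposed / risk_classified_unexposed

print(f"Risk among those classified as exposed:     {risk_classified_exposed:.1%}")
print(f"Risk among those classified as non-exposed: {risk_classified_unexposed:.1%}")
print(f"True risk ratio = {true_rr:.1f}, observed risk ratio = {observed_rr:.2f}")
```

Changing the numbers who stop or start exposure shows how the observed ratio moves further towards 1.0 as the two classified groups become more mixed.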

        By contrast, either underestimation or overestimation of the association of interest may occur when exposure misclassification is not evenly distributed across the outcome of interest. In the example, we may have bias, and not only a blurring of the aetiologic relationship, if classification of exposure depends on the disease or health status among the workers. This could arise, for example, if we decide to collect biological samples from a group of exposed workers and from a group of unexposed workers, in order to identify early changes related to exposure at work. Samples from the exposed workers might then be analysed in a more accurate way than samples from those unexposed; scientific curiosity might lead the researcher to measure additional biomarkers among the exposed people (including, e.g., DNA adducts in lymphocytes or urinary markers of oxidative damage to DNA), on the assumption that these people are scientifically “more interesting”. This is a rather common attitude which, however, could lead to serious bias.



        Wednesday, 02 March 2011 03:15

        Statistical Methods

        There is much debate on the role of statistics in epidemiological research on causal relationships. In epidemiology, statistics is primarily a collection of methods for assessing data based on human (and also on animal) populations. In particular, statistics is a technique for the quantification and measurement of uncertain phenomena. All scientific investigations which deal with non-deterministic, variable aspects of reality can benefit from statistical methodology. In epidemiology, variability is intrinsic to the unit of observation: a person is not a deterministic entity. Experimental designs would better satisfy the statistical assumptions about random variation, but for ethical and practical reasons they are uncommon. Instead, epidemiology relies on observational research, which is subject to both random and other sources of variability.

        Statistical theory is concerned with how to control unstructured variability in the data in order to make valid inferences from empirical observations. Lacking any explanation for the variable behaviour of the phenomenon studied, statistics treats it as random, that is, as non-systematic deviations from some average state of nature (see Greenland 1990 for a criticism of these assumptions).

        Science relies on empirical evidence to demonstrate whether its theoretical models of natural events have any validity. Indeed, the methods drawn from statistical theory determine the degree to which observations in the real world conform to the scientists’ view, in mathematical model form, of a phenomenon. Statistical methods, based in mathematics, therefore have to be carefully selected; there are plenty of examples of “how to lie with statistics”. Epidemiologists should therefore be aware of the appropriateness of the techniques they apply to measure the risk of disease. In particular, great care is needed when interpreting both statistically significant and statistically non-significant results.

        The first meaning of the word statistics relates to any summary quantity computed on a set of values. Descriptive indices or statistics such as the arithmetic average, the median or the mode, are widely used to summarize the information in a series of observations. Historically, these summary descriptors were used for administrative purposes by states, and therefore they were named statistics. In epidemiology, statistics that are commonly seen derive from the comparisons inherent to the nature of epidemiology, which asks questions such as: “Is one population at greater risk of disease than another?” In making such comparisons, the relative risk is a popular measure of the strength of association between an individual characteristic and the probability of becoming ill, and it is most commonly applied in aetiological research; attributable risk is also a measure of association between individual characteristics and disease occurrence, but it emphasizes the gain in terms of number of cases spared by an intervention which removes the factor in question—it is mostly applied in public health and preventive medicine.

        The second meaning of the word statistics relates to the collection of techniques and the underlying theory of statistical inference. This is a particular form of inductive logic which specifies the rules for obtaining a valid generalization from a particular set of empirical observations. This generalization would be valid provided some assumptions are met. This is the second way in which an uneducated use of statistics can deceive us: in observational epidemiology, it is very difficult to be sure of the assumptions implied by statistical techniques. Therefore, sensitivity analysis and robust estimators should be companions of any correctly conducted data analysis. Final conclusions also should be based on overall knowledge, and they should not rely exclusively on the findings from statistical hypothesis testing.


        A statistical unit is the element on which the empirical observations are made. It could be a person, a biological specimen or a piece of raw material to be analysed. Usually the statistical units are independently chosen by the researcher, but sometimes more complex designs can be set up. For example, in longitudinal studies, a series of determinations is made on a collection of persons over time; the statistical units in this study are the set of determinations, which are not independent, but structured by their respective connections to each person being studied. Lack of independence or correlation among statistical units deserves special attention in statistical analysis.

        A variable is an individual characteristic measured on a given statistical unit. It should be contrasted with a constant, a fixed individual characteristic—for example, in a study on human beings, having a head or a thorax are constants, while the gender of a single member of the study is a variable.

        Variables are evaluated using different scales of measurement. The first distinction is between qualitative and quantitative scales. Qualitative variables provide different modalities or categories. If each modality cannot be ranked or ordered in relation to others—for example, hair colour, or gender modalities—we denote the variable as nominal. If the categories can be ordered—like degree of severity of an illness—the variable is called ordinal. When a variable consists of a numeric value, we say that the scale is quantitative. A discrete scale denotes that the variable can assume only some definite values—for example, integer values for the number of cases of disease. A continuous scale is used for those measures which result in real numbers. Continuous scales are said to be interval scales when the null value has a purely conventional meaning. That is, a value of zero does not mean zero quantity—for example, a temperature of zero degrees Celsius does not mean zero thermal energy. In this instance, only differences among values make sense (this is the reason for the term “interval” scale). A real null value denotes a ratio scale. For a variable measured on that scale, ratios of values also make sense: indeed, a twofold ratio means double the quantity. For example, to say that a body has a temperature two times greater than a second body means that it has two times the thermal energy of the second body, provided that the temperature is measured on a ratio scale (e.g., in Kelvin degrees). The set of permissible values for a given variable is called the domain of the variable.

        Statistical Paradigms

        Statistics deals with the way to generalize from a set of particular observations. This set of empirical measurements is called a sample. From a sample, we calculate some descriptive statistics in order to summarize the information collected.

        The basic information that is generally required in order to characterize a set of measures relates to its central tendency and to its variability. The choice between several alternatives depends on the scale used to measure a phenomenon and on the purposes for which the statistics are computed. In table 1 different measures of central tendency and variability (or, dispersion) are described and associated with the appropriate scale of measurement.

        Table 1. Indices of central tendency and dispersion by scale of measurement


        Index              Definition

        Arithmetic mean    Sum of the observed values divided by the total number of observations
        Median             Midpoint value of the observed distribution
        Mode               Most frequent value
        Range              Lowest and highest values of the distribution
        Variance           Sum of the squared differences of each value from the mean divided by the total number of observations minus 1

        The descriptive statistics computed are called estimates when we use them as a substitute for the analogous quantity of the population from which the sample has been selected. The population counterparts of the estimates are constants called parameters. Estimates of the same parameter can be obtained using different statistical methods. An estimate should be both valid and precise.
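As a minimal sketch, the indices of table 1 can be computed with Python's standard `statistics` module. The data are hypothetical (days of sick leave for nine workers) and serve only to show which function corresponds to which index.

```python
import statistics

# Hypothetical sample: number of sick-leave days for nine workers.
days = [2, 3, 3, 5, 7, 8, 3, 10, 4]

mean = statistics.mean(days)            # sum of the values / number of values
median = statistics.median(days)        # midpoint of the ordered values
mode = statistics.mode(days)            # most frequent value
value_range = (min(days), max(days))    # lowest and highest values
variance = statistics.variance(days)    # squared deviations / (n - 1)

print(mean, median, mode, value_range, variance)
```

Note that `statistics.variance` divides by n – 1, matching the definition in the table; `statistics.pvariance` would divide by n instead.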

        The population-sample paradigm implies that validity can be assured by the way the sample is selected from the population. Random or probabilistic sampling is the usual strategy: if each member of the population has the same probability of being included in the sample, then, on average, our sample should be representative of the population and, moreover, any deviation from our expectation could be explained by chance. The probability of a given deviation from our expectation also can be computed, provided that random sampling has been performed. The same kind of reasoning applies to the estimates calculated for our sample with regard to the population parameters. We take, for example, the arithmetic average from our sample as an estimate of the mean value for the population. Any difference, if it exists, between the sample average and the population mean is attributed to random fluctuations in the process of selection of the members included in the sample. We can calculate the probability of any value of this difference, provided the sample was randomly selected. If the deviation between the sample estimate and the population parameter cannot be explained by chance, the estimate is said to be biased. The design of the observation or experiment provides validity to the estimates and the fundamental statistical paradigm is that of random sampling.

        In medicine, a second paradigm is adopted when a comparison among different groups is the aim of the study. A typical example is the controlled clinical trial: a set of patients with similar characteristics is selected on the basis of pre-defined criteria. No concern for representativeness is made at this stage. Each patient enrolled in the trial is assigned by a random procedure to the treatment group—which will receive standard therapy plus the new drug to be evaluated—or to the control group—receiving the standard therapy and a placebo. In this design, the random allocation of the patients to each group replaces the random selection of members of the sample. The estimate of the difference between the two groups can be assessed statistically because, under the hypothesis of no efficacy of the new drug, we can calculate the probability of any non-zero difference.

        In epidemiology, we lack the possibility of assembling randomly exposed and non-exposed groups of people. In this case, we still can use statistical methods, as if the groups analysed had been randomly selected or allocated. The correctness of this assumption relies mainly on the study design. This point is particularly important and underscores the importance of epidemiological study design over statistical techniques in biomedical research.

        Signal and Noise

        The term random variable refers to a variable for which a defined probability is associated with each value it can assume. The theoretical models for the distribution of the probability of a random variable are population models. The sample counterparts are represented by the sample frequency distribution. This is a useful way to report a set of data; it consists of a Cartesian plane with the variable of interest along the horizontal axis and the frequency or relative frequency along the vertical axis. A graphic display allows us to readily see what is (are) the most frequent value(s) and how the distribution is concentrated around certain central values like the arithmetic average.

        For the random variables and their probability distributions, we use the terms parameters, mean expected value (instead of arithmetic average) and variance. These theoretical models describe the variability in a given phenomenon. In information theory, the signal is represented by the central tendency (for example, the mean value), while the noise is measured by a dispersion index (such as the variance).

        To illustrate statistical inference, we will use the binomial model. In the sections which follow, the concepts of point estimates and confidence intervals, tests of hypotheses and probability of erroneous decisions, and power of a study will be introduced.

        Table 2. Possible outcomes of a binomial experiment (yes = 1, no = 0) and their probabilities (n = 3)
































        An Example: The Binomial Distribution

        In biomedical research and epidemiology, the most important model of stochastic variation is the binomial distribution. It relies on the fact that most phenomena behave as a nominal variable with only two categories: for example, the presence/absence of disease: alive/dead, or recovered/ill. In such circumstances, we are interested in the probability of success—that is, in the event of interest (e.g., presence of disease, alive or recovery)—and in the factors or variables that can alter it. Let us consider n = 3 workers, and suppose that we are interested in the probability, p, of having a visual impairment (yes/no). The result of our observation could be the possible outcomes in table 2.

        Table 3. Possible outcomes of a binomial experiment (yes = 1, no = 0) and their probabilities (n = 3)

        Number of successes







        The probability of any of these event combinations is easily obtained by considering p, the (individual) probability of success, constant for each subject and independent of other outcomes. Since we are interested in the total number of successes and not in a specific ordered sequence, we can rearrange the table as follows (see table 3) and, in general, express the probability of x successes P(x) as:

        $$P(x) = \frac{n!}{x!\,(n-x)!}\; p^{x}\,(1-p)^{n-x} \qquad (1)$$

        where x is the number of successes and the notation x! denotes the factorial of x, i.e., x! = x×(x–1)×(x–2)…×1.
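The step from ordered outcomes to the number of successes can be illustrated with a short sketch for n = 3 workers. The prevalence p = 0.1 is a hypothetical value chosen for the example, and `binomial_pmf` is a helper name introduced here.

```python
from itertools import product
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 3, 0.1   # p = 0.1 is a hypothetical prevalence of visual impairment

# Enumerate every ordered outcome (1 = impaired, 0 = not impaired) and sum the
# probabilities of the outcomes that share the same number of successes.
by_count = {x: 0.0 for x in range(n + 1)}
for outcome in product([0, 1], repeat=n):
    prob = 1.0
    for result in outcome:
        prob *= p if result == 1 else 1 - p
    by_count[sum(outcome)] += prob

for x in range(n + 1):
    print(x, round(by_count[x], 4), round(binomial_pmf(x, n, p), 4))
```

The two printed columns agree: grouping the eight ordered outcomes by their number of successes gives the same probabilities as the binomial formula.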

        When we consider the event “being/not being ill”, the individual probability, p, refers to the state in which the subject is presumed to be; in epidemiology, this probability is called “prevalence”. To estimate p, we use the sample proportion:

        p̂ = x/n

        with variance:

        $$\mathrm{Var}(\hat{p}) = \frac{p\,(1-p)}{n}$$

        In a hypothetical infinite series of replicated samples of the same size n, we would obtain different sample proportions p̂ = x/n, with probabilities given by the binomial formula. The “true” value of p is estimated by each sample proportion, and a confidence interval for p, that is, the set of likely values for p, given the observed data and a pre-defined level of confidence (say 95%), is estimated from the binomial distribution as the set of values for p which gives a probability of x greater than a pre-specified value (say 2.5%). For a hypothetical experiment in which we observed x = 15 successes in n = 30 trials, the estimated probability of success is:

        p̂ = x/n = 15/30 = 0.5

        Table 4. Binomial distribution. Probabilities for different values of p for x = 15 successes in n = 30 trials



















        The 95% confidence interval for p, obtained from table 4, is 0.334 – 0.666. Each entry of the table shows the probability of x = 15 successes in n = 30 trials computed with the binomial formula; for example, for p = 0.30, we obtain:

        $$P(x = 15) = \frac{30!}{15!\,15!}\,(0.30)^{15}\,(0.70)^{15} \approx 0.0106$$
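The procedure described for table 4 can be sketched as a scan over candidate values of p, keeping those for which the probability of x = 15 successes exceeds 2.5%. The grid step of 0.001 is an assumption of this sketch, not something stated in the text; at that resolution the scan should recover the interval quoted above.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

x, n = 15, 30
threshold = 0.025      # pre-specified probability level (2.5%)

# Candidate values of p on a grid of step 0.001 (an assumption of this sketch).
grid = [i / 1000 for i in range(1, 1000)]
likely = [p for p in grid if binomial_pmf(x, n, p) > threshold]

lower, upper = min(likely), max(likely)
print(f"{lower:.3f} - {upper:.3f}")
```

This is the rule stated in the text, which is not the same construction as the tail-area ("exact" or Clopper-Pearson) interval found in many software packages; the two generally give somewhat different limits.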

        For n large and p close to 0.5 we can use an approximation based on the Gaussian distribution:

        $$\hat{p} \pm z_{\alpha/2}\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}$$

        where z_{α/2} denotes the value of the standard Gaussian distribution for a probability

        P(z ≥ z_{α/2}) = α/2;

        1 – α being the chosen confidence level. For the example considered, p̂ = 15/30 = 0.5; n = 30 and from the standard Gaussian table z_{0.025} = 1.96. The 95% confidence interval results in the set of values 0.321 – 0.679, obtained by substituting p̂ = 0.5, n = 30, and z_{0.025} = 1.96 into the above equation for the Gaussian distribution. Note that these values are close to the exact values computed before.
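A minimal sketch of the Gaussian approximation, using the standard library's `NormalDist` to obtain the quantile instead of a printed table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 15, 30
confidence = 0.95

p_hat = x / n
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)          # about 1.96 for 95%
half_width = z * sqrt(p_hat * (1 - p_hat) / n)   # z times the standard error
lower, upper = p_hat - half_width, p_hat + half_width
print(f"{lower:.3f} - {upper:.3f}")
```

Changing `confidence` to 0.90 gives the narrower 90% interval with no other modification.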

        Statistical tests of hypotheses comprise a decision procedure about the value of a population parameter. Suppose, in the previous example, that we want to address the proposition that there is an elevated risk of visual impairment among workers of a given plant. The scientific hypothesis to be tested by our empirical observations then is “there is an elevated risk of visual impairment among workers of a given plant”. Statisticians demonstrate such hypotheses by falsifying the complementary hypothesis “there is no elevation of the risk of visual impairment”. This follows the mathematical demonstration per absurdum: instead of verifying an assertion, empirical evidence is used only to falsify it. This complementary statistical hypothesis is called the null hypothesis. The second step involves specifying a value for the parameter of the probability distribution used to model the variability in the observations. In our example, since the phenomenon is binary (i.e., presence/absence of visual impairment), we choose the binomial distribution with parameter p, the probability of visual impairment. The null hypothesis asserts that p = 0.25, say. This value is chosen from the collection of knowledge about the topic and a priori knowledge of the usual prevalence of visual impairment in non-exposed (i.e., non-worker) populations. Suppose our data produced an estimate p̂ = 0.50, from the 30 workers examined.

        Can we reject the null hypothesis?

        If yes, in favour of what alternative hypothesis?

        We specify an alternative hypothesis as a candidate should the evidence dictate that the null hypothesis be rejected. Non-directional (two-sided) alternative hypotheses state that the population parameter is different from the value stated in the null hypothesis; directional (one-sided) alternative hypotheses state that the population parameter is greater (or lesser) than the null value.

        Table 5. Binomial distribution. Probabilities of success for p = 0.25 in n = 30 trials



        Cumulative probability






























































        Under the null hypothesis, we can calculate the probability distribution of the results of our example. Table 5 shows, for p = 0.25 and n = 30, the probabilities (see equation (1)) and the cumulative probabilities:

        $$P(X \le x) = \sum_{i=0}^{x} P(i)$$

        From this table, we obtain the probability of having x ≥ 15 workers with visual impairment:

        P(x ≥ 15) = 1 – P(x < 15) = 1 – P(x ≤ 14) = 1 – 0.9973 = 0.0027

        This means that it is highly improbable that we would observe 15 or more workers with visual impairment if they experienced the prevalence of disease of the non-exposed populations. Therefore, we could reject the null hypothesis and affirm that there is a higher prevalence of visual impairment in the population of workers that was studied.
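The exact calculation can be reproduced with a short sketch that rebuilds the two columns of a table such as table 5 (probabilities and cumulative probabilities under the null hypothesis) and then sums the upper tail. The helper names are introduced here for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def upper_tail(x_obs, n, p):
    """P(X >= x_obs) under a binomial model with parameters n and p."""
    return sum(binomial_pmf(x, n, p) for x in range(x_obs, n + 1))

n, p_null, x_obs = 30, 0.25, 15

# Rebuild the probability and cumulative probability columns.
cumulative = 0.0
for x in range(n + 1):
    prob = binomial_pmf(x, n, p_null)
    cumulative += prob
    if x <= 16:                      # the remaining rows are all close to 1
        print(f"{x:2d}  {prob:.4f}  {cumulative:.4f}")

p_value = upper_tail(x_obs, n, p_null)
print(f"P(x >= {x_obs}) = {p_value:.4f}")
```

The tail must start at the observed count itself: P(x ≥ 15) is one minus the cumulative probability at 14, not at 15.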

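        When n×p ≥ 5 and n×(1 – p) ≥ 5, we can use the Gaussian approximation, here with the continuity correction 1/(2n):

        $$z = \frac{|\hat{p} - p| - \frac{1}{2n}}{\sqrt{p\,(1-p)/n}} = \frac{0.25 - 0.0167}{0.0791} = 2.95$$

        From the table of the standard Gaussian distribution we obtain:

        P(z > 2.95) = 0.0016

        which is of the same order of magnitude as the exact result and leads to the same decision. From this approximation we can see that the basic structure of a statistical test of hypothesis consists of the ratio of signal to noise. In our case, the signal is (p̂ – p), the observed deviation from the null hypothesis, while the noise is the standard deviation of p̂:

        $$\sqrt{\frac{p\,(1-p)}{n}}$$

        The greater the ratio, the lower the probability of observing such a deviation if the null value were true.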
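The signal-to-noise structure can be made concrete with a short sketch. It assumes the continuity correction 1/(2n), which is the reading under which the value z = 2.95 arises for this example, and it uses `NormalDist` from the standard library in place of a printed Gaussian table.

```python
from math import sqrt
from statistics import NormalDist

x_obs, n, p_null = 15, 30, 0.25
p_hat = x_obs / n

signal = abs(p_hat - p_null) - 1 / (2 * n)    # deviation, continuity-corrected
noise = sqrt(p_null * (1 - p_null) / n)       # standard deviation under H0
z = signal / noise
one_sided_p = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P(z > {z:.2f}) = {one_sided_p:.4f}")
```

Dropping the correction term (setting `signal = abs(p_hat - p_null)`) gives a larger z and a smaller tail probability, which shows how much the choice matters at this sample size.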

        In making decisions about statistical hypotheses, we can incur two kinds of errors: a type I error, rejection of the null hypothesis when it is true; or a type II error, acceptance of the null hypothesis when it is false. The probability of a type I error is denoted by the Greek letter α. The probability level, or p-value, of an observed result is calculated from the probability distribution of the observations under the null hypothesis. It is customary to predefine an α-error level (e.g., 5%, 1%) and reject the null hypothesis when the result of our observation has a probability equal to or less than this so-called critical level.

        The probability of a type II error is denoted by the Greek letter β. To calculate it, we need to specify, in the alternative hypothesis, a value for the parameter to be tested (in our example, a value for p). Generic alternative hypotheses (different from, greater than, less than) are not useful. In practice, the β-value for a set of alternative hypotheses is of interest, or its complement, which is called the statistical power of the test. For example, fixing the α-error value at 5%, from table 5, we find:

        P(x > 12) < 0.05

        under the null hypothesis p = 0.25. If we were to observe more than x = 12 successes, we would reject the null hypothesis. The corresponding β values and the power for x = 12 are given by table 6.

        Table 6. Type II error and power for x = 12, n = 30, α = 0.05






















        In this case our data cannot discriminate whether p is greater than the null value of 0.25 but less than 0.50, because the power of the study is too low (<80%) for those values of p < 0.50; that is, the sensitivity of our study is 8% for p = 0.3, 22% for p = 0.35, …, 64% for p = 0.45.
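The power figures can be approximated with a sketch that first finds the rejection region under the null hypothesis and then evaluates its probability under each alternative. The helper names and the list of alternative values are choices made for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def upper_tail(k, n, p):
    """P(X >= k) for a binomial variable with parameters n and p."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

n, p_null, alpha = 30, 0.25, 0.05

# Smallest count k whose upper-tail probability under H0 is below alpha.
critical = next(k for k in range(n + 1) if upper_tail(k, n, p_null) < alpha)
print(f"reject H0 when x >= {critical}")

for p_alt in (0.30, 0.35, 0.40, 0.45, 0.50):
    power = upper_tail(critical, n, p_alt)   # probability of rejecting H0
    print(f"p = {p_alt:.2f}  beta = {1 - power:.2f}  power = {power:.2f}")
```

Power is simply the probability of landing in the rejection region when the alternative is true, so β = 1 – power follows directly from the same tail sum.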

        The only way to achieve a lower β, or a higher level of power, would be to increase the size of the study. For example, in table 7 we report β and power for n = 40; as expected, we should be able to detect a value of p greater than 0.40.

        Table 7. Type II error and power for x = 12, n = 40, α = 0.05






















        Study design is based on careful scrutiny of the set of alternative hypotheses which deserve consideration, and on guaranteeing adequate power for the study by providing a sufficient sample size.

        In the epidemiological literature, the relevance of providing reliable risk estimates has been emphasized. Therefore, it is more important to report confidence intervals (either 95% or 90%) than a p-value of a test of a hypothesis. Following the same kind of reasoning, attention should be given to the interpretation of results from small-sized studies: because of low power, even intermediate effects could be undetected and, on the other hand, effects of great magnitude might not be replicated subsequently.

        Advanced Methods

        The degree of complexity of the statistical methods used in the occupational medicine context has been growing over the last few years. Major developments can be found in the area of statistical modelling. The Nelder and Wedderburn family of non-Gaussian models (Generalized Linear Models) has been one of the most striking contributions to the increase of knowledge in areas such as occupational epidemiology, where the relevant response variables are binary (e.g., survival/death) or counts (e.g., number of industrial accidents).

        This was the starting point for an extensive application of regression models as an alternative to the more traditional types of analysis based on contingency tables (simple and stratified analysis). Poisson, Cox and logistic regression are now routinely used for the analysis of longitudinal and case-control studies, respectively. These models are the counterpart of linear regression for categorical response variables and have the elegant feature of providing directly the relevant epidemiological measure of association. For example, the coefficients of Poisson regression are the logarithm of the rate ratios, while those of logistic regression are the log of the odds ratios.
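The link between regression coefficients and measures of association can be written out for the simplest case. The notation is introduced here for illustration: D is the disease indicator, E a single binary exposure (1 = exposed, 0 = non-exposed), λ the incidence rate, and β₀, β₁ the model coefficients.

```latex
% Logistic regression: the exposure coefficient is the log odds ratio
\ln\frac{P(D=1\mid E)}{1-P(D=1\mid E)} = \beta_0 + \beta_1 E
\quad\Longrightarrow\quad
\mathrm{OR} = \frac{\mathrm{odds}(E=1)}{\mathrm{odds}(E=0)}
            = \frac{e^{\beta_0+\beta_1}}{e^{\beta_0}} = e^{\beta_1}

% Poisson regression: the exposure coefficient is the log rate ratio
\ln \lambda(E) = \beta_0 + \beta_1 E
\quad\Longrightarrow\quad
\mathrm{RR} = \frac{\lambda(E=1)}{\lambda(E=0)} = e^{\beta_1}
```

In both cases the intercept cancels in the ratio, which is why the measure of association can be read directly from the exposure coefficient.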

        Taking this as a benchmark, further developments in the area of statistical modelling have taken two main directions: models for repeated categorical measures and models which extend the Generalized Linear Models (Generalized Additive Models). In both instances, the aims are focused on increasing the flexibility of the statistical tools in order to cope with more complex problems arising from reality. Repeated measures models are needed in many occupational studies where the units of analysis are at the sub-individual level. For example:

        1. The study of the effect of working conditions on carpal tunnel syndrome has to consider both hands of a person, which are not independent of one another.
        2. The analysis of time trends of environmental pollutants and their effect on children’s respiratory systems can be evaluated using extremely flexible models since the exact functional form of the dose-response relationship is difficult to obtain.


        A parallel and probably faster development has been seen in the context of Bayesian statistics. The practical barrier to using Bayesian methods collapsed after the introduction of computer-intensive methods. Monte Carlo procedures such as Gibbs sampling schemes have allowed us to avoid the numerical integration needed to compute the posterior distributions, which represented the most challenging feature of Bayesian methods. Applications of Bayesian models to real and complex problems have found increasing space in applied journals. For example, geographical analyses and ecological correlations at the small area level and AIDS prediction models are more and more often tackled using Bayesian approaches. These developments are welcome not only because they increase the number of alternative statistical solutions which could be employed in the analysis of epidemiological data, but also because the Bayesian approach can be considered a sounder strategy.



        The preceding articles of this chapter have shown the need for a careful evaluation of the study design in order to draw credible inferences from epidemiological observations. Although it has been claimed that inferences in observational epidemiology are weak because of the non-experimental nature of the discipline, there is no built-in superiority of randomized controlled trials or other types of experimental design over well-planned observation (Cornfield 1954). However, to draw sound inferences implies a thorough analysis of the study design in order to identify potential sources of bias and confounding. Both false positive and false negative results can originate from different types of bias.

        In this article, some of the guidelines that have been proposed to assess the causal nature of epidemiological observations are discussed. In addition, although good science is a premise for ethically correct epidemiological research, there are additional issues that are relevant to ethical concerns. Therefore, we have devoted some discussion to the analysis of ethical problems that may arise in doing epidemiological studies.

        Causality Assessment

        Several authors have discussed causality assessment in epidemiology (Hill 1965; Buck 1975; Ahlbom 1984; Maclure 1985; Miettinen 1985; Rothman 1986; Weed 1986; Schlesselman 1987; Maclure 1988; Weed 1988; Karhausen 1995). One of the main points of discussion is whether epidemiology uses or should use the same criteria for the ascertainment of cause-effect relationships as used in other sciences.

        Causes should not be confused with mechanisms. For example, asbestos is a cause of mesothelioma, whereas oncogene mutation is a putative mechanism. On the basis of the existing evidence, it is likely that (a) different external exposures can act at the same mechanistic stages and (b) usually there is not a fixed and necessary sequence of mechanistic steps in the development of disease. For example, carcinogenesis is interpreted as a sequence of stochastic (probabilistic) transitions, from gene mutation to cell proliferation to gene mutation again, that eventually leads to cancer. In addition, carcinogenesis is a multifactorial process—that is, different external exposures are able to affect it and none of them is necessary in a susceptible person. This model is likely to apply to several diseases in addition to cancer.

        Such a multifactorial and probabilistic nature of most exposure-disease relationships implies that disentangling the role played by one specific exposure is problematic. In addition, the observational nature of epidemiology prevents us from conducting experiments that could clarify aetiologic relationships through a wilful alteration of the course of the events. The observation of a statistical association between exposure and disease does not mean that the association is causal. For example, most epidemiologists have interpreted the association between exposure to diesel exhaust and bladder cancer as a causal one, but others have claimed that workers exposed to diesel exhaust (mostly truck and taxi drivers) are more often cigarette smokers than are non-exposed individuals. The observed association, according to this claim, thus would be “confounded” by a well-known risk factor like smoking.
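How confounding can produce an association where none exists within strata may be illustrated with entirely hypothetical counts. In this sketch the exposure has no effect among smokers or among non-smokers, but exposed workers smoke more often, so the crude comparison suggests an elevated risk.

```python
# Hypothetical counts: (cases, persons) by smoking status and exposure.
strata = {
    "smokers":     {"exposed": (40, 800), "unexposed": (10, 200)},
    "non-smokers": {"exposed": (2, 200),  "unexposed": (8, 800)},
}

def risk(cases, persons):
    return cases / persons

def risk_ratio(exposed, unexposed):
    return risk(*exposed) / risk(*unexposed)

def pooled(group):
    """Add up cases and persons for one exposure group across all strata."""
    return tuple(sum(s[group][i] for s in strata.values()) for i in range(2))

# Crude analysis: pool the strata, ignoring smoking.
crude_rr = risk_ratio(pooled("exposed"), pooled("unexposed"))

# Stratified analysis: compare exposed and unexposed within each smoking group.
stratum_rr = {name: risk_ratio(s["exposed"], s["unexposed"])
              for name, s in strata.items()}

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"risk ratio among {name}: {rr:.2f}")
```

The crude ratio exceeds 2 while each stratum-specific ratio equals 1, which is the pattern expected when the whole crude association is due to the confounder.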

        Given the probabilistic-multifactorial nature of most exposure-disease associations, epidemiologists have developed guidelines to recognize relationships that are likely to be causal. These are the guidelines originally proposed by Sir Bradford Hill for chronic diseases (1965):

        • strength of the association
        • dose-response effect
        • lack of temporal ambiguity
        • consistency of the findings
        • biological plausibility
        • coherence of the evidence
        • specificity of the association.


        These criteria should be considered only as general guidelines or practical tools; in fact, scientific causal assessment is an iterative process centred around measurement of the exposure-disease relationship. However, Hill’s criteria often are used as a concise and practical description of causal inference procedures in epidemiology.

        Let us consider the example of the relationship between exposure to vinyl chloride and liver angiosarcoma, applying Hill’s criteria.

        The usual expression of the results of an epidemiological study is a measure of the degree of association between exposure and disease (Hill’s first criterion). A relative risk (RR) that is greater than unity means that there is a statistical association between exposure and disease. For instance, if the incidence rate of liver angiosarcoma is usually 1 in 10 million, but it is 1 in 100,000 among those exposed to vinyl chloride, then the RR is 100 (that is, people who work with vinyl chloride have a 100 times increased risk of developing angiosarcoma compared to people who do not work with vinyl chloride).

        It is more likely that an association is causal when the risk increases with increasing levels of exposure (dose-response effect, Hill’s second criterion) and when the temporal relationship between exposure and disease makes sense on biological grounds (the exposure precedes the effect and the length of this “induction” period is compatible with a biological model of disease; Hill’s third criterion). In addition, an association is more likely to be causal when similar results are obtained by others who have been able to replicate the findings in different circumstances (“consistency”, Hill’s fourth criterion).

        A scientific analysis of the results requires an evaluation of biological plausibility (Hill’s fifth criterion). This can be achieved in different ways. For example, a simple criterion is to evaluate whether the alleged “cause” is able to reach the target organ (e.g., inhaled substances that do not reach the lung cannot circulate in the body). Also, supporting evidence from animal studies is helpful: the observation of liver angiosarcomas in animals treated with vinyl chloride strongly reinforces the association observed in man.

        Internal coherence of the observations (for example, the RR is similarly increased in both genders) is an important scientific criterion (Hill’s sixth criterion). Causality is more likely when the relationship is very specific—that is, involves rare causes and/or rare diseases, or a specific histologic type/subgroup of patients (Hill’s seventh criterion).

        “Enumerative induction” (the simple enumeration of instances of association between exposure and disease) is insufficient to describe completely the inductive steps in causal reasoning. Usually, the result of enumerative induction produces a complex and still confused observation because different causal chains or, more frequently, a genuine causal relationship and other irrelevant exposures, are entangled. Alternative explanations have to be eliminated through “eliminative induction”, showing that an association is likely to be causal because it is not “confounded” with others. A simple definition of an alternative explanation is “an extraneous factor whose effect is mixed with the effect of the exposure of interest, thus distorting the risk estimate for the exposure of interest” (Rothman 1986).

        The role of induction is expanding knowledge, whereas deduction’s role is “transmitting truth” (Giere 1979). Deductive reasoning scrutinizes the study design and identifies associations which are not empirically true, but just logically true. Such associations are not a matter of fact, but logical necessities. For example, a selection bias occurs when the exposed group is selected among ill people (as when we start a cohort study recruiting as “exposed” to vinyl chloride a cluster of liver angiosarcoma cases) or when the unexposed group is selected among healthy people. In both instances the association which is found between exposure and disease is necessarily (logically) but not empirically true (Vineis 1991).

        To conclude, even when one considers its observational (non-experimental) nature, epidemiology does not use inferential procedures that differ substantially from the tradition of other scientific disciplines (Hume 1978; Schaffner 1993).

        Ethical Issues in Epidemiological Research

        Because of the subtleties involved in inferring causation, special care has to be exercised by epidemiologists in interpreting their studies. Indeed, several concerns of an ethical nature flow from this.

        Ethical issues in epidemiological research have become a subject of intense discussion (Schulte 1989; Soskolne 1993; Beauchamp et al. 1991). The reason is evident: epidemiologists, in particular occupational and environmental epidemiologists, often study issues having significant economic, social and health policy implications. Both negative and positive results concerning the association between specific chemical exposures and disease can affect the lives of thousands of people, influence economic decisions and therefore seriously condition political choices. Thus, the epidemiologist may be under pressure, and be tempted or even encouraged by others to alter—marginally or substantially—the interpretation of the results of his or her investigations.

        Among the several relevant issues, transparency of data collection, coding, computerization and analysis is central as a defence against allegations of bias on the part of the researcher. Also crucial, and potentially in conflict with such transparency, is the right of the subjects enrolled in epidemiological research to be protected from the release of personal information (confidentiality issues).

        From the point of view of misconduct that can arise especially in the context of causal inference, questions that should be addressed by ethics guidelines are:

        • Who owns the data and for how long must the data be retained?
        • What constitutes a credible record of the work having been done?
        • Do public grants allow in the budget for costs associated with adequate documentation, archiving and re-analysis of data?
        • Is there a role for the primary investigator in any third party’s re-analysis of his or her data?
        • Are there standards of practice for data storage?
        • Should occupational and environmental epidemiologists be establishing a normative climate in which ready data scrutiny or audit can be accomplished?
        • How do good data storage practices serve to prevent not only misconduct, but also allegations of misconduct?
        • What constitutes misconduct in occupational and environmental epidemiology in relation to data management, interpretation of results and advocacy?
        • What is the role of the epidemiologist and/or of professional bodies in developing standards of practice and indicators/outcomes for their assessment, and contributing expertise in any advocacy role?
        • What role does the professional body/organization have in dealing with concerns about ethics and law? (Soskolne 1993)


        Other crucial issues, in the case of occupational and environmental epidemiology, relate to the involvement of the workers in preliminary phases of studies, and to the release of the results of a study to the subjects who have been enrolled and are directly affected (Schulte 1989). Unfortunately, it is not common practice that workers enrolled in epidemiological studies are involved in collaborative discussions about the purposes of the study, its interpretation and the potential uses of the findings (which may be both advantageous and detrimental to the worker).

        Partial answers to these questions have been provided by recent guidelines (Beauchamp et al. 1991; CIOMS 1991). However, in each country, professional associations of occupational epidemiologists should engage in a thorough discussion about ethical issues and, possibly, adopt a set of ethics guidelines appropriate to the local context while recognizing internationally accepted normative standards of practice.



        The documentation of occupational diseases in a country like Taiwan is a challenge to an occupational physician. For lack of a system including material safety data sheets (MSDS), workers were usually not aware of the chemicals with which they worked. Since many occupational diseases have long latencies and do not show any specific symptoms and signs until clinically evident, recognition and identification of the occupational origin are often very difficult.

        To better control occupational diseases, we have accessed databases which provide a relatively complete list of industrial chemicals and a set of specific signs and/or symptoms. Combined with the epidemiological approach of conjectures and refutations (i.e., considering and ruling out all possible alternative explanations), we have documented more than ten kinds of occupational diseases and an outbreak of botulism. We recommend that a similar approach be applied to any other country in a similar situation, and that a system involving an identification sheet (e.g., MSDS) for each chemical be advocated and implemented as one means to enable prompt recognition and hence the prevention of occupational diseases.

        Hepatitis in a Colour Printing Factory

        Three workers from a colour printing factory were admitted to community hospitals in 1985 with manifestations of acute hepatitis. One of the three had superimposed acute renal failure. Since viral hepatitis has a high prevalence in Taiwan, we had to consider a viral origin among the most likely aetiologies. Alcohol and drug use, as well as organic solvents in the workplace, also had to be included. Because there was no system of MSDS in Taiwan, neither the employees nor the employer were aware of all the chemicals used in the factory (Wang 1991).

        We had to compile a list of hepatotoxic and nephrotoxic agents from several toxicological databases. Then, we deduced the observable consequences of each of the above hypotheses. For example, if hepatitis A virus (HAV) were the aetiology, we should observe antibodies (HAV-IgM) among the affected workers; if hepatitis B virus were the aetiology, we should observe more hepatitis B surface antigen (HBsAg) carriers among the affected workers as compared with non-affected workers; if alcohol were the main aetiology, we should observe more alcohol abusers or chronic alcoholics among affected workers; if any toxic solvent (e.g., chloroform) were the aetiology, we should find it at the workplace.

        We performed a comprehensive medical evaluation for each worker. Both the viral and the alcohol hypotheses were easily refuted because the evidence did not support them.

        Instead, 17 of 25 workers from the plant had abnormal liver function tests, and a significant association was found between the presence of abnormal liver function and a history of recently having worked inside any of three rooms in which an interconnecting air-conditioning system had been installed to cool the printing machines. The association remained after stratification by the carrier status of hepatitis B. It was later determined that the incident occurred following inadvertent use of a “cleaning agent” (which was carbon tetrachloride) to clean a pump in the printing machine. Moreover, a simulation test of the pump-cleaning operation revealed ambient air levels of carbon tetrachloride of 115 to 495 ppm, which could produce hepatic damage. In a further refutational attempt, we eliminated carbon tetrachloride from the workplace and found that no new cases occurred; all affected workers improved after removal from the workplace for 20 days. Therefore, we concluded that the outbreak was caused by the use of carbon tetrachloride.

        Neurological Symptoms in a Colour Printing Factory

        In September 1986, an apprentice in a colour printing factory in Chang-Hwa suddenly developed acute bilateral weakness and respiratory paralysis. The victim’s father alleged on the telephone that there were several other workers with similar symptoms. Since occupational diseases resulting from organic solvent exposures had previously been documented in colour printing shops, we went to the worksite to determine the aetiology with a hypothesis of possible solvent intoxication in mind (Wang 1991).

        Our common practice, however, was to consider all alternative conjectures, including other medical problems such as impaired function of the upper motor neurones, the lower motor neurones or the neuromuscular junction. Again, we deduced outcome statements from the above hypotheses. For example, if any solvent reported to produce polyneuropathy (e.g., n-hexane, methyl butyl ketone, acrylamide) were the cause, it would also impair the nerve conduction velocity (NCV); if other medical problems involving upper motor neurones were the cause, there would be signs of impaired consciousness and/or involuntary movement.

        Field observations disclosed that all affected workers remained fully conscious throughout the clinical course. An NCV study of three affected workers showed intact lower motor neurones. There was no involuntary movement, no history of medication or bites prior to the appearance of symptoms, and the neostigmine test was negative. A significant association was found between illness and eating breakfast in the factory cafeteria on September 26 or 27: seven of seven affected workers versus seven of 32 unaffected workers ate breakfast in the factory on these two days. Further testing detected type A botulinum toxin in canned peanuts manufactured by an unlicensed company, and a specimen of the peanuts also showed a full growth of Clostridium botulinum. A final refutational trial was the removal of such products from the commercial market, which resulted in no new cases. This investigation documented the first cases of botulism from a commercial food product in Taiwan.
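
        The strength of the breakfast association can be checked from the counts given above. The text does not say which significance test the investigators used; the sketch below assumes a one-sided Fisher exact test, and the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    With all margins held fixed, returns the probability of observing
    `a` or more in the top-left cell (hypergeometric upper tail).
    """
    row1 = a + b          # first-row total
    col1 = a + c          # first-column total
    n = a + b + c + d     # grand total
    upper = min(row1, col1)
    total = comb(n, row1)
    favourable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favourable / total


# Rows: affected, unaffected. Columns: ate breakfast, did not eat breakfast.
# Seven of seven affected and seven of 32 unaffected workers ate breakfast.
p_value = fisher_one_sided(7, 0, 7, 25)
print(f"{p_value:.6f}")
```

        With these counts the p-value is roughly 0.0002, so chance alone is an unlikely explanation for the observed pattern.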

        Premalignant Skin Lesions among Paraquat Manufacturers

        In June 1983, two workers from a paraquat manufacturing factory visited a dermatology clinic complaining of numerous bilateral hyperpigmented macules with hyperkeratotic changes on parts of their hands, neck and face exposed to the sun. Some skin specimens also showed Bowenoid changes. Since malignant and premalignant skin lesions were reported among bipyridyl manufacturing workers, an occupational cause was strongly suspected. However, we also had to consider other alternative causes (or hypotheses) of skin cancer such as exposure to ionizing radiation, coal tar, pitch, soot or any other polyaromatic hydrocarbons (PAH). To rule out all of these conjectures, we conducted a study in 1985, visiting all of the 28 factories which ever engaged in paraquat manufacturing or packaging and examining the manufacturing processes as well as the workers (Wang et al. 1987; Wang 1993).

        We examined 228 workers, none of whom had ever been exposed to the aforementioned skin carcinogens except sunlight and 4,4’-bipyridine and its isomers. After excluding workers with multiple exposures, we found that one out of seven administrators and two out of 82 paraquat packaging workers developed hyperpigmented skin lesions, as compared with three out of three workers involved in only bipyridine crystallization and centrifugation. Moreover, all 17 workers with hyperkeratotic or Bowen’s lesions had a history of direct exposure to bipyridyl and its isomers. The longer the exposure to bipyridyls, the more likely the development of skin lesions, and this trend could not be explained by sunlight or age, as demonstrated by stratification and logistic regression analysis. Hence, the skin lesions were tentatively attributed to a combination of bipyridyl exposure and sunlight. As a further refutational attempt, we followed up to see whether any new case occurred after all processes involving bipyridyl exposure had been enclosed. No new case was found.
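
        The contrast between job groups is easier to see as proportions. The following sketch only restates the counts reported above; the group labels and the function name are illustrative.

```python
# Counts reported in the text: (workers with lesions, workers examined).
groups = {
    "administrators": (1, 7),
    "paraquat packaging": (2, 82),
    "bipyridine crystallization and centrifugation only": (3, 3),
}


def proportion(cases, total):
    """Share of workers in a group who developed hyperpigmented lesions."""
    return cases / total


proportions = {name: proportion(*counts) for name, counts in groups.items()}

for name, value in proportions.items():
    print(f"{name}: {value:.1%}")
```

        The three proportions are about 14%, 2% and 100% respectively. The groups are very small, so these figures describe the pattern rather than estimate risk precisely.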

        Discussion and Conclusions

        The above three examples have illustrated the importance of adopting a refutational approach and a database of occupational diseases. The former makes us always consider alternative hypotheses in the same manner as the initial intuitional hypothesis, while the latter provides a detailed list of chemical agents which can guide us toward the true aetiology. One possible limitation of this approach is that we can consider only those alternative explanations which we can imagine. If our list of alternatives is incomplete, we may be left with no answer or a wrong answer. Therefore, a comprehensive database of occupational disease is crucial to the success of this strategy.

        We used to construct our own database in a laborious manner. However, the recently published OSH-ROM databases, which contain the NIOSHTIC database of more than 160,000 abstracts, may be one of the most comprehensive for such a purpose, as discussed elsewhere in the Encyclopaedia. Furthermore, if a new occupational disease occurs, we might search such a database and rule out all known aetiological agents, and leave none unrefuted. In such a situation, we may try to identify or define the new agent (or occupational setting) as specifically as possible so that the problem can first be mitigated, and then test further hypotheses. The case of premalignant skin lesions among paraquat manufacturers is a good example of this kind.



        Role of Questionnaires in Epidemiological Research

        Epidemiological research is generally carried out in order to answer a specific research question which relates the exposures of individuals to hazardous substances or situations with subsequent health outcomes, such as cancer or death. At the heart of nearly every such investigation is a questionnaire which constitutes the basic data-gathering tool. Even when physical measurements are to be made in a workplace environment, and especially when biological materials such as serum are to be collected from exposed or unexposed study subjects, a questionnaire is essential in order to develop an adequate exposure picture by systematically collecting personal and other characteristics in an organized and uniform way.

        The questionnaire serves a number of critical research functions:

        • It provides data on individuals which may not be available from any other source, including workplace records or environmental measurements.
        • It permits targeted studies of specific workplace problems.
        • It provides baseline information against which future health effects can be assessed.
        • It provides information about participant characteristics that are necessary for proper analysis and interpretation of exposure-outcome relationships, especially possibly confounding variables like age and education, and other lifestyle variables that may affect disease risk, like smoking and diet.


        Place of questionnaire design within overall study goals

        While the questionnaire is often the most visible part of an epidemiological study, particularly to the workers or other study participants, it is only a tool and indeed is often called an “instrument” by researchers. Figure 1 depicts in a very general way the stages of survey design from conception through data collection and analysis. The figure shows four levels or tiers of study operation which proceed in parallel throughout the life of the study: sampling, questionnaire, operations, and analysis. The figure demonstrates quite clearly the way in which stages of questionnaire development are related to the overall study plan, proceeding from an initial outline to a first draft of both the questionnaire and its associated codes, followed by pretesting within a selected subpopulation, one or more revisions dictated by pretest experiences, and preparation of the final document for actual data collection in the field. What is most important is the context: each stage of questionnaire development is carried out in conjunction with a corresponding stage of creation and refinement of the overall sampling plan, as well as the operational design for administration of the questionnaire.

        Figure 1. The stages of a survey


        Types of studies and questionnaires

        The research goals of the study itself determine the structure, length and content of the questionnaire. These questionnaire attributes are invariably tempered by the method of data collection, which usually falls within one of three modes: in person, mail and telephone. Each of these has its advantages and disadvantages which can affect not only the quality of the data but the validity of the overall study.

        A mailed questionnaire is the least expensive format and can cover workers in a wide geographical area. However, overall response rates are often low (typically 45 to 75%); the questionnaire cannot be overly complex, since there is little or no opportunity for clarification of questions; and it may be difficult to ascertain whether potential responses to critical exposure or other questions differ systematically between respondents and non-respondents. The physical layout and language must accommodate the least educated of potential study participants, and the questionnaire must be capable of completion in a fairly short time period, typically 20 to 30 minutes.

        Telephone questionnaires can be used in population-based studies—that is, surveys in which a sample of a geographically defined population is canvassed—and are a practical method to update information in existing data files. They may be longer and more complex than mailed questionnaires in language and content, and since they are administered by trained interviewers the greater cost of a telephone survey can be partially offset by physically structuring the questionnaire for efficient administration (such as through skip patterns). Response rates are usually better than with mailed questionnaires, but are subject to biases related to increasing use of telephone answering machines, refusals, non-contacts and problems of populations with limited telephone service. Such biases generally relate to the sampling design itself and not especially to the questionnaire. Although telephone questionnaires have long been in use in North America, their feasibility in other parts of the world has yet to be established.

        Face-to-face interviews provide the greatest opportunity for collecting accurate complex data; they are also the most expensive to administer, since they require both training and travel for professional staff. The physical layout and order of questions may be arranged to optimize administration time. Studies which utilize in-person interviewing generally have the highest response rates and are subject to the least response bias. This is also the type of interview in which the interviewer is most likely to learn whether or not the participant is a case (in a case-control study) or the participant’s exposure status (in a cohort study). Care must therefore be taken to preserve the objectivity of the interviewer by training him or her to avoid leading questions and body language that might evoke biased responses.

        It is becoming more common to use a hybrid study design in which complex exposure situations are assessed in a personal or telephone interview which allows maximum probing and clarification, followed by a mailed questionnaire to capture lifestyle data like smoking and diet.

        Confidentiality and research participant issues

        Since the purpose of a questionnaire is to obtain data about individuals, questionnaire design must be guided by established standards for ethical treatment of human subjects. These guidelines apply to acquisition of questionnaire data just as they do for biological samples such as blood and urine, or to genetic testing. In the United States and many other countries, no studies involving humans may be conducted with public funds unless approval of questionnaire language and content is first obtained from an appropriate Institutional Review Board. Such approval is intended to assure that questions are confined to legitimate study purposes, and that they do not violate the rights of study participants to answer questions voluntarily. Participants must be assured that their participation in the study is entirely voluntary, and that refusal to answer questions or even to participate at all will not subject them to any penalties or alter their relationship with their employer or medical practitioner.

        Participants must also be assured that the information they provide will be held in strict confidence by the investigator, who must of course take steps to maintain the physical security and inviolability of the data. This often entails physical separation of information regarding the identity of participants from computerized data files. It is common practice to advise study participants that their replies to questionnaire items will be used only in aggregation with responses of other participants in statistical reports, and will not be disclosed to the employer, physician or other parties.
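
        One way to picture the separation of identity from data is the sketch below. It is not a prescribed procedure: the field names, the sample records and the function are all hypothetical, and a real study would add access controls and secure storage for the identity file.

```python
import secrets


def split_identifiers(records, identifying_fields):
    """Split each record into an identity entry and a de-identified entry.

    The two outputs are linked only by a randomly generated study ID.
    """
    identity_file = {}   # to be stored separately under restricted access
    analysis_file = []   # used for computerized analysis
    for record in records:
        study_id = secrets.token_hex(8)
        identity_file[study_id] = {k: record[k] for k in identifying_fields}
        answers = {k: v for k, v in record.items() if k not in identifying_fields}
        analysis_file.append({"study_id": study_id, **answers})
    return identity_file, analysis_file


# Hypothetical questionnaire records.
records = [
    {"name": "Worker A", "address": "1 Mill Road", "smoker": "yes", "years_exposed": 12},
    {"name": "Worker B", "address": "2 Mill Road", "smoker": "no", "years_exposed": 3},
]

identity_file, analysis_file = split_identifiers(records, ["name", "address"])
print(sorted(analysis_file[0]))  # ['smoker', 'study_id', 'years_exposed']
```

        Random study IDs are used instead of identifiers derived from names, so that the analysis file alone reveals nothing about who the participants are.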

        Measurement aspects of questionnaire design

        One of the most important functions of a questionnaire is to obtain data about some aspect or attribute of a person in either qualitative or quantitative form. Some items may be as simple as weight, height or age, while others may be considerably more complicated, as with an individual’s response to stress. Qualitative responses, such as gender, will ordinarily be converted into numerical variables. All such measures may be characterized by their validity and their reliability. Validity is the degree to which a questionnaire-derived number approaches its true, but possibly unknown, value. Reliability measures the likelihood that a given measurement will yield the same result on repetition, whether that result is close to the “truth” or not. Figure 2 shows how these concepts are related. It demonstrates that a measurement can be valid but not reliable, reliable but not valid, or both valid and reliable.

        Figure 2. Validity & reliability relationship
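
        The distinction can also be illustrated with a small simulation. The sketch below rests on simplifying assumptions that are not part of the original text: validity is represented by how close the average of repeated measurements lies to a hypothetical true value, and reliability by how little the repeated measurements scatter.

```python
import random
import statistics

TRUE_VALUE = 70.0  # hypothetical true value of the attribute being measured


def simulate(bias, noise_sd, n=1000, seed=0):
    """Simulate n repeated measurements with a fixed bias and random noise."""
    rng = random.Random(seed)
    return [rng.gauss(TRUE_VALUE + bias, noise_sd) for _ in range(n)]


def summarize(measurements):
    """Return (mean error, spread): small error ~ valid, small spread ~ reliable."""
    mean_error = statistics.mean(measurements) - TRUE_VALUE
    spread = statistics.stdev(measurements)
    return mean_error, spread


instruments = {
    "valid and reliable": simulate(bias=0.0, noise_sd=0.5),
    "reliable but not valid": simulate(bias=5.0, noise_sd=0.5),
    "valid but not reliable": simulate(bias=0.0, noise_sd=5.0),
}

summaries = {name: summarize(values) for name, values in instruments.items()}
for name, (mean_error, spread) in summaries.items():
    print(f"{name}: mean error {mean_error:+.2f}, spread {spread:.2f}")
```

        A biased instrument with little scatter repeats the same wrong answer, whereas an unbiased instrument with large scatter is right only on average.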


        Over the years, many questionnaires have been developed by researchers in order to answer research questions of wide interest. Examples include the Scholastic Aptitude Test, which measures a student’s potential for future academic achievement, and the Minnesota Multiphasic Personality Inventory (MMPI), which measures certain psychosocial characteristics. A variety of other psychological indicators are discussed in the chapter on psychometrics. There are also established physiological scales, such as the British Medical Research Council (BMRC) questionnaire for pulmonary function. These instruments have a number of important advantages. Chief among these are the facts that they have already been developed and tested, usually in many populations, and that their reliability and validity are known. Anyone constructing a questionnaire is well advised to utilize such scales if they fit the study purpose. Not only do they save the effort of “re-inventing the wheel”, but they make it more likely that study results will be accepted as valid by the research community. It also makes for more valid comparisons of results from different studies provided they have been properly used.

        The preceding scales are examples of two important types of measures which are commonly used in questionnaires to quantify concepts that may not be fully objectively measurable in the way that height and weight are, or which require many similar questions to fully “tap the domain” of one specific behavioural pattern. More generally, indexes and scales are two data reduction techniques that provide a numerical summary of groups of questions. The above examples illustrate physiological and psychological indexes, and they are also frequently used to measure knowledge, attitude and behaviour. Briefly, an index is usually constructed as a score obtained by counting, among a group of related questions, the number of items that apply to a study participant. For instance, if a questionnaire presents a list of diseases, a disease history index could be the total number of those which a respondent says he or she has had. A scale is a composite measure based on the intensity with which a participant answers one or more related questions. For example, the Likert scale, which is frequently used in social research, is typically constructed from statements with which one may agree strongly, agree weakly, offer no opinion, disagree weakly, or disagree strongly, the response being scored as a number from 1 to 5. Scales and indexes may be summed or otherwise combined to form a fairly complex picture of study participants’ physical, psychological, social or behavioural characteristics.
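
        The difference between an index and a scale can be sketched as follows. The disease list and the answers are invented for illustration, and the direction of the Likert scoring (1 for strong disagreement, 5 for strong agreement) is an assumption, since the text says only that responses are scored from 1 to 5.

```python
# Assumed scoring direction: 1 = disagree strongly ... 5 = agree strongly.
LIKERT_SCORES = {
    "disagree strongly": 1,
    "disagree weakly": 2,
    "no opinion": 3,
    "agree weakly": 4,
    "agree strongly": 5,
}


def disease_history_index(responses):
    """Index: count how many listed diseases the respondent reports having had."""
    return sum(1 for had_disease in responses.values() if had_disease)


def likert_scale_score(answers):
    """Scale: sum the intensity scores of answers to related statements."""
    return sum(LIKERT_SCORES[answer] for answer in answers)


# Hypothetical respondent.
history = {"asthma": True, "bronchitis": False, "dermatitis": True}
attitudes = ["agree strongly", "agree weakly", "no opinion"]

index_value = disease_history_index(history)
scale_value = likert_scale_score(attitudes)
print(index_value, scale_value)  # 2 12
```

        The index counts items that apply, while the scale weighs each answer by its intensity.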

        Validity merits special consideration because of its reflection of the “truth”. Three important types of validity often discussed are face, content and criterion validity. Face validity is a subjective quality of an indicator which ensures that the wording of a question is clear and unambiguous. Content validity ensures that the questions will serve to tap that dimension of response in which the researcher is interested. Criterion (or predictive) validity is derived from an objective assessment of how closely a questionnaire measurement approaches a separately measurable quantity, as for instance how well a questionnaire assessment of dietary vitamin A intake matches the actual consumption of vitamin A, based upon food consumption as documented with dietary records.

        Questionnaire content, quality and length

        Wording. The wording of questions is both an art and a professional skill. Therefore, only the most general of guidelines can be presented. It is generally agreed that questions should be devised which:

        1. motivate the participant to respond
        2. draw upon the participant’s personal knowledge
        3. take into account his or her limitations and personal frame of reference, so that the aim and meaning of the questions is easily understood and
        4. elicit a response based upon the participant’s own knowledge and do not require guessing, except possibly for attitude and opinion questions.


        Question sequence and structure. Both the order and presentation of questions can affect the quality of information gathered. A typical questionnaire, whether self-administered or read by an interviewer, contains a prologue which introduces the study and its topic to the respondent, provides any additional information he or she will need, and tries to motivate the respondent to answer the questions. Most questionnaires contain a section designed to collect demographic information, such as age, gender, ethnic background and other variables about the participant’s background, including possibly confounding variables. The main subject matter of data collection, such as nature of the workplace and exposure to specific substances, is usually a distinct questionnaire section, and is often preceded by an introductory prologue of its own which might first remind the participant of specific aspects of the job or workplace in order to create a context for detailed questions. The layout of questions that are intended to establish worklife chronologies should be arranged so as to minimize the risk of chronological omissions. Finally, it is customary to thank the respondent for his or her participation.

        Types of questions. The designer must decide whether to use open-ended questions in which participants compose their own answers, or closed questions that require a definite response or a choice from a short menu of possible responses. Closed questions have the advantage that they clarify alternatives for the respondent, avoid snap responses, and minimize lengthy rambling that may be impossible to interpret. However, they require that the designer anticipate the range of potential responses in order to avoid losing information, particularly for unexpected situations that occur in many workplaces. This in turn requires well planned pilot testing. The investigator must decide whether and to what extent to permit a “don’t know” response category.

        Length. Determining the final length of a questionnaire requires striking a balance between the desire to obtain as much detailed information as possible to achieve the study goals and the fact that if a questionnaire is too lengthy, at some point many respondents will lose interest and either stop responding or respond hastily, inaccurately and without thought in order to bring the session to an end. On the other hand, a questionnaire which is very short may obtain a high response rate but not achieve the study goals. Since respondent motivation often depends on having a personal stake in the outcome, such as improving working conditions, tolerance for a lengthy questionnaire may vary widely, especially when some participants (such as workers in a particular plant) may perceive their stake to be higher than others (such as persons contacted via random telephone dialling). This balance can be achieved only through pilot testing and experience. Interviewer-administered questionnaires should record the beginning and ending time to permit calculation of the duration of the interview. This information is useful in assessing the level of quality of the data.

        Language. It is essential to use the language of the population to make the questions understood by all. This may require becoming familiar with local vernacular that may vary within any one country. Even in countries where the same language is nominally spoken, such as Britain and the United States, or the Spanish-speaking countries of Latin America, local idioms and usage may vary in a way that can obscure interpretation. For example, in the US “tea” is merely a beverage, whereas in Britain it may mean “a pot of tea,” “high tea,” or “the main evening meal,” depending on locale and context. It is especially important to avoid scientific jargon, except where study participants can be expected to possess specific technical knowledge.

        Clarity and leading questions. While it is often the case that shorter questions are clearer, there are exceptions, especially where a complex subject needs to be introduced. Nevertheless, short questions clarify thinking and reduce unnecessary words. They also reduce the chance of overloading the respondent with too much information to digest. If the purpose of the study is to obtain objective information about the participant’s working situation, it is important to word questions in a neutral way and to avoid “leading” questions that may favour a particular answer, such as “Do you agree that your workplace conditions are harmful to your health?”

        Questionnaire layout. The physical layout of a questionnaire can affect the cost and efficiency of a study. It is more important for self-administered questionnaires than those which are conducted by interviewers. A questionnaire which is designed to be completed by the respondent but which is overly complex or difficult to read may be filled out casually or even discarded. Even questionnaires which are designed to be read aloud by trained interviewers need to be printed in clear, readable type, and patterns of question skipping must be indicated in a manner which maintains a steady flow of questioning and minimizes page turning and searching for the next applicable question.

        Validity Concerns


        The enemy of objective data gathering is bias, which results from systematic but unplanned differences between groups of people: cases and controls in a case-control study or exposed and non-exposed in a cohort study. Information bias may be introduced when two groups of participants understand or respond differently to the same question. This may occur, for instance, if questions are posed in such a way as to require special technical knowledge of a workplace or its exposures that would be understood by exposed workers but not necessarily by the general public from which controls are drawn.

        The use of surrogates for ill or deceased workers has the potential for bias because next-of-kin are likely to recall information in different ways and with less accuracy than the worker himself or herself. The introduction of such bias is especially likely in studies in which some interviews are carried out directly with study participants while other interviews are carried out with relatives or co-workers of other research participants. In either situation, care must be taken to reduce any effect that might arise from the interviewer’s knowledge of the disease or exposure status of the worker of interest. Since it is not always possible to keep interviewers “blind,” it is important to emphasize objectivity and avoidance of leading or suggestive questions or unconscious body language during training, and to monitor performance while the study is being carried out.

        Recall bias results when cases and controls “remember” exposures or work situations differently. Hospitalized cases with a potential occupationally related illness may be more capable of recalling details of their medical history or occupational exposures than persons contacted randomly on the telephone. A type of this bias that is becoming more common has been labelled social desirability bias. It describes the tendency of many people to understate, whether consciously or not, their indulgence in “bad habits” such as cigarette smoking or consumption of foods high in fat and cholesterol, and to overstate “good habits” like exercise.

        Response bias denotes a situation in which one group of study participants, such as workers with a particular occupational exposure, may be more likely to complete questionnaires or otherwise participate in a study than unexposed persons. Such a situation may result in a biased estimation of the association between exposure and disease. Response bias may be suspected if response rates or the time taken to complete a questionnaire or interview differ substantially between groups (e.g., cases vs. controls, exposed vs. unexposed). Response bias generally differs depending upon the mode of questionnaire administration. Questionnaires which are mailed are usually more likely to be returned by individuals who see a personal stake in study findings, and are more likely to be ignored or discarded by persons selected at random from the general population. Many investigators who utilize mail surveys also build in a follow-up mechanism which may include second and third mailings as well as subsequent telephone contacts with non-respondents in order to maximize response rates.

        Studies which utilize telephone surveys, including those which make use of random digit dialling to identify controls, usually have a set of rules or a protocol defining how many times attempts to contact potential respondents must be made, including time of day, and whether evening or weekend calls should be attempted. Those who conduct hospital-based studies usually record the number of patients who refuse to participate, and reasons for non-participation. In all such cases, various measures of response rates are recorded in order to provide an assessment of the extent to which the target population has actually been reached.
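
        As a minimal sketch of how such response rates might be tabulated by group, the following example counts contacted and responding individuals. The function name, the group labels and the contact records are hypothetical and serve only as illustration; they are not data from any study.

```python
from collections import defaultdict

def response_rates(contacts):
    """Compute the response rate for each group.

    contacts: iterable of (group, responded) pairs, one per person approached.
    Returns a dict mapping group -> fraction of approached persons who responded.
    """
    approached = defaultdict(int)
    responded_count = defaultdict(int)
    for group, responded in contacts:
        approached[group] += 1
        if responded:
            responded_count[group] += 1
    return {g: responded_count[g] / approached[g] for g in approached}

# Hypothetical contact log (illustrative values only)
contact_log = [
    ("cases", True), ("cases", True), ("cases", True), ("cases", False),
    ("controls", True), ("controls", False), ("controls", True), ("controls", False),
]
rates = response_rates(contact_log)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} responded")
```

        A substantial gap between the groups' rates, as in this made-up log, is the kind of signal that would prompt a closer look for response bias.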

        Selection bias results when one group of participants preferentially responds or otherwise participates in a study, and can result in biased estimation of the relationship between exposure and disease. In order to assess selection bias and whether it leads to under- or over-estimation of exposure, demographic information such as educational level can be used to compare respondents with non-respondents. For example, if participants with little education have lower response rates than participants with higher education, and if a particular occupation or smoking habit is known to be more frequent in less educated groups, then selection bias with underestimation of exposure for that occupation or smoking category is likely to have occurred.

        Confounding is an important type of selection bias which results when the selection of respondents (cases and controls in a case-control study, or exposed and unexposed in a cohort study) depends in some way upon a third variable, sometimes in a manner unknown to the investigator. If not identified and controlled, it can lead unpredictably to underestimates or overestimates of disease risks associated with occupational exposures. Confounding is usually dealt with either by manipulating the design of the study itself (e.g., through matching cases to controls on age and other variables) or at the analysis stage. Details of these techniques are presented in other articles within this chapter.


        In any research study, all study procedures must be thoroughly documented so that all staff, including interviewers, supervisory personnel and researchers, are clear about their respective duties. In most questionnaire-based studies, a coding manual is prepared which describes on a question-by-question basis everything the interviewer needs to know beyond the literal wording of the questions. This includes instructions for coding categorical responses and may contain explicit instructions on probing, listing those questions for which it is permitted and those for which it is not. In many studies new, unforeseen response choices for certain questions are occasionally encountered in the field; these must be recorded in the master codebook and copies of additions, changes or new instructions distributed to all interviewers in a timely fashion.

        Planning, testing and revision

        As can be seen from figure 1, questionnaire development requires a great deal of thoughtful planning. Every questionnaire needs to be tested at several stages in order to make certain that the questions “work”, i.e., that they are understandable and produce responses of the intended quality. It is useful to test new questions on volunteers and then to interrogate them at length to determine how well specific questions were understood and what types of problems or ambiguities were encountered. The results can then be utilized to revise the questionnaire, and the procedure can be repeated if necessary. The volunteers are sometimes referred to as a “focus group”.

        All epidemiological studies require pilot testing, not only for the questionnaires, but for the study procedures as well. A well designed questionnaire serves its purpose only if it can be delivered efficiently to the study participants, and this can be determined only by testing procedures in the field and making adjustments when necessary.

        Interviewer training and supervision

        In studies which are conducted by telephone or face-to-face interview, the interviewer plays a critical role. This person is responsible not simply for presenting questions to the study participants and recording their responses, but also for interpreting those responses. Even with the most rigidly structured interview study, respondents occasionally request clarification of questions, or offer responses which do not fit the available response categories. In such cases the interviewer’s job is to interpret either the question or the response in a manner consistent with the intent of the researcher. To do so effectively and consistently requires training and supervision by an experienced researcher or manager. When more than one interviewer is employed on a study, interviewer training is especially important to ensure that questions are presented and responses interpreted in a uniform manner. In many research projects this is accomplished in group training settings, and is repeated periodically (e.g., annually) in order to keep the interviewers’ skills fresh. Training seminars commonly cover the following topics in considerable detail:

        • general introduction to the study
        • informed consent and confidentiality issues
        • how to introduce the interview and how to interact with respondents
        • the intended meaning of each question
        • instructions for probing, i.e., offering the respondent further opportunity to clarify or embellish responses
        • discussion of typical problems which arise during interviews.


        Study supervision often entails onsite observation, which may include tape-recording of interviews for subsequent dissection. It is common practice for the supervisor to personally review every questionnaire prior to approving and submitting it to data entry. The supervisor also sets and enforces performance standards for interviewers and in some studies conducts independent re-interviews with selected participants as a reliability check.

        Data collection

        The actual distribution of questionnaires to study participants and subsequent collection for analysis is carried out using one of the three modes described above: by mail, telephone or in person. Some researchers organize and even perform this function themselves within their own institutions. While there is considerable merit to a senior investigator becoming familiar with the dynamics of the interview at first hand, it is most cost effective and conducive to maintaining high data quality for trained and well-supervised professional interviewers to be included as part of the research team.

        Some researchers make contractual arrangements with companies that specialize in survey research. Contractors can provide a range of services which may include one or more of the following tasks: distributing and collecting questionnaires, carrying out telephone or face-to-face interviews, obtaining biological specimens such as blood or urine, data management, and statistical analysis and report writing. Irrespective of the level of support, contractors are usually responsible for providing information about response rates and data quality. Nevertheless, it is the researcher who bears final responsibility for the scientific integrity of the study.

        Reliability and re-interviews

        Data quality may be assessed by re-interviewing a sample of the original study participants. This provides a means for determining the reliability of the initial interviews, and an estimate of the repeatability of responses. The entire questionnaire need not be re-administered; a subset of questions usually is sufficient. Statistical tests are available for assessing the reliability of a set of questions asked of the same participant at different times, as well as for assessing the reliability of responses provided by different participants and even for those queried by different interviewers (i.e., inter- and intra-rater assessments).
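
        One commonly used statistic for the agreement between an interview and a re-interview on a categorical question is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The text above does not name a particular test, so the choice of kappa here is an assumption, and the answers in the example are hypothetical.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen's kappa for two sets of categorical answers to the same question.

    first, second: equal-length lists of answers from the same participants,
    e.g. original interview and re-interview.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    n = len(first)
    # Proportion of participants who gave the same answer both times
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal answer frequencies
    freq1, freq2 = Counter(first), Counter(second)
    expected = sum(freq1[c] * freq2[c] for c in freq1) / (n * n)
    if expected == 1.0:
        return 1.0  # only one category was ever used, in both rounds
    return (observed - expected) / (1 - expected)

# Hypothetical answers to "Have you ever worked with asbestos?"
interview   = ["yes", "yes", "no", "no", "yes", "no", "no",  "yes", "no", "no"]
reinterview = ["yes", "no",  "no", "no", "yes", "no", "yes", "yes", "no", "no"]
kappa = cohen_kappa(interview, reinterview)
print(f"kappa = {kappa:.2f}")
```

        A kappa of 1 indicates perfect agreement and a kappa near 0 indicates agreement no better than chance. The same function could be applied to answers recorded by two different interviewers.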

        Technology of questionnaire processing

        Advances in computer technology have created many different ways in which questionnaire data can be captured and made available to the researcher for computer analysis. There are three fundamentally different ways in which data can be computerized: in real time (i.e., as the participant responds during an interview), by traditional key entry methods, and by optical data capture methods.

        Computer-aided data capture

        Many researchers now use computers to collect responses to questions posed in both face-to-face and telephone interviews. Researchers in the field find it convenient to use laptop computers which have been programmed to display the questions sequentially and which permit the interviewer to enter the response immediately. Survey research companies which do telephone interviewing have developed analogous systems called computer-aided telephone interview (CATI) systems. These methods have two important advantages over more traditional paper questionnaires. First, responses can be instantly checked against a range of permissible answers and for consistency with previous responses, and discrepancies can be immediately brought to the attention of both the interviewer and the respondent. This greatly reduces the error rate. Secondly, skip patterns can be programmed to minimize administration time.
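
        The range checks, consistency checks and skip patterns described above can be sketched as follows. This is a simplified illustration and not the design of any actual CATI product; the question identifiers, permitted ranges and rules are all hypothetical.

```python
# Hypothetical question definitions. "ask_if" encodes a skip pattern and
# "consistent" encodes a check against earlier responses.
ORDER = ["age", "ever_smoked", "age_started", "years_in_plant"]
QUESTIONS = {
    "age": {"range": (16, 99)},
    "ever_smoked": {"allowed": {"yes", "no"}},
    "age_started": {
        "range": (5, 99),
        "ask_if": lambda r: r.get("ever_smoked") == "yes",
        "consistent": lambda v, r: v <= r["age"],
    },
    "years_in_plant": {
        "range": (0, 80),
        "consistent": lambda v, r: v <= r["age"],
    },
}

def next_question(order, questions, responses):
    """Return the id of the next applicable unanswered question, or None."""
    for qid in order:
        if qid in responses:
            continue
        ask_if = questions[qid].get("ask_if")
        if ask_if is None or ask_if(responses):
            return qid
    return None

def check_response(qid, value, questions, responses):
    """Return a list of problems with a proposed answer (empty if accepted)."""
    spec = questions[qid]
    problems = []
    if "allowed" in spec and value not in spec["allowed"]:
        problems.append(f"{qid}: {value!r} is not a permitted answer")
    if "range" in spec:
        low, high = spec["range"]
        if not (low <= value <= high):
            problems.append(f"{qid}: {value} is outside {low}-{high}")
    consistent = spec.get("consistent")
    if consistent is not None and not problems and not consistent(value, responses):
        problems.append(f"{qid}: {value} conflicts with an earlier answer")
    return problems

# A non-smoker aged 45: the smoking follow-up question is skipped.
answers = {"age": 45, "ever_smoked": "no"}
print(next_question(ORDER, QUESTIONS, answers))
print(check_response("years_in_plant", 50, QUESTIONS, answers))
```

        Because each check runs at the moment of entry, the interviewer can resolve a discrepancy with the respondent on the spot rather than during later data cleaning.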

        The most common method for computerizing data still is the traditional key entry by a trained operator. For very large studies, questionnaires are usually sent to a professional contract company which specializes in data capture. These firms often utilize specialized equipment which permits one operator to key a questionnaire (a procedure sometimes called keypunch for historical reasons) and a second operator to re-key the same data, a process called key verification. Results of the second keying are compared with the first to ensure the data have been entered correctly. Quality assurance procedures can be programmed which ensure that each response falls within an allowable range, and that it is consistent with other responses. The resulting data files can be transmitted to the researcher on disk, tape or electronically by telephone or other computer network.
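
        Key verification amounts to comparing two independent keyings field by field and listing every disagreement for resolution against the paper questionnaire. The sketch below assumes, purely for illustration, that each keying is held as a dictionary of records; the record identifiers and field names are hypothetical.

```python
def verify_keying(first_pass, second_pass):
    """Compare two independent keyings of the same questionnaires.

    Each pass maps a record id to a dict of {field: keyed value}.
    Returns a list of (record_id, field, first_value, second_value) tuples,
    one for every field on which the two operators disagree. A record or
    field missing from one pass is reported with None on that side.
    """
    discrepancies = []
    for rec_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(rec_id, {})
        b = second_pass.get(rec_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((rec_id, field, a.get(field), b.get(field)))
    return discrepancies

# Hypothetical keyed data from two operators
operator_1 = {
    "Q0001": {"birth_year": 1931, "job_code": "417"},
    "Q0002": {"birth_year": 1928, "job_code": "203"},
}
operator_2 = {
    "Q0001": {"birth_year": 1931, "job_code": "471"},  # transposed digits
    "Q0002": {"birth_year": 1928, "job_code": "203"},
}
mismatches = verify_keying(operator_1, operator_2)
for rec_id, field, v1, v2 in mismatches:
    print(f"{rec_id} {field}: first keying {v1!r}, second keying {v2!r}")
```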

        For smaller studies, there are numerous commercial PC-based programs which have data entry features which emulate those of more specialized systems. These include database programs such as dBase, Foxpro and Microsoft Access, as well as spreadsheets such as Microsoft Excel and Lotus 1-2-3. In addition, data entry features are included with many computer program packages whose principal purpose is statistical data analysis, such as SPSS, BMDP and EPI INFO.

        One widespread method of data capture which works well for certain specialized questionnaires uses optical systems. Optical mark reading or optical sensing is used to read responses on questionnaires that are specially designed for participants to enter data by marking small rectangles or circles (sometimes called “bubble codes”). These work most efficiently when each individual completes his or her own questionnaire. More sophisticated and expensive equipment can read hand-printed characters, but at present this is not an efficient technique for capturing data in large-scale studies.

        Archiving Questionnaires and Coding Manuals

        Because information is a valuable resource and is subject to interpretation and other influences, researchers sometimes are asked to share their data with other researchers. The request to share data can be motivated by a variety of reasons, which may range from a sincere interest in replicating a report to concern that data may not have been analysed or interpreted correctly.

        Where falsification or fabrication of data is suspected or alleged, it becomes essential that the original records upon which reported findings are based be available for audit purposes. In addition to the original questionnaires and/or computer files of raw data, the researcher must be able to provide for review the coding manual(s) developed for the study and the log(s) of all data changes which were made in the course of data coding, computerization and analysis. For example, if a data value had been altered because it had initially appeared as an outlier, then a record of the change and the reasons for making the change should have been recorded in the log for possible data audit purposes. Such information also is of value at the time of report preparation because it serves as a reminder about how the data which gave rise to the reported findings had actually been handled.
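
        A log of data changes can be as simple as a list of entries recording what was changed, from what, to what, by whom, when and why. The following is a minimal sketch under that assumption; the record layout, field names and the example correction are hypothetical.

```python
import datetime

def apply_change(records, log, rec_id, field, new_value, reason, changed_by):
    """Change one data value and append an entry to the audit log.

    The old value is preserved in the log so that the original record
    can be reconstructed during a later audit.
    """
    old_value = records[rec_id][field]
    log.append({
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "record": rec_id,
        "field": field,
        "old_value": old_value,
        "new_value": new_value,
        "reason": reason,
        "changed_by": changed_by,
    })
    records[rec_id][field] = new_value

# Hypothetical record in which an outlying value is corrected
records = {"Q0107": {"years_exposed": 210}}
change_log = []
apply_change(
    records, change_log, "Q0107", "years_exposed", 21,
    reason="outlier; paper questionnaire shows 21",
    changed_by="data manager",
)
print(change_log[0]["old_value"], "->", records["Q0107"]["years_exposed"])
```

        Writing the log entry before altering the value means that an interrupted change leaves a trace rather than a silent modification.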

        For these reasons, upon completion of a study, the researcher has an obligation to ensure that all basic data are appropriately archived for a reasonable period of time, and that they could be retrieved if the researcher were called upon to provide them.



        Monday, 07 March 2011 18:13

        Asbestos: Historical Perspective

        Several examples of workplace hazards often are quoted to exemplify not only the possible adverse health effects associated with workplace exposures, but also to reveal how a systematic approach to the study of worker populations can uncover important exposure-disease relationships. One such example is that of asbestos. The simple elegance with which the late Dr. Irving J. Selikoff demonstrated the elevated cancer risk among asbestos workers has been documented in an article by Lawrence Garfinkel. It is reprinted here with only slight modification and with the permission of CA-A Cancer Journal for Clinicians (Garfinkel 1984). The tables came from the original article by Dr. Selikoff and co-workers (1964).

        Asbestos exposure has become a public health problem of considerable magnitude, with ramifications that extend beyond the immediate field of health professionals to areas served by legislators, judges, lawyers, educators, and other concerned community leaders. As a result, asbestos-related diseases are of increasing concern to clinicians and health authorities, as well as to consumers and the public at large.

        Historical Background

        Asbestos is a highly useful mineral that has been utilized in diverse ways for many centuries. Archaeological studies in Finland have shown evidence of asbestos fibres incorporated in pottery as far back as 2500 BC. In the 5th century BC, it was used as a wick for lamps. Herodotus commented on the use of asbestos cloth for cremation about 456 BC. Asbestos was used in body armour in the 15th century, and in the manufacture of textiles, gloves, socks and handbags in Russia c. 1720. Although it is uncertain when the art of weaving asbestos was developed, we know that the ancients often wove asbestos with linen. Commercial asbestos production began in Italy about 1850, in the making of paper and cloth.

        The development of asbestos mining in Canada and South Africa about 1880 reduced costs and spurred the manufacture of asbestos products. Mining and production of asbestos in the United States, Italy and Russia followed soon after. In the United States, the development of asbestos as pipe insulation increased production and was followed shortly thereafter by other varied uses including brake linings, cement pipes, protective clothing and so forth.

        Production in the US increased from about 6,000 tons in 1900 to 650,000 tons in 1975, although by 1982, it was about 300,000 tons and by 1994, production had dropped to 33,000 tons.

        It is reported that Pliny the Younger (61-113 AD) commented on the sickness of slaves who worked with asbestos. Reference to occupational disease associated with mining appeared in the 16th century, but it was not until 1906 in England that the first reference to pulmonary fibrosis in an asbestos worker appeared. Excess deaths in workers involved with asbestos manufacturing applications were reported shortly thereafter in France and Italy, but major recognition of asbestos-induced disease began in England in 1924. By 1930, Wood and Gloyne had reported on 37 cases of pulmonary fibrosis.

        The first reference to carcinoma of the lung in a patient with “asbestos-silicosis” appeared in 1935. Several other case reports followed. Reports of high percentages of lung cancer in patients who died of asbestosis appeared in 1947, 1949 and 1951. In 1955 Richard Doll in England reported an excess risk of lung cancer in persons who had worked in an asbestos plant since 1935, with an especially high risk in those who were employed more than 20 years.

        Clinical Observations

        It was against this background that Dr. Irving Selikoff’s clinical observations of asbestos-related disease began. Dr. Selikoff was at that time already a distinguished scientist. His prior accomplishments included the development and first use of isoniazid in the treatment of tuberculosis, for which he received a Lasker Award in 1952.

        In the early 1960s, as a chest physician practising in Paterson, New Jersey, he had observed many cases of lung cancer among workers in an asbestos factory in the area. He decided to extend his observations to include two locals of the asbestos insulator workers union, whose members also had been exposed to asbestos fibres. He recognized that there were still many people who did not believe that lung cancer was related to asbestos exposure and that only a thorough study of a total exposed population could convince them. There was the possibility that asbestos exposure in the population could be related to other types of cancer, such as pleural and peritoneal mesothelioma, as had been suggested in some studies, and perhaps other sites as well. Most of the studies of the health effects of asbestos in the past had been concerned with workers exposed in the mining and production of asbestos. It was important to know if asbestos inhalation also affected other asbestos-exposed groups.

        Dr. Selikoff had heard of the accomplishments of Dr. E. Cuyler Hammond, then Director of the Statistical Research Section of the American Cancer Society (ACS), and decided to ask him to collaborate in the design and analysis of a study. It was Dr. Hammond who had written the landmark prospective study on smoking and health published a few years earlier.

        Dr. Hammond immediately saw the potential importance of a study of asbestos workers. Although he was busily engaged in analysing data from the then new ACS prospective study, Cancer Prevention Study I (CPS I), which he had begun a few years earlier, he readily agreed to a collaboration in his “spare time”. He suggested confining the analysis to those workers with at least 20 years’ work experience, who thus would have had the greatest amount of asbestos exposure.

        The team was joined by Mrs. Janet Kaffenburgh, a research associate of Dr. Selikoff’s at Mount Sinai Hospital, who worked with Dr. Hammond in preparing the lists of the men in the study, including their ages and dates of employment and obtaining the data on facts of death and causes from union headquarters records. This information was subsequently transferred to file cards that were sorted literally on the living room floor of Dr. Hammond’s house by Dr. Hammond and Mrs. Kaffenburgh.

        Dr. Jacob Churg, a pathologist at Barnert Memorial Hospital Center in Paterson, New Jersey, provided pathologic verification of the cause of death.

        Table 1. Man-years of experience of 632 asbestos workers exposed to asbestos dust 20 years or longer

        [Table body, tabulated by time period, is not reproduced here.]

        The resulting study was of the type classified as a “prospective study retrospectively carried out”. The nature of the union records made it possible to accomplish an analysis of a long-range study in a relatively short period of time. Although only 632 men were involved in the study, there were 8,737 man-years of exposure to risk (see table 1); 255 deaths occurred during the 20-year period of observation from 1943 through 1962 (see table 2). In table 2 the observed number of deaths can be seen invariably to exceed the number expected, demonstrating the association between workplace asbestos exposure and an elevated cancer death rate.
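
        A comparison of observed with expected deaths of this kind is conventionally computed by multiplying the man-years accumulated in each stratum (for example, each age group and time period) by the death rate of a reference population, summing the products to give the expected number, and dividing the observed number by it. The sketch below illustrates that arithmetic only. The strata, man-years, rates and death count are entirely hypothetical and are not the figures from the Selikoff study.

```python
def expected_deaths(person_years, reference_rates):
    """Sum over strata of person-years multiplied by the reference death rate."""
    return sum(person_years[s] * reference_rates[s] for s in person_years)

def observed_to_expected(observed, expected):
    """Ratio of observed to expected deaths (often called the SMR)."""
    return observed / expected

# HYPOTHETICAL values chosen for easy arithmetic, not study data
person_years = {"40-49": 2000.0, "50-59": 1500.0, "60-69": 500.0}
reference_rates = {"40-49": 0.005, "50-59": 0.012, "60-69": 0.030}  # deaths per person-year
observed = 86

expected = expected_deaths(person_years, reference_rates)
ratio = observed_to_expected(observed, expected)
print(f"expected = {expected:.1f}, observed/expected = {ratio:.2f}")
```

        A ratio above 1 means that more deaths occurred in the study group than would be expected if it had experienced the reference population's death rates.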

        Table 2. Observed and expected number of deaths among 632 asbestos workers exposed to asbestos dust 20 years or longer

        Causes of death tabulated by time period. For each cause, the table gives the observed number of deaths (asbestos workers) and the expected number (US White males); the numerical entries are not reproduced here.

        • Total, all causes
        • Total cancer, all sites
        • Cancer of lung and pleura
        • Cancer of stomach, colon and rectum
        • Cancer of all other sites combined


        Significance of the Work

        This paper constituted a turning point in our knowledge of asbestos-related disease and set the direction of future research. The article has been cited in scientific publications at least 261 times since it was originally published. With financial support from the ACS and the National Institutes of Health, Dr. Selikoff and Dr. Hammond and their growing team of mineralogists, chest physicians, radiologists, pathologists, hygienists and epidemiologists continued to explore various facets of asbestos disease.

        A major paper in 1968 reported the synergistic effect of cigarette smoking and asbestos exposure (Selikoff, Hammond and Churg 1968). The studies were expanded to include asbestos production workers, persons indirectly exposed to asbestos in their work (shipyard workers, for example) and those with family exposure to asbestos.

        In a later analysis, in which the team was joined by Herbert Seidman, MBA, Assistant Vice President for Epidemiology and Statistics of the American Cancer Society, the group demonstrated that even short-term exposure to asbestos resulted in a significantly increased risk of cancer up to 30 years later (Seidman, Selikoff and Hammond 1979). There were only three cases of mesothelioma in this first study of 632 insulators, but later investigations showed that 8% of all deaths among asbestos workers were due to pleural and peritoneal mesothelioma.

        As Dr. Selikoff’s scientific investigations expanded, he and his co-workers made noteworthy contributions toward reducing exposure to asbestos through innovations in industrial hygiene techniques; by persuading legislators about the urgency of the asbestos problem; in evaluating the problems of disability payments in connection with asbestos disease; and in investigating the general distribution of asbestos particles in water supplies and in the ambient air.

        Dr. Selikoff also called the medical and scientific community’s attention to the asbestos problem by organizing conferences on the subject and participating in many scientific meetings. Many of his orientation meetings on the problem of asbestos disease were structured particularly for lawyers, judges, presidents of large corporations and insurance executives.




        " DISCLAIMER: The ILO does not take responsibility for content presented on this web portal that is presented in any language other than English, which is the language used for the initial production and peer-review of original content. Certain statistics have not been updated since the production of the 4th edition of the Encyclopaedia (1998)."