The failure of vivisection

Part 1. 'Animal Experimentation: A Failed Technology' by Dr Robert Sharpe.
Part 2. 'A Critical Evaluation of Animal Experimentation' by Dr Robert Sharpe.


Part 1. 'Animal Experimentation: A Failed Technology' by Robert Sharpe

'The idea, as I understand it, is that fundamental truths are revealed in laboratory experiments on lower animals and are then applied to the problems of a sick patient. Having myself been trained as a physiologist I feel in a way competent to assess such a claim. It is plain nonsense'.
Sir George Pickering, Professor of Medicine, University of Oxford. In Pickering, G. (1964). Physician and scientist, Br. Med. J., 2, 1615-19.

Origins of animal testing
    In 1882, during a speech to the Birmingham Philosophical Society, the great 19th-century surgeon, Lawson Tait, argued that animal experimentation should be stopped 'so the energy and skill of scientific investigators should be directed into better and safer channels'. Tait undoubtedly made a major contribution to the advance of abdominal surgery, but stated without hesitation that he had been led astray again and again by the published results of experiments on animals, so that eventually he had to discard them entirely (Tait 1882a).
    Tait believed that vivisection is an error, not only because it can produce misleading results, but also because there is the constant danger that attention will be diverted from more reliable sources of information, such as clinical studies. Indeed in Science, History and Medicine, Cawadias (1953) argues that, whenever medicine has strayed from clinical observation, the result has been chaos, stagnation and disaster. The view needs to be taken seriously because the assumption that people and animals are alike in the way their bodies work diverted attention from the study of humans, which ultimately held back medical progress for hundreds of years.
    The Greek physician who was destined to dominate medicine for centuries was born in the year AD 131. Galen has been described as the founder of experimental physiology and based his anatomy almost entirely on the study of apes and pigs. Galen unhesitatingly transferred his discoveries to human beings, thus perpetuating many errors (Guthrie 1945).
    Unfortunately, Galen's dogmatic style, together with the church's reluctance to allow dissection of human cadavers, meant that his errors went uncorrected for literally hundreds of years (Fraser Harris 1936). Galen's mistakes passed into current teaching and became authoritative statements of the universities until as late as the 16th century. Only with the publication of Vesalius's Structure of the Human Body in 1543, based on actual human dissections, did the long period of Galenic darkness begin to clear. Galen was held to blame not just for his faulty results but for using the wrong method (Tomkin 1973).
    Yet during the 19th century, thanks essentially just to a handful of men, vivisection was transformed from an occasional, often criticised, method into the scientific fashion we know today. In 1865 the French physiologist, Claude Bernard, published his Introduction to the Study of Experimental Medicine, a work specially intended to give physicians rules and principles to guide their study of experimental medicine. Bernard regarded the laboratory as the 'true sanctuary of medical science' and considered it far more important than the clinical investigation of patients.
    Bernard popularised the artificial production of disease by chemical and physical means, so leading the way for today's reliance on 'animal models' in medical research. Furthermore, this influential figure created the impression that animal experiments are directly applicable to humans (Bernard 1865): 'Experiments on animals, with deleterious substances or in harmful circumstances, are very useful and entirely conclusive for the toxicity and hygiene of man. Investigations of medicinal or of toxic substances also are wholly applicable to man from the therapeutic point of view...'
    Bernard's Introduction was to prove the charter for 20th century medicine but he was not alone in establishing the vivisection method. Another key figure was Louis Pasteur, whose apparently successful attempts to develop a vaccine against rabies had further glamorised the role of laboratory research and animal experiments. Pasteur's vaccine was made from the deliberately infected brain tissue of living animals but we know that it simply does not work when injected after a rabid bite (Hattwick and Gregg 1985).
    Clinical experience has shown that few people bitten by rabid animals actually develop the disease (DHSS 1977), so those who miraculously 'recovered' after a course of inoculations may not have been infected in the first place. Nevertheless Pasteur's apparent success meant that his methods were to prove highly influential, with living animals and their tissues subsequently used to produce vaccines and sera against a wide variety of conditions. This has proved a dangerous approach as contaminants from animal tissues have often produced fatal results in humans (Hayflick 1970).
    Furthermore the oncogenic viruses such as SV40, which contaminate tissues from primates, only become carcinogenic when they cross the species barrier (Hayflick 1970), so the use of human cells to prepare human viral vaccines is potentially the safest approach.
    The current reliance on animal models of human disease was further popularised by the German doctor, Robert Koch, who was Pasteur's rival in developing the germ theory of disease. Koch had produced a set of rules for establishing proof that a particular germ caused the disease under investigation and one of these 'postulates' stated that, when inoculated into laboratory animals, the microbe should reproduce the same condition (Walker 1954, Lancet 1909).
    The idea was soon discredited by Koch (1884) himself during a study of cholera, when it proved impossible to reproduce the disease in animals. Ultimately Koch was forced to rely on clinical observation of patients and microscopic analysis of samples from actual cases of human cholera. As a result he was successful in isolating the microbe and discovering its mode of transmission, so that preventive action could be taken (Koch 1884). Human disease can take an entirely different form in animals so, as the Lancet subsequently concluded, 'We cannot rely on Koch's postulates as a decisive test of a causal organism' (Lancet 1909).
    But occasionally Koch's animal tests seemed to work. When injected with tuberculosis (TB) bacillus isolated from dying patients, many species - mice, guinea-pigs and monkeys but not frogs or turtles - also succumbed (Reidman 1974). Unfortunately TB takes a different form in animals (Lancet 1946) and human trials with Koch's highly acclaimed 'cure' - tuberculin - ended in disaster (Dowling 1977, Lehrer 1979, Westacott 1949). Nevertheless the die was cast: in the 20th century animals would be widely used as living test tubes to screen new antibacterial drugs.

Animal disease models
    Despite all of Galen's errors, and the realisation that mice are not 'miniature men', the growing influence of laboratory scientists like Bernard, Koch and Pasteur turned animal experiments into an everyday practice: medical research came to rely on artificially induced animal models of human disease. And since the direct study of human patients requires so much more skill and patience so that unnecessary risks to volunteers are avoided, it was perhaps not surprising that researchers preferred the greater convenience offered by a 'disposable species'.
    But with animals now being used not only to assess drugs but also to develop surgical techniques and to acquire physiological knowledge, what are the implications for patients? In view of the complex and often subtle nature of human disease, it is not surprising that, for the great majority of disease entities, the animal models are considered either very poor or non-existent (Dollery 1981). Take arteriosclerosis as an example. This is a condition which results in some of the Western world's major killers, including heart attacks and strokes, so it needs to be investigated if its aetiology is to be understood. Animal models of arteriosclerosis have included birds, dogs, rats, pigs, rabbits and monkeys, but species differences have been seen in each case.
    The most widely used species is the rabbit, and when administered an unnatural, high-cholesterol diet, their arteries quickly become blocked, but the lesions are quite different to those found in people, in both their content and distribution (Gross 1985). While it is rare for lesions in rabbits to develop fibrosis, haemorrhage, ulceration and/or thrombosis, all of these are characteristic of lesions in human patients (Gross 1985). In animal models of stroke, it has been argued that no laboratory animal has an entirely comparable cerebrovascular supply to that of humans, and most, if not all, have a considerably greater cerebral circulatory reserve, all of which makes them far less prone to stroke (Whisnat 1958).
    Animals have been used in dental research to investigate the aetiology of periodontal disease, but differences between rodents and humans could confuse clinical and epidemiological findings. To begin with, there is considerable variation in periodontal breakdown in different strains of inbred mice (Baer and Lieberman 1959). Furthermore, extrapolation from rodent periodontium to that of humans could be invalid, as both the development and the potential for cementoblast differentiation in the rodent are different from those in humans (Manson and White 1983).
    Huge resources have been expended on animal-based cancer research yet artificially induced cancers in animals have often proved quite different to the spontaneous tumours which arise in patients. Indeed the Lancet (1972) warned that, since no animal tumour is closely related to a cancer in human beings, an agent which is active in the laboratory may well prove useless clinically. This was certainly the case with the US National Cancer Institute's 25-year screening programme in which 40,000 plant species were tested for antitumour activity. As a result of the programme several materials proved sufficiently safe and effective on the basis of animal tests to be considered for clinical trials.
    Unfortunately all of these were either ineffective in treating human cancer or too toxic for general use (Farnsworth and Pezzuto 1984). Thus in 25 years of this extensive programme not a single antitumour agent safe and effective enough for use in patients has yet emerged, despite promising results in animal experiments. Indeed one former cancer researcher has argued that clues to practically all the chemotherapeutic agents which are of value in the treatment of human cancer were found in a clinical context rather than in animal studies (Bross 1987). Like a number of other centres, the National Cancer Institute is now using a battery of human cancer cells as a more relevant means of screening new drugs (Scrip 1987). Animal cancer tests have also proved confusing in developing immunotherapy (Williams 1982). Although the techniques worked with experimental tumours in laboratory animals, and thereby raised great hopes, their clinical application proved disappointing.
    Once again this has been attributed to differences between the species: experimental tumours, in contrast with most human tumours, grow rapidly and are biologically different from spontaneous tumours. In addition, spontaneous cancers are less susceptible to attack by the body's defences than artificially induced cancers (Williams 1982). Even in primates, presumably the animals closest to us in evolutionary terms, a disease can take quite a different form. The use of monkeys to investigate cerebral malaria led to the suggestion that coma in human patients is due to an increased concentration of protein in the cerebrospinal fluid, and that this leakage from the serum could be corrected with steroids (Lancet 1987). However, monkeys do not lapse into coma, nor do they have sequestered red cells infected with parasites, as typically seen in the human disease. In fact, steroids do not help patients and subsequent clinical investigation of the human condition showed that the monkey model may simply not be relevant (Lancet 1987).
    The generally poor quality of animal models has been advanced as a strong argument for testing new drugs in volunteers and patients as early as possible to reduce the possibility of misleading predictions (Dollery 1981). Indeed it has been stated that most pharmacologists are happy if 30% of the usual action of drugs, as determined by experiments on animals, is reproduced in humans (Saunder 1981). So it is not surprising that many of the therapeutic actions of drugs are discovered through their clinical evaluation on patients (Breckenridge 1981) or by astute analysis of accidental or deliberate poisoning (Dayan 1981), rather than by experiments on animals.
    This is particularly the case with many psychotropic medicines because adequate animal models of serious mental illnesses such as schizophrenia, mania, dementia and personality and behaviour disorders simply do not exist. Screening tests for the drugs generally used to treat these conditions - the major tranquillisers - are based on animal models of specific side-effects of existing drugs, on the assumption that many of the beneficial actions and undesirable effects of these products may be related to a similar aspect of brain chemistry, i.e. their effect on dopaminergic neurons (Worms and Lloyd 1979). Most of the rapid screening tests for potential major tranquillising drugs therefore depend on the side-effects of these drugs which are related to central dopamine-receptor blockade, for instance, catalepsy after apomorphine effects and anti-emetic effects (Worms and Lloyd 1979).
    For example, rats are monitored for the onset of catalepsy after administration of the test drug while dogs are observed for inhibition of chemically induced emesis. Unfortunately the nature of the tests almost inevitably means that, when successful, they lead to drugs with serious built-in side-effects such as catalepsy, Parkinsonism and tardive dyskinesia, which are also linked with dopamine-receptor blockade (Worms and Lloyd 1979).
    Drug-induced Parkinsonism and tardive dyskinesia have now become major problems following the treatment of serious mental illness (Melville and Johnson 1982, Stephen and Williamson 1984). Many antidepressants were also first identified by their effects in patients. After the clinical discovery of imipramine's antidepressant properties (Sitaram and Gershon 1983), scientists accidentally found that the drug reversed chemically induced hypothermia in mice (Leonard 1984). Subsequently reserpine-induced hypothermia was used as an animal model of depression, yet several antidepressants discovered by clinical investigation were found to 'fail' the hypothermia test (Leonard 1984).

    Clinical work has also proved the cornerstone of advances in surgery. With the rapid developments in surgical techniques following the discovery of anaesthetics in the 19th century, a number of surgeons argued strongly that advances must come from clinical practice rather than animal experiments (Beddow Bayly 1962). Tait (1882b) believed that vivisection had done far more harm than good in surgery, while the Royal Surgeon, Sir Frederick Treves (1898), issued a salutary warning about experiments that he had carried out on dogs: 'Such are the differences between the human and the canine bowel, that when I came to operate on man I found I was much hampered by my new experience, that I had everything to unlearn, and that my experiments had done little but unfit me to deal with the human intestine'.
    The same principles apply today when transplants and other surgical feats are being attempted. The crucial point is the underlying biological differences which make the animal experiments hazardous. It is therefore revealing that, despite thousands of experiments on animals, the first human transplants were almost always disastrous. Only after considerable clinical experience did techniques improve. At California's Stanford University, 400 heart transplants were carried out on dogs, yet the first human patients both died because of complications which had not arisen during the preliminary experiments (Iben 1968). By 1980, 65% of heart transplant recipients at Stanford were still alive after one year, with the improvement attributed almost entirely to increased skill with anti-rejection drugs, and in the careful choice of patients for surgery (Lancet 1980).
    In the case of combined heart and lung transplants, the early experience was once again disastrous, with none of the first three patients surviving beyond 23 days (Jamieson, et al., 1983). In 1986 Stanford reported 28 heart-lung transplants carried out between March 1981 and August 1985. Eight patients died during or immediately after the operation. In another 10 a respiratory disorder called obliterative bronchiolitis (OB) developed after surgery, from which four more patients died and three were left functionally limited by breathlessness. The surgeons noted that (Burne, et al., 1986): extensive experience with animal models in this and other institutions had not indicated a serious hazard from airway disease, so the emergence of post-transplant OB as the most important complication was unexpected.

Safety evaluation
    But the danger of relying on animal experiments is most vividly illustrated by the growing list of animal-tested drugs which are withdrawn or restricted because of unexpected, often fatal, side-effects in people. Examples include Eraldin, Opren, chloramphenicol, clioquinol, Flosint, Ibufenac and Zelmid (Sharpe 1988). Apart from drug withdrawal or restriction, unexpected reactions may lead to warnings from the Committee on Safety of Medicines or the medical press (Sharpe 1988). In the case of ICI's heart drug, Eraldin, there were serious eye problems, including blindness, and there were 23 deaths.
    Ultimately ICI compensated more than 1000 victims (Office of Health Economics 1980). Yet animal experiments had given no warning of the dangers (Inman 1977) and even after the drug was withdrawn in 1976 the harmful effects could not be reproduced in laboratory animals (Weatherall 1982). The antibiotic chloramphenicol, passed safe after animal experiments, was later discovered to cause aplastic anaemia, which often proved fatal (Venning 1983).
    The British Medical Journal (1952) reports how the drug was thoroughly tested on animals, producing nothing worse than transient anaemia in dogs given the drug for long periods by injection, and nothing at all when given orally. Scientists have recently suggested the use of human bone-marrow cells as a more reliable means of detecting such toxic effects prior to clinical trials (Gyte and Williams 1985).
    In 1982 the non-steroidal anti-inflammatory drug, Opren, was withdrawn in Britain after 3500 reports of side-effects including 61 deaths, mainly through liver damage (British Medical Journal, 1982). Prolonged tests in rhesus monkeys, in which the animals received up to seven times the maximum tolerated human dose for a year, revealed no evidence of toxicity (Dista Products Ltd 1980). Furthermore, animal tests cited in the company's literature make no mention of the photosensitive skin reactions which proved such a problem for patients (Dista Products Ltd 1980).
    During the 1960s at least 3500 young asthmatics died in Britain following the use of isoprenaline aerosol inhalers (Inman 1980). Isoprenaline is a powerful asthma drug and deaths were reported in countries using a particularly concentrated form of aerosol which delivered 0.4 mg of drug per spray (Stolley 1972). Animal tests had shown that large doses increased the heart rate but not sufficiently to kill the animals. In fact cats could tolerate 175 times the dose found dangerous to asthmatics (Collins, et al., 1969).
    Even after the event it proved difficult to reproduce the drug's harmful effects in animals (Carson, et al., 1971). Japan suffered a major epidemic of drug-induced disease in the case of clioquinol, the main ingredient of Ciba Geigy's antidiarrhoea drugs, Enterovioform and Maxaform (Lancet, 1977a). At least 10,000 people were victims of a new disease called SMON (subacute myelo-optic neuropathy), yet animal experiments carried out by the company revealed 'no evidence that clioquinol is neurotoxic' (Hess, et al., 1972).
    Reliance on animal tests can therefore be dangerously misleading. In fact, what protection there is comes mainly from clinical trials, where 95% of the drugs passed as safe and effective on the basis of animal tests are rejected (Medical World News, 1965). Nevertheless the problem appears less serious than it is, because side-effects are grossly under-reported (Lesser 1980): only about a dozen of the 3500 deaths linked with isoprenaline aerosol were reported by doctors at the time, while only 11% of fatal reactions associated with the anti-inflammatory drugs phenylbutazone and oxyphenbutazone were reported as such (Inman 1980).
    Most adverse reactions which can occur in patients cannot be demonstrated, anticipated or avoided by the routine subacute and chronic toxicity experiment (Zbinden 1966). This is partly because animals do not have the potential to predict some of the most common or life-threatening effects (Welch 1967). For instance animals cannot tell us if they are suffering from nausea, dizziness, amnesia, headache, depression or other psychological disturbances. Allergic reactions, skin lesions, some blood disorders and many central nervous system effects are some of the more serious problems which once again cannot generally be demonstrated in animals. But even when such effects are excluded, toxicity tests can still prove misleading.
    In 1962, the side-effects of six different drugs, reported during clinical practice, were compared with those originally seen in toxicity tests with rats and dogs (Litchfield 1962). The comparisons were restricted to those effects which animals have the potential to predict. Even so, of the 78 adverse reactions seen in patients, the majority (42) were not predicted in animal tests. In most cases, then, predictions based on animal experiments proved incorrect.
    Another comparison, this time based on 45 drugs, revealed that at best only one out of every four side-effects predicted by animal experiments actually occurred in patients (Fletcher 1978). Even then it is not possible to tell which predictions are accurate until human trials are commenced. Furthermore the report confirmed that many common side-effects cannot be predicted by animal tests at all: examples include nausea, headache, sweating, cramps, dry mouth, dizziness, and in some cases skin lesions and reduced blood pressure. But this study has an additional implication.
    With most of the adverse reactions predicted by animal experiments not occurring in people, there is also the danger of unnecessarily rejecting potentially valuable medicines. A classic example is penicillin which, as Florey (1953) admitted, would in all probability have been discarded had it been tested on guinea-pigs, to whom it is highly toxic (Koppanyi and Avery 1966). But the good fortune did not end there. In order to save a seriously ill patient, Fleming wanted to inject penicillin into the spine but the possible results were unknown. Florey tried the experiment with a cat but there wasn't time to wait for the results if Fleming's patient was to have a chance. Fleming's patient received his injection and improved, but Florey's cat died (BBC 1981)!
    Another case is digitalis. Although discovered without animal experiments, its more widespread use was delayed because tests on animals incorrectly predicted a dangerous rise in blood pressure (Beddow Bayly 1962). One of the most common animals used in toxicity tests is the rat, yet comparisons with humans reveal major differences in skin characteristics, respiratory parameters, the location of gut flora, β-glucuronidase activity, plasma protein binding, biliary excretion, metabolism, allergic hypersensitivity and teratogenicity (Calabrese 1984).
    Differences in respiratory parameters are particularly important in inhalation studies, where rats are used extensively. As 'high-risk' animal models used for respiratory and cardiovascular problems, rats are considered inappropriate for asthma, bronchitis and arteriosclerosis, but the species of choice for hypertension (Calabrese 1984). In fact the species most routinely used for toxicological studies are chosen not on consideration of their phylogenetic relationship to humans but on practical grounds of cost, breeding rate, litter size, ease of handling, resistance to intercurrent infections and laboratory tradition (Davies 1977).
    One of the most important factors resulting in differences between the species is the speed and pattern of metabolism. Indeed reports show that variations in drug biotransformation are the rule rather than the exception (Levine 1978, Smith and Caldwell 1977, Zbinden 1963). Toxic drug effects which are not predicted by animal tests may be seen in people when their metabolism is slower, resulting in longer exposure. But differences in the rate of biotransformation are only one aspect of the metabolic comparison. Of even greater importance is the route of metabolism.
    Species variability here can result in poisonous effects which it would be impossible to predict by animal tests. A comparative study of 23 chemicals showed that in only four cases did rats and humans metabolise drugs in the same way (Smith and Caldwell 1977). One example is amphetamine, which is metabolised by the same route in humans, dogs and mice (although faster in the mouse) but by a different pathway in the rat and by still another route in the guinea-pig (Levine 1978).
    These difficulties once again stress the need to assess new drugs in volunteers as early as possible. Bernard Brodie of Bethesda's National Heart Institute has stated (Brodie 1962): 'These problems highlight the importance for drug development of testing a drug in man as soon as possible to see whether its rate of metabolism makes it clinically practical. The practice of studying the physiological disposition of a drug in man only after it is clearly the drug of choice in animals not only may prove shortsighted and time consuming, but also may result in relegating the best drug for man to the shelf for evermore'.

Specific tests
    Differences in metabolism are expected to have their greatest impact in subacute and chronic toxicity tests where animals are dosed every day for weeks, months and sometimes even years. Nevertheless the more specialised areas such as skin and eye irritancy, carcinogenicity and teratogenicity, not to mention the notorious LD50 test for systemic toxicity, have all presented problems when reliance has been placed on animal experiments. In the case of the Draize eye irritancy test, the animal most commonly used is the rabbit because it is cheap, easy to handle and has a large eye for assessing results (Ballantyne and Swanston 1977).
    But there are major differences which make the rabbit eye a bad model for the human eye (Ballantyne and Swanston 1977, Buehler and Newman 1964, Coulston and Serrone 1969). Unlike humans, the rabbit has a nictitating membrane and also produces tears less effectively. The acidity and buffering capacity of the aqueous humour in the eyes of human beings and rabbits are different and so are the thickness, tissue structure and biochemistry of the cornea. In humans the thickness of the cornea is 0.51 mm but it is only 0.37 mm in rabbits.
    Inevitably there have been conflicting results and scientists warn that extreme caution is required in extrapolating the results from animals to the likely condition in people (Ballantyne and Swanston 1977). When the sensory irritants o-chlorobenzylidene malononitrile (CS) and dibenzoxazepine (CR) were tested in the eyes of human beings and rabbits, large differences were found, with humans being 90 times more sensitive to CR and 18 times more sensitive to CS than rabbits (Swanston 1983).
    On the other hand a liquid anionic surfactant formulation, with a long history as a basic ingredient of light-duty dishwashing products, caused severe eye irritation in rabbits, but extensive human experience of accidental exposure has shown it to be completely non-hazardous (Buehler and Newman 1964). The Draize test has also proved misleading in devising therapy: clinical experience in treating human eye burns caused by alkali led to the preferred treatment of thorough rinsing followed by complete denudation of the cornea. In rabbits the same technique was unsuccessful, with denudation actually retarding recovery threefold (Buehler and Newman 1964).
    The rabbit is also the animal most frequently used to test for skin irritancy. Criticism of its predictive reliability led Britain's Huntingdon Research Centre to carry out comparative trials with six species - mice, guinea-pigs, minipigs, piglets, dogs and baboons (Davies et al. 1972). The researchers found 'considerable variability in irritancy response between the different species'. The most pronounced variability occurred with the more irritant materials such as an antidandruff cream shampoo where the irritancy ranged from severe in rabbits to almost non-existent in the baboon. Human volunteers suffered mild irritation.
    Furthermore the Huntingdon study revealed considerable differences between minipigs and piglets, which in turn differed from human responses. This is noteworthy because there appears to be a widely held belief that these species are good models for skin problems in people. Some chemical irritants produce pain without causing structural damage and can be assessed by a variety of methods including the human blister-base technique. The procedure is reported to cause little pain and produces reproducible results, with volunteers able to distinguish the intensity of discomfort (Foster and Weston 1986).
    Using the technique a comparison of relative potencies of three chemical irritants - o-chlorobenzylidene malononitrile (CS), n-nonanoylvanillylamine (VAN) and dibenzoxazepine (CR) - produced results which conflicted with those found from animal tests (Foster and Weston 1986). According to the blister base technique, CR is more potent than CS, which is confirmed by other human test systems, yet this is the reverse of that found from experiments with rodents. Furthermore the study found that VAN is less potent than CR, which is once again the reverse of that found from animal experiments. The authors concluded that 'data derived from humans thus appears to be of importance when assessing irritant potency'.
    Experience with the Draize skin and eye irritancy tests has shown that results for the same chemical can vary widely from laboratory to laboratory, and indeed within the same laboratory, because of the subjective nature of assessing the results. What is classed as a severe eye irritant by one observer may be dismissed as a mild irritant by another. This was the outcome of a study in which 25 laboratories tested 12 chemicals of known irritancy. The authors (Weil and Scala 1971) found 'extreme variation' between laboratories and concluded: 'Thus, the tests which have been used for 20 years to decide the degree of eye or skin irritation produce quite variable results among the various laboratories. To use these tests, or minor variations of them, to obtain consistency in classifying a material as an eye or skin irritant or non-irritant, therefore is not deemed practical. It is suggested that the rabbit eye and skin procedures currently recommended by the Federal agencies for use in delineation of irritancy of materials should not be recommended as standard procedures in any new regulations. Without careful re-education these tests result in unreliable results'.
    In the case of the LD50 poisoning test (LD standing for lethal dose; 50 signifying the single dose necessary to kill 50% of the animals), results can vary widely between species, making reliable predictions of the human lethal dose impossible. A comparison of the lethal dose of various chemicals in animals with those found from accidental or deliberate exposure in humans showed frequent extreme variations (Zbinden and Flury-Roversi 1981).
    When originally introduced in 1927, the LD50 test was designed to measure the strength of drugs like digitalis, a purpose for which it has since become obsolete; but since it is easy to perform, scientists started using it as an index of toxicity for a wide variety of substances including pesticides, cosmetics, drugs, household products and industrial chemicals. And naturally the idea of a single numerical index of toxicity appealed to government bureaucrats so that the LD50 became enshrined in official government requirements for a wide range of chemical substances. According to one of Britain's largest contract houses, the Huntingdon Research Centre, approximately 90% of the LD50 tests which they perform are purely to obtain a value for various legislative needs (Heywood 1977).
    It also became clear that LD50 tests could not be used to predict the results of overdose; only careful analysis of patients suffering accidental or deliberate poisoning could do that. In an account of how the National Poison Centre at New Cross Hospital in London collates information and prepares advice on the prevention and management of drug overdosage, the Director, Dr. G. N. Volans (1986), demonstrates that 'acute toxicity data from animal tests contributes very little of value to this work'.
    To illustrate the failings of the LD50, Volans uses examples from two classes of drugs - the non-steroidal anti-inflammatory drugs (NSAIDs) and the antidepressants. For instance, according to LD50 values the safest NSAID listed appears to be aspirin, followed by ibuprofen. Yet clinical experience, Volans notes, does not accord with this since aspirin can cause death in humans at doses which are not difficult to take. On the other hand, although of supposedly greater toxicity, the largest doses of ibuprofen recorded in over 14 years of clinical use failed to produce serious toxicity, even at plasma concentrations over 20 times peak therapeutic levels.
    Several of the NSAIDs listed have roughly similar degrees of toxicity, yet once again this is not the case in humans because, although most appear relatively safe in overdose, it is well known that phenylbutazone can produce severe toxicity and death. Choice of a drug on the basis of its LD50 in animals could therefore be dangerously misleading. Ultimately, preliminary tests, whatever their nature, cannot prevent accidental poisoning. On the other hand the introduction of child-resistant containers in 1976 resulted in a dramatic fall in hospital admissions after accidental poisoning with analgesics (Jackson et al. 1983).
    Nor can LD50 tests be used to select dose levels of drugs suitable either for repeated administration to volunteers, or in the more prolonged animal tests. This is because the toxic effects of repeated dosing cannot usually be predicted from a test like the LD50 which uses a single dose. The LD50 of dexamethasone in rats is 120 mg kg-1, but on repeated administration, rats and dogs could not tolerate daily doses above 0.07 mg kg-1 (Zbinden and Flury-Roversi 1981). LD50 test results are also affected by genetic strain, sex, age, body weight, diseases, parasites present, quality of feed, degree of starvation, number of animals per cage, cage size, season and climatic conditions such as temperature, humidity and air pressure (Schutz 1986). Furthermore results can be influenced by the formulation and availability of test substances, quality of vehicle, administered volume, rate of application and handling during application.
    The LD50 can hardly be considered a biological constant and results for the same chemical can vary from laboratory to laboratory by as much as 8-14 times, using the same species and the same method of dosing (Bass et al. 1982). In recent years the LD50 test has come under increasing scrutiny with a number of eminent toxicologists condemning it. Zbinden and Flury-Roversi (1981) published a major review of the LD50 and concluded that: 'For the recognition of the symptomatology of acute poisoning in man, and for the determination of the human lethal dose, the LD50 is of very little value'.
    In the case of carcinogenicity tests the usual problems of species variation are aggravated by the high cost and long duration (over three years) of the procedure. As a result they are unsuitable for the task of assessing the safety of more than 40,000 largely untested chemicals currently in use in our environment (Davis and Magee 1979). This must be one of the strongest arguments in favour of the quicker, cheaper and more easily reproducible in vitro systems which have emerged in recent years.
    At a conference on Public Health Control of Environmental Health Hazards held at the New York Academy of Sciences in 1978, Peter Hutt argued that, for these very reasons, reliance on animal testing for future regulatory decision-making is misplaced. Hutt (1978) was referring specifically to food additives and food safety, and observed that 'even if all animal testing facilities available in the country were deployed solely in testing the potential carcinogenicity of all food substances, it is unlikely that the project would be completed in our lifetime'. Hutt argued that the 'single most important priority for food safety policy in the future is the development, refinement, validation, and acceptance of a battery of new in vitro short-term carcinogenicity predictive tests, on the basis of which sound regulatory decisions can be made'.
    A recent report compared the results of carcinogenicity tests in the most commonly used species, rats and mice, and found that 46% of the substances carcinogenic in one species were safe for the other (Di Carlo 1984). Differences in response between males and females were also found. Of 33 chemicals found carcinogenic to both rats and mice, only 13 caused cancer in both male and female animals of each species. The author concluded (Di Carlo 1984): 'It is painfully clear that carcinogenesis in the mouse cannot now be predicted from positive data obtained from the rat and vice versa'.
    Another study investigated whether rodent carcinogenicity tests successfully predicted the 26 substances presently thought to cause cancer in humans. An analysis of the scientific literature revealed that only 12 of these (46%) have been shown to cause cancer in rats or mice (Salsburg 1983). It was concluded that: 'the lifetime feeding study in mice and rats appears to have less than a 50% probability of finding known human carcinogens. On the basis of the probability theory, we would have been better off to toss a coin'. The implications for humans are obvious.
    As a result of the thalidomide disaster, which left 10,000 children crippled and deformed, teratogenicity tests became a legal requirement for new medicines. While it is true that thalidomide had not been tested specifically for birth defects prior to marketing, a close analysis of the tragedy suggests that animal testing could actually have delayed warnings of thalidomide's effect on the foetus. By June 1961, Dr W.G. McBride, an obstetrician practising in Sydney, had seen three babies with unusual malformations and had strongly suspected thalidomide. To test his suspicions, McBride commenced experiments with guinea-pigs and mice but when no deformities were found, he began to have doubts that were to nag him for months (Sunday Times Insight Team 1979).
    Then, late in September, came further malformed babies and McBride became certain that thalidomide was responsible. He wrote to the Lancet and the Medical Journal of Australia and published his findings (McBride 1961). Further experiments revealed that even if thalidomide had been tested in pregnant rats, the animals so often used to look for foetal damage, no malformations would have been found. The drug does not cause birth defects in rats (Koppanyi and Avery 1966) or in many other species (Lewis 1983), so the human tragedy would have occurred just the same.
    According to the Catalogue of Teratogenic Agents (Shepard 1976): 'several...principles were forcefully illustrated by observations made of the outbreak. The first point was that there existed extreme variability in species susceptibility to thalidomide'. The Catalogue reports that by 1966 there were 14 separate publications describing the effects of thalidomide on pregnant mice yet nearly all reported negative findings or else a few defects which did not resemble the characteristic effects of the drug. Only in certain strains of rabbit and primate can thalidomide's effect on the human foetus be reproduced.
    Teratogenicity tests have particular problems which make the results even more difficult to extrapolate to humans than other animal tests. In addition to the usual variation in metabolism, excretion, distribution and absorption which can exist between species, there are also differences in placental structure, function and biochemistry (Panigel 1983). Foetal and placental metabolism, and the handling of foreign compounds, are different in different species, and the use of several species does not necessarily overcome the problem. The difficulties are highlighted by aspirin, a proven teratogen in rats, mice, guinea-pigs, cats, dogs and monkeys, yet despite many years of extensive use by pregnant women, it has not been linked to any kind of characteristic malformation (Mann 1984).
    Furthermore, if important drugs such as penicillin and streptomycin were discovered today, would we reject them because of their known ability to cause birth defects in laboratory animals (Smithells 1980)? In fact many drugs are marketed despite causing malformations in laboratory animals. British biochemist Dennis Parke (1983) gives one example: 'corticosteroids are known to be teratogenic in rodents, the significance of which to man has never been fully understood, but nevertheless is assumed to be negligible. However, the practice of evaluating corticosteroid drugs in rodents still continues, and drugs which exhibit high levels of teratogenesis in rodents at doses similar to the human therapeutic dose are marketed, apparently as safe, with the manufacturer required only to state that the drug produces birth defects in experimental animals, the significance of which to man is unknown'.
    It is examples like this which suggest that much animal testing is more in the nature of a public relations exercise than a serious contribution to drug safety (Smithells 1980). Peter Lewis (1983), a Consultant Physician at Hammersmith Hospital in London, agrees that teratogenicity tests are 'virtually useless scientifically' but do provide 'some defence against public allegations of neglect of adequate drug testing'.
    In other words 'something' is being done, although it is not 'the right thing'. That 'something' is identified by D. F. Hawkins (1983), Professor of Obstetric Therapeutics at the Institute of Obstetrics and Gynaecology, and Consultant Obstetrician and Gynaecologist at Hammersmith Hospital: 'The great majority of perinatal toxicological studies seem to be intended to convey medico-legal protection to the pharmaceutical houses and political protection to the official regulatory bodies, rather than produce information that might be of value in human therapeutics'.
    Animal tests can also misleadingly imply advantages of a new product over existing competitors. The anti-arthritis drug, Opren, was promoted as having the potential to modify the disease process (BBC 1983), an enormous commercial advantage over existing arthritis drugs which could only alleviate symptoms at best. But Opren's apparent advantage was based on studies with artificially induced arthritis in rats (BBC 1983), which is not a good model for the human condition (Rheumatology in Practice 1986).
    Ultimately the effect could not be reproduced in humans. On the basis of animal tests another NSAID, Surgam, was promoted as giving 'gastric protection', a considerable advantage over similar drugs where gastrointestinal side-effects are a major problem. Unfortunately for the manufacturers, clinical trials showed that Surgam did indeed damage the stomach and Roussel Laboratories were found guilty of misleading advertising and fined 20,000 pounds. A report of the case in the Lancet stated that expert witnesses for both sides 'agreed that animal data could not safely be extrapolated to man' (Collier and Herxheimer 1987).

The effects of animal tests
    Although the motivation for much testing is questionable on scientific grounds, the obsession with animal experiments has had the effect of delaying the development and introduction of safer tests and monitoring systems. Yet the thalidomide tragedy should have alerted governments to the need for superior methods of safety evaluation. For instance, knowing the extreme differences in species susceptibility to the drug, the British Government could have taken a number of steps to prevent further disasters.
    It could have introduced legislation, as in Norway, limiting new drugs to those fulfilling a real medical need, thereby minimising potential hazards; it could have introduced a really effective system of post-marketing surveillance (PMS) for the early detection of side-effects; and it could have urgently promoted research into more reliable test procedures, for instance based on human cells and tissues. But the opportunities were largely neglected. Instead the medicine companies were allowed to flood the market with 'me-too' drugs, so that by 1981 the Department of Health and Social Security (DHSS) had to conclude that new chemical entities marketed in Britain during the previous ten years 'have largely been introduced into therapeutic areas already heavily oversubscribed' (Griffin and Diggle 1981).
    Other surveys estimated that around 70% of new chemical entities add little or nothing to those already available (Scrip 1985, Steward 1978). Little attention was given to effective PMS schemes, with the current yellow card system detecting only 1-10% of adverse reactions (Lesser 1980). Paediatrician Robert Brent (1972) has argued that efficient clinical surveillance schemes would have uncovered the link between thalidomide and limb malformations after only a handful of cases, thus preventing a major catastrophe.
    Another case where effective PMS would have prevented a major drug disaster is Eraldin. It took over four years to detect the drug's capacity to damage the eyes (Inman 1980). Post-marketing surveillance has recently received more serious attention with Inman's monitoring scheme, although this is still based on a voluntary reporting system. Until recently little attempt had been made to validate human cell tests, yet researchers have argued that despite limitations, they can give a degree of reassurance not provided by in vivo animal experiments or procedures based on animal tissue (Gyte and Williams 1985).
    As long ago as 1971 scientists reported that thalidomide's potential to damage the foetus could be detected in human tissue tests (Lash and Saxen 1971). Because so little attention has been given to effective PMS and to more reliable test systems based on human tissues, reliance is placed almost exclusively on the original animal experiments and on clinical trials.
    But clinical trials involve only relatively small numbers of people so, in the absence of efficient PMS, doctors are, in effect, relying to a considerable extent on the preliminary animal experiments as a warning against adverse reactions. In fact the indications are that iatrogenic disease is reaching what DHSS Principal Medical Officer, Ronald Mann (1984), describes as 'epidemic proportions'. Government figures reveal that in 1977 over 120,000 people were discharged from or died in UK hospitals after suffering the side-effects of medicinal products (Mann 1984). Adverse reactions are considered to be an increasingly important problem with perhaps 1 in 20 general hospital beds (1 in 7 in the USA) occupied by patients suffering from their treatment (D'Arcy and Griffin 1979). In general practice as many as 40% of patients may experience side-effects (Mann 1984).
    One estimate puts the annual number of drug-induced deaths in Britain at 10,000-15,000 (Melville and Johnson 1982), which is 4-5 times the official estimate. Nevertheless a recent investigation of just one category of drugs suggests that the higher value may well be nearer the truth. The study focused on NSAIDs and estimated that they may be associated with over 4000 deaths every year in the UK from gastrointestinal complications (Cockel 1987).
    Powerful chemical drugs will always carry a risk of toxic effects and the increased level of iatrogenic disease [...] issued. Nevertheless reliance on tests which give so little relevant information about effects in humans, and which so often prove misleading, must add to the burden of iatrogenic disease. Certainly animal tests are failing to curb the current epidemic of drug-induced disease.

The effects on medical research
    The growing reliance on animal models has had a fundamental effect on medical research too, for it has often diverted attention from methods which directly apply to humans such as clinical studies and epidemiology. One example is the study of diabetes. In 1788 Thomas Cawley discovered the link between diabetes and a damaged pancreas when he examined a patient who had died from the disease (Jackson and Vinik 1977). This was subsequently confirmed by further autopsies but the idea was not accepted for many years, partly because physiologists failed to induce a diabetic state in animals by artificially damaging the pancreas (Levine 1977). Eventually in 1889, Mering and Minkowski 'confirmed' the clinical findings when they produced diabetes in a dog by surgically removing the entire pancreas (Volk and Wellman 1977).
    The link between smoking and lung cancer was first discovered through epidemiology and potentially represents one of the most important contributions to health policy in recent years. Yet attempts to duplicate the effect by forcing laboratory animals to inhale smoke met with little success (Lancet 1977b). Nevertheless negative findings in animals must have been welcome news for those who wished to deny the association.
    In fact the traditional emphasis on animal-based cancer research seems to have diverted attention from a true understanding of the underlying causes, so that little attention has been given to preventive measures. Before World War I, epidemiology and clinical observation had identified several causes of cancer. It was found, for instance, that pipe smokers had an increased risk of lip cancer while radiologists often contracted skin cancer. Then in 1918 Yamagiwa and Ichikawa produced cancer on a rabbit's ear by painting it with tar and the apparent potential for laboratory experiments captured the imagination of the scientific world (Doll 1980).
    As Sir Richard Doll (1980) explains, observational data were commonly dismissed and carried little weight in comparison with those obtained by experiment, as it was confidently believed that the mechanisms by which all cancers were caused would be discovered within a few years.
    The resulting over-emphasis on trying to cure the disease once it had arisen was to prove a grave mistake. A recent analysis of cancer trends in the United States indicates a substantial increase in the overall death rate since 1950, despite progress against some rare forms of the disease, amounting to 1-2% of total cancer deaths. 'The main conclusion we draw', states the report, 'is that some 35 years of intense efforts focused largely on improving treatment must be judged a qualified failure'. The report concludes that 'we are losing the war against cancer' and argues for a shift in emphasis towards prevention if there is to be substantial progress (Bailar and Smith 1986).
    With the recent revival of epidemiology we know much more about the major risk factors, so that 80-90% of cancers are considered potentially preventable (Doll 1977, Muir and Parkin 1985). And interestingly the United States Office of Technology Assessment report on the causes of cancer relied far more on epidemiology than on animal experiments or other laboratory studies because, its authors argued, these 'cannot provide reliable risk assessment' (Roe 1981).
    The ultimate test of the success of animal experiments in medical research is whether they lead to real improvements in health which cannot be achieved by other means. There can be no doubt that life expectancy has improved considerably since the mid-19th century but this has been attributed chiefly to improvements in public health, with medical measures playing only a relatively small part (McKeown 1979, McKinlay and McKinlay 1977). By 1950 the fall in death rate had started to level out (OPCS 1978), but it was around this time that animal experiments started to increase dramatically, so has this resulted in corresponding improvements in health?
    In fact hospital admissions are increasing (DHSS 1976, Melville and Johnson 1982, Annual Abstract of Statistics 1987) as is the level of chronic sickness in all age groups (Social Trends 1975, 1985); more working days are being lost (Wells 1987) and the number of prescriptions issued per person has risen from an average of 4.7 in 1961 to 7.0 in 1985 (Social Trends 1987). The picture for major serious diseases is equally disturbing with overall cancer mortality showing no signs of decline (Smith 1982), while Britain's death rate from heart disease is one of the highest in the world (Ball 1983).
    Whatever animal experiments are doing, they appear to have little overall effect on our state of health.
    Those who defend experiments on animals often present us with a simple choice: which life is more important, they ask, that of a child or that of a dog (Noble 1985)? Indeed the basic rationale behind animal experimentation, as spelled out by Claude Bernard (1865), is that lives can only be saved by sacrificing others.
    But since animal-based research is unable to combat our major health problems and, more dangerously, often diverts attention from the study of humans, the real choice is not between animals and people; rather it is between good science and bad science. Animal experiments are bad science because they tell us about animals, usually under artificial conditions, when we really need to know about people. Only a human-based approach can accurately identify the principal causes of human disease, so that a sound basis for treatment is available and preventive action can be taken.


Part 2. 'A Critical Evaluation of Animal Experimentation' by Dr Robert Sharpe


    Laboratory animals are used to develop drugs and investigate disease, to test agricultural and industrial chemicals, cosmetics and household products, in psychological and behavioural research, and in a multitude of other ways. Yet the method has never been fully validated and it is merely assumed that people and animals will respond in the same way. If these experiments are in fact misleading, then animals will not be the only victims of science and it will be in all our interests to promote more humane and reliable means of investigation.

The cost in animal suffering
    By its very nature, animal research is virtually inseparable from suffering or death. This is partly to do with the experimenter's desire for a disposable species that can be manipulated as required and killed when convenient. It also arises from the way in which many tests are performed. In the field of toxicology, which accounts for approximately 20% of all animal experiments,[1] dose levels are often chosen to induce adverse effects. For instance, acute toxicity tests like the LD50 require some of the animals to be poisoned to death to provide a numerical index of toxicity. As a British government committee officially concluded, 'LD50s must cause appreciable pain to a proportion of animals subjected to them'.[2]
    Opposition to the LD50 led toxicologists to suggest the 'fixed dose procedure' as a more humane alternative. Although there is no requirement for a lethal endpoint, the procedure nevertheless requires clear signs of toxicity before it is stopped. In more prolonged toxicity tests with new drugs, the highest dose levels are again chosen to induce harmful effects so that doctors have some idea of which body systems require special monitoring during trials with human volunteers. And to avoid the huge number of animals that would be required to mimic the human population's exposure to small amounts of suspect chemicals, high doses are administered to smaller groups of animals in carcinogenicity tests.

    Another major area where animals are deliberately harmed is the study of illness and injury. Here the condition is artificially induced to produce an 'animal model'. In cancer research for instance, the UK's leading charities acknowledge that 'animals with local or disseminated tumours are likely to experience pain and/or distress'.[3]
    A recent development is the use of biotechnology to breed animals that automatically become ill. A genetically-engineered mouse model of Gaucher's Disease (a rare condition involving anaemia and enlargement of the liver and spleen) dies within 24 hours of birth whilst the 'oncomouse', produced by inserting human cancer genes into the embryos of mice, quickly develops fatal breast cancer. Genetically-engineered 'cystic fibrosis mice' die within 40 days.
    Many of the techniques and devices used by animal researchers seem more suited to an old-fashioned torture chamber than a 20th-century laboratory. Pain responses are often measured by putting animals onto hot plates, by dipping their tails into hot water and by the 'mouse writhing test' in which acetic acid is injected into the animal's stomach; stress is induced by forcing animals to swim in a tank of water; and to produce the desired behaviour in psychological experiments, researchers use electric shocks, drugs or food deprivation.
    Animals are made to breathe suspect fumes in inhalation chambers; they are addicted to drugs, and potential irritants are applied to their skin or instilled into their eyes. Animals may also suffer from the way in which they are kept or through poor experimental technique. And for many primates, additional hazards arise during capture and transportation from the country of origin.

Misleading research

    Although millions of animals worldwide die as human surrogates, the method has never been validated.[4] Few major comparative studies have been performed and it is assumed that animal experiments will correctly predict human responses. In fact, a recent analysis of adverse drug reactions found only a 5-25% correlation between harmful effects in people and the results of animal experiments.[5] Similar problems can arise when animals are used as 'models' of disease and injury. Between 1978 and 1988 for instance, 25 drugs were found useful in treating animals with artificially-induced stroke, but none has come into general clinical use.[6]
    Even the genetically-engineered animals that scientists hope will more closely mimic human disease are proving unreliable. In the case of cystic fibrosis mice, the animals' lungs do not become infected or blocked with mucus as they do in human patients,[7] although it is lung infections that kill 95% of people with the disease; mice that are genetically engineered to develop Waardenburg's Syndrome type 1 do not become deaf even though loss of hearing is a common feature in humans;[8] and in mice genetically programmed to be more susceptible to HIV, the immune system is not suppressed as it is in AIDS patients.[9] Such problems are rarely mentioned by animal research lobbyists yet, as the following case studies reveal, they represent a serious threat to our health.

Case studies - alcohol research

    For centuries, alcohol has been regarded as poisonous for the liver. That is, until the first half of the 20th century when it was cleared of liver toxicity following experiments on animals. Based largely on work with rats, it was claimed that 'there is no more evidence of a specific toxic effect of pure ethyl alcohol on liver cells than there is for one due to sugar'.[10]
    Today, alcohol is once again considered a liver toxin but there are still some who doubt the evidence since it has proved so difficult to induce cirrhosis in animals.[11] Only baboons appear susceptible.
    The carcinogenic effect of alcohol has also been questioned following experiments on animals.[12] Although it has been known for decades that excessive consumption can cause human cancer, it has not been possible to produce the disease in animals. In addition, it has proved difficult to replicate the heart disease experienced by alcoholics, and whereas prolonged consumption raises the blood pressure in people, this is not usually the case in rats.[13]
    Nor are animals a reliable guide to treatment. For instance, it is fortunate that the beneficial effect of Librium in controlling withdrawal symptoms was first identified in clinical trials, for later research with physically-dependent mice suggested the treatment had a lethal side-effect.[14]

Case studies - asbestos and occupational carcinogens

    The first reports of an association between asbestos and lung cancer came during the 1930s following examination of people who had died with asbestosis. But attempts to induce cancer in animals repeatedly failed and despite further evidence from exposed workers (in 1949 and again in 1955) the carcinogenic action of asbestos was doubted until the 1960s.[15] Prior to this 'a large literature on experimental studies has failed to furnish any definite evidence for induction of malignant tumours in animals exposed to various varieties and preparations of asbestos by inhalation or intratracheal injection'.[16]
    Asbestos is not the only occupational carcinogen where animal experiments cast doubt on human findings. In the case of benzene, an important industrial chemical that nevertheless causes leukaemia in people, 14 separate animal trials, starting in 1932, failed to show that it caused cancer.[17] Only during the late 1980s were scientists finally able to induce the disease in animals.
    Arsenic, naphthylamine and soot are further examples where decades elapsed before animal researchers were able to replicate clinical results.

Case studies - drug toxicity

    In June 1993 the US National Institutes of Health halted clinical trials of the hepatitis drug fialuridine, following deaths and serious complications among participants. Patients unexpectedly suffered liver toxicity, yet the drug had seemed both safe and effective in laboratory animals.[18]
    However, the metabolism of antiviral drugs of this type is thought to be very different in animals and people, and the tragedy has prompted a closer look at related drugs to see if other patients are experiencing similar problems.
    Fialuridine is not an isolated case and further examples are listed in the Table (below). It is estimated that around 85% of major drug hazards occurring since thalidomide were not predicted, and could not have been predicted, by animal experiments.[5]
    Animal tests not only give a false sense of security, there is also the risk that worthwhile therapies may be lost or delayed through toxic effects that do not occur in human patients. Development of propranolol, the first widely used beta-blocking drug, was put in jeopardy when it caused rats to collapse and dogs to vomit severely;[19] on the basis of animal tests, the antirejection drug FK506 was feared too toxic for human use, and if it hadn't been given as a last chance option to transplant patients in 'desperate plight', its life-saving qualities may never have been appreciated;[20] the discovery that tamoxifen caused cancer in rats would have halted development of this important anticancer drug had ICI not already been reassured by its safety profile in human patients;[21] and Howard Florey, who developed penicillin for therapeutic use, later admitted it was a 'lucky chance' that mice rather than guinea pigs had been used - 'If we had used guinea pigs exclusively we should have said that penicillin was toxic, and we probably should not have proceeded to try and overcome the difficulties of producing the substance for trial in man'.[22]

Case studies - methanol poisoning

    Methanol is employed in a wide variety of consumer products and is also imbibed as a cheap alternative to alcohol. Although methanol is a highly poisonous, potentially lethal substance, this was not realised for many years. Common laboratory species such as rats and mice are resistant to its effects, giving the impression that methanol is only slightly toxic, and far less poisonous than alcohol.[23] In fact, methanol is ten times more toxic and a single bout of drinking methanol can lead to blindness in people. This does not happen in rats, mice, dogs, cats, rabbits or chickens although it was later reproduced in monkeys.
    Animal experiments also proved misleading in devising treatment. Although good results were achieved using bicarbonate in cases of human poisoning, the treatment not only failed in animals but generally proved fatal, prompting some to advise against its use. In a review of the subject, Roe states that 'it is indeed deplorable that about 30 years elapsed before the good effects of this treatment became commonly known, and unfortunately some still doubt its value. It seems that the authors of medical textbooks have paid more attention to the results of animal experiments than to clinical observations'.[23]
    Another valuable treatment is to administer alcohol to reduce the toxicity of methanol. Although effective in human patients, animal tests suggested it would increase the danger of methanol and again, some discouraged its use.[23]

Case studies - pneumoconiosis and the coal industry

    It has long been known that coal miners suffer the lung disease pneumoconiosis, but for many years researchers believed that breathing coal dust was 'completely innocuous' and that respiratory illness arose from the silica that sometimes contaminates the coal. The idea originated from the laboratory where pure coal dust had no harmful effects on animals' lungs.[24]
    Reliance on these experiments proved devastating. Since there is little exposure to silica in bituminous coal pits, mining was not considered dangerous, and few observational studies were carried out in the US. As a result there was almost no information on American coal workers' pneumoconiosis until the Public Health Service performed studies as recently as 1962/63.
    The animal data were in fact contradicted by the discovery that men who worked with pure coal dust or carbon alone also developed pneumoconiosis, showing that coal dust can cause lung disease in the absence of silica.[25] The animal tests were further undermined when coal dust, collected at a coal face where pneumoconiosis among miners was high, proved innocuous to laboratory rats.[24]

Case studies - polio and AIDS research

    Following discovery of the polio virus in 1908, scientists focused their main attention on the artificially induced disease in monkeys, believing it to be an exact replica of the human infection.
    Based on these experiments, it was generally believed that poliovirus entered the body through the nose and that it only attacked the central nervous system. Yet by 1907, epidemiological studies of actual human cases had correctly suggested that polio was not entirely or even chiefly a disease of the central nervous system, and that people are infected through the digestive tract. Tragically, animal experiments so dominated research that prior to 1937, most scientists rejected the notion that polio is an intestinal disease.[26]
    Whether the virus entered the body by the mouth or nose was of great practical importance for it determined the kind of remedies developed. By 1937, for instance, researchers had produced a nasal spray that prevented infection in monkeys. It was widely promoted for human use but inevitably failed. The only result was to abolish the children's sense of smell, in some cases permanently.[27]
    Eventually, support for the nasal route of infection started to wane, and it was only when scientists understood that poliovirus enters the mouth and first resides in the intestines that it was possible to develop an orally administered vaccine.
    Although monkey experiments delayed a proper understanding of polio by over 25 years, the lessons do not seem to have been learned, and some still put reliance on animal models of infectious disease. Indeed, failure to induce AIDS in laboratory animals has been used to support arguments against HIV as the cause.[28]

Case studies - smoking and lung cancer

    By the mid 1950s more than a dozen epidemiological studies had identified the link between smoking and lung cancer. Nevertheless, some still argued that the association was unwarranted because no one had produced the disease in laboratory animals.[29] In a review of the evidence, Northrup states that the 'inability to induce experimental cancers, except in a handful of cases, during 50 years of trying, casts serious doubt on the validity of the cigarette-lung cancer theory'.[30]
    Health warnings were delayed, costing many lives. Yet despite years of further experimentation, it proved 'difficult or impossible' to induce lung cancer in animals using the method (inhalation) by which people are exposed to the smoke.[31]

Case studies - X-rays and the foetus

    In 1956 British doctors drew attention to a link between X-rays during pregnancy and subsequent childhood cancers. Within a few years similar findings were reported in American children. But for a quarter of a century, scientists questioned whether X-rays were actually the cause and cited animal experiments to show that the foetus is not especially sensitive to radiation.[32] However, it seems that compared with other species, the human foetus is more susceptible to the carcinogenic effects of X-rays, and during the 1980s further observational studies confirmed the hazards, particularly in early pregnancy.

The humane option

    It is part of a physician's code of ethics to do no harm to his patients, an idea that is incorporated in the Hippocratic Oath. We believe these same standards of behaviour should also apply to medical scientists who carry out research. They too should undertake to 'do no harm' by using only humane methods of investigation. At present, much biomedical research is based on the assumption that lives can only be saved by sacrificing others.
    There are many methods of investigation that do no harm, and these include human epidemiological studies, clinical observations of patients who are ill or who have died, work with healthy volunteers, in vitro experiments with human tissues and computer simulation of biological systems. These techniques not only avoid animal suffering, they also produce results of direct relevance to human medicine. The following case studies stress both their importance and future potential.

Case studies - the Framingham Project

    Disease prevention is the primary role of epidemiology, a technique based on comparisons: epidemiologists obtain clues by comparing disease rates in groups or populations with differing levels of exposure to the factor under investigation. The method has taught us how to prevent nearly all cases of heart disease, the West's number one killer.
    Perhaps the most important study in the history of heart research began during 1948 in the small Massachusetts town of Framingham where researchers set out to determine 'factors influencing the development of heart disease'. Inhabitants received medical examinations and supplied information about their diet and lifestyle with doctors monitoring their health over the ensuing years. At that time practically nothing was known about the cause of heart disease but the Framingham study demonstrated clearly and for the first time that smoking, elevated blood pressure and high cholesterol levels are major risk factors.
    The Framingham project, together with further population studies showing that coronary illness is more common in people who seldom take exercise (London bus drivers had been compared with their more active colleagues, bus conductors), demonstrated how heart disease could be prevented. The results were later confirmed in other countries,[33] and since the 1960s heart disease mortality in the United States has fallen sharply in line with improvement in diet and lifestyle.[34]
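
    The comparison at the heart of this method can be sketched in a few lines. The example below is a minimal illustration, not the Framingham investigators' own analysis: the function name and the counts are hypothetical, invented purely to show how disease rates in an exposed and an unexposed group are compared as a relative risk.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the disease rate in the exposed group to that in the unexposed group."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group sizes must be positive")
    if unexposed_cases == 0:
        raise ValueError("relative risk is undefined when the unexposed group has no cases")
    exposed_rate = exposed_cases / exposed_total        # e.g. cases among smokers
    unexposed_rate = unexposed_cases / unexposed_total  # e.g. cases among non-smokers
    return exposed_rate / unexposed_rate


# Hypothetical counts, invented for illustration only (not data from any study).
rr = relative_risk(exposed_cases=30, exposed_total=1000,
                   unexposed_cases=10, unexposed_total=1000)
print(f"relative risk = {rr:.1f}")
```

    A value well above 1 marks the exposure as a candidate risk factor; in practice epidemiologists would also adjust for age and other confounding factors before drawing conclusions.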

Case studies - drug research and human tissue

    Traditionally, laboratory scientists have relied on animals to search for new drugs but in vitro tests with human tissue offer important advantages. During the 1980s, America's National Cancer Institute replaced many of its animal experiments with human cancer cells since the usual tests with leukaemic mice were failing to identify new treatments against any of the main cancers.[35]
    Human cells are also enabling scientists to identify drugs like AZT which combat AIDS. Knowledge of how HIV disrupts human white blood cells led to an in vitro system where drugs are added to the cells to see if they prevent damage caused by the virus. In another case, arthritis researchers point out that 'the use of animal cartilage to test new drugs for human cartilage protective, or damaging, activity could be very misleading and such screening would best be carried out using human cartilage'.[36]
    Much drug research focuses on how medicines or naturally occurring body chemicals exert their effects on the tissues. By understanding these processes, pharmacologists hope to provide a more rational basis for devising treatments. Nevertheless, Muller-Schweinitzer found that 'despite the limited relevance for human pharmacology of most of the animal tissues, the use of human material in pharmacological studies is still the exception rather than the rule'.[37]

Clinical research - drug discovery

    Analysis of some of the main heart drugs illustrates the vital role of human clinical research in developing treatments. Digoxin and digitoxin are extracted from digitalis whose value in treating heart failure and cardiac arrhythmias resulted from studies of heart patients;[38] the development of nitroglycerin as a major remedy for angina derived from experiments carried out by London physician William Murrell on himself; the use of quinidine to control arrhythmias originated with the observation that an almost identical drug, quinine, could ease fibrillation in a patient who took it to prevent malaria;[39] and the introduction of lignocaine and phenytoin as further anti-arrhythmic drugs resulted from chance observations after their introduction for other purposes.[40]
    The history of quinidine illustrates an important therapeutic principle: that a skilled clinician can discover new uses for drugs or chemicals originally intended for a different purpose. The first effective anticancer agents originated with the observation that one of the long term effects of the mustard gases used in WW1 was damage to the bone marrow.[41] Doctors noticed that exposed soldiers and workers experienced a dramatic lowering of their white blood cell count and suggested the chemicals as a possible treatment for leukaemia and lymphoma - cancers characterised by an overproduction of white blood cells. The nitrogen mustards are now used in combination with other anticancer drugs to treat conditions like Hodgkin's disease.

Drug safety

    Biochemist Dennis Parke, former member of the British Government's Committee on Safety of Medicines, states '...there are indeed more appropriate alternatives to experimental animal studies and, for the safety evaluation of new drugs, these comprise short-term in vitro tests with micro-organisms, cells and tissues, followed by sophisticated pharmacokinetic studies in human volunteers and patients'.[4]
    Furthermore, by using human tissues in their in vitro tests, scientists can avoid the problems of species variation and improve the safety of drugs.[42] It is now known, for instance, that the potentially fatal cases of aplastic anaemia associated with both chloramphenicol and phenylbutazone, which led to severe restrictions in their use, could have been identified with human bone marrow cells in vitro.[43] In addition, thalidomide's harmful effects can be studied in human cell cultures,[44] although many animal species proved resistant to the drug.[45]
    There is also a way to make in vitro systems more closely resemble a living person. Sometimes chemicals only become hazardous when metabolised by the liver, so researchers include liver cells in their in vitro tests to mimic the body's main metabolic processes. Although the approach is ingenious, the obsession with animal experiments means that the liver cells usually derive from rats, mice or hamsters rather than people.
    The problem is illustrated by mianserin, an antidepressant that can cause serious blood disorders. The effects had not been predicted by animal experiments,[46] and later research found that only in vitro tests with human tissues could correctly identify the hazards.
    Mianserin is thought to produce its harmful effects following metabolism by the liver, where it is transformed into another substance which is toxic to the blood. This can be observed in the test tube by incubating white blood cells with liver tissue from human donors but not from mice, rats, rabbits or guinea pigs. The findings demonstrate that 'extrapolation of in vitro toxicity data from animals to humans is a poor substitute for the use of human tissue'.[47]

How the body works

    A central objective of biomedical research is to understand how the body works and how it is disrupted by disease. Traditionally physiologists have used animals for the purpose, a common procedure being to interfere with one part of the body to see how it affects another. But since there are almost unlimited cases of human illness and injury, an effective alternative would be to take advantage of these 'experiments of nature' and make careful observations and deductions.
    Studies of patients with deficiencies in their immune systems have provided important clues to our understanding of the body's natural defences.[48] Immunologist Robert Good referred to such experiments of nature as 'guiding beacons' in the early study of immunology. Another example is research into brain function. Neurologists Antonio and Hanna Damasio observe patients with brain injuries and relate changes in their behaviour to the damaged part of the brain.[49]
    Human physiological studies are not restricted to cases of ill health, and the same scanning techniques which permit the investigation of disease in patients are also being applied to physiological studies of healthy volunteers. There are also new ways of investigating brain function: non-invasive techniques are used to stimulate nerve cells through the intact skull of human subjects.

In vitro carcinogens

    The use of animals to identify cancer-causing chemicals is fraught with danger. Not only are there species differences (one survey found that human carcinogens would be better predicted by tossing a coin),[50] but animal tests are unable to cope with the thousands of untested chemicals currently in use. It has been estimated, for instance, that no toxicity information exists for most chemical intermediates and by-products involved in manufacturing processes, and while these do not reach consumer goods, they still represent a potential occupational hazard.[51]
    Animal experiments are not the answer because carcinogenicity tests continue for the animals' lifespan (2-3 years in the case of rats, mice and hamsters) and are therefore very expensive, costing around 1 million dollars per chemical. As a result, dozens of in vitro tests have been developed to assess potential carcinogens and mutagens, and while they can no doubt be improved, they take only days to perform and can be repeated to check results, something which is difficult to do with animal tests. Some of the in vitro tests use human white blood cells where the effect of chemicals on chromosomes can be monitored in the test-tube. Well known occupational carcinogens such as arsenic, vinyl chloride, benzene and chromium are correctly detected by this system.[52]
    Other in vitro systems rely on bacteria, yeast and cells from other species. Just one of these tests - the Salmonella bacterial assay - can correctly identify 77% of known human carcinogens.[53]

Computer simulations in research, testing and education

    The power of modern computers allows the simulation of complex biological systems, providing more rapid insights into the nature of disease. This technology has seen the emergence of a growing number of simulations of the immune system. For instance, Delisi and co-workers have used mathematical modelling techniques to show that the immune system can not only fight cancer growth but stimulate it as well. Delisi explains, 'if our model had been around ten years ago, it could have predicted what it's taken scientists countless man-hours and animals to figure out. This is the value of mathematical modelling - it comes up with things you might otherwise miss'.[54]
    Based on the idea that drugs must be the right shape to trigger their effects, scientists are using 3D computer graphics to design new treatments, thereby avoiding the old hit and miss approach in which thousands of chemicals were subjected to animal tests. Computer simulations are also being developed that predict the harmful effects of medicines.
    Together with physical models, videotape recordings, in vitro experiments and human studies, computer simulations offer great scope for replacing animals in medical education. An example is the 'Mac' family of interactive digital computer programs for use in clinical, physiological and pharmacological teaching and research. The student can monitor important physiological variables such as heart rate, blood pressure, arterial oxygen and carbon dioxide pressure, acid-base status, urine output, plasma potassium, hemoglobin and plasma concentration of drugs, and by altering one or more of these variables, can study the effects over a period of time.[55]

The Draize test

    Rabbits are usually chosen for eye irritancy tests because they are cheap, readily available, docile and have a large eye for assessing results.[56]
    In fact the rabbit eye is a poor model for the human eye and the test has been repeatedly criticised: after 40 years, researchers announced that the traditional Draize test, using comparatively large doses, 'has essentially no power to predict the results of accidental human eye exposure'.[57]
    It was only during the 1980s, when animal protection groups focused attention on the Draize test, calling for an urgent programme of research to find an alternative, that attitudes started to change. Within a decade, 60 in vitro tests were either in use or under development,[58] and the use of animals declined rapidly. One of the most successful alternatives is Eytex, which monitors the interaction of test chemicals with large biological substances like proteins.
    Although a combination of in vitro systems can now be used to identify eye irritants, some still only regard them as preliminary tests prior to confirmation in animal experiments.[59]


    Developing humane technologies depends very much on attitudes prevalent within the scientific community. Some tests continue long after they are no longer considered essential because scientists do not feel strongly about the unnecessary loss of life. Perkins has estimated, for instance, that 120,000 mice, 60,000 suckling mice, 30,000 guinea pigs and 60,000 rabbits were used between 1960 and 1980 for checking the purity of polio vaccine batches, 'without adding anything to the safety of vaccines'.[60] In 1982 the World Health Organisation finally advised that such tests were unnecessary when human cells were used instead of animal tissues to produce the vaccine.
    In 1972 Britain's Tuberculosis Reference Laboratory reported that an in vitro technique could be used instead of guinea pigs for diagnosing TB. Nevertheless, 14 years later the Medical Microbiology Department at the London Hospital was still routinely inoculating guinea pigs for the diagnosis of TB.[61]
    In contrast, the Draize campaign shows what can be achieved when scientists are sufficiently motivated to replace animals. Other examples arise when the lack of an animal model forces researchers to use other approaches. Although animals are traditionally employed to test the potency of vaccines, this is of no use in the case of pneumonia vaccines. Here, the causal organisms are not generally virulent for laboratory animals, prompting an alternative method based on chemical analysis and studies with human volunteers.[62]
    Cases like this contradict claims that without animals medical research would be delayed. As former animal researcher Harold Hewitt explains, 'It underrates the ingenuity of researchers to suggest that medical progress would have been seriously impeded had animal experiments been illegal, although a different strategy would have been required. It is the skill of the scientist to find a way round the intellectual, technical and ethical limitations to investigation. No one complains, surely, that we have been denied the benefit of potential advances by prohibiting experiments on unsuspecting patients, criminals or idiots...'.[63]
    The issue, then, is whether the scientific community cares enough to avoid killing animals. There is much that governments can do to stimulate positive attitudes. Even if unwilling to immediately prohibit animal experiments, they can set target dates after which specific tests would no longer be permitted; they can mandate a continuing and substantial annual decline in the use of animals; and they can insist that drug companies improve safety profiles by always testing new products on human tissues in vitro.
    At the same time government funding agencies can provide incentives by giving priority to grant applications featuring methods of direct relevance to people, such as clinical, epidemiological and human tissue studies. And by establishing national, coordinated networks of tissue banks, they can overcome the shortage of human material for research and testing.

    It is our hope that scientists, governments and industry will recognise the merit in these proposals and join with us in promoting a more humane and sophisticated approach to biomedical research and healthcare.


1 Statistics of Scientific Procedures on Living Animals, Great Britain 1992 (HMSO, 1993).
2 Report on the LD50 Test. Home Office Advisory Committee, 1979.
3 UK Co-ordinating Committee on Cancer Research Guidelines for the Welfare of Animals in Experimental Neoplasia (UKCCCR, July 1988).
4 D.V. Parke, ATLA, 1994, vol 22, 207-209.
5 R. Heywood in Animal Toxicity Studies: Their Relevance for Man, Eds. C. E. Lumley and S.R. Walker (Quay Publishing, 1990).
6 D.O. Wiebers et al, Stroke, 1990, vol 21, 1-3.
7 Editorial, Lancet, 1992, September 19, 702-703.
8 K. Davies, Nature, 1992, September 3, 86; P. Parham, Nature, 1991, October 10, 503-505.
9 M.B. Gardner and P.A. Luciw, The FASEB Journal, 1989, vol 3, 2593-2606.
10 C.S. Lieber and L.M. DeCarli, Journal of Hepatology, 1991, vol 12, 394-401.
11 R.F. Derr et al, Journal of Hepatology, 1990, vol 10, 381-386.
12 L. Tomatis et al, Japanese Journal of Cancer Research, 1989, vol 80, 795-807.
13 J.V. Jones et al, Journal of Hypertension, 1988, vol 6, 419-422.
14 D.B. Goldstein, Journal of Pharmacology and Experimental Therapeutics, 1972, vol 183, 14-22; Librium was shown to be clinically effective in 1965 (G. Sereny and H. Kalant, British Medical Journal, 1965, January 9, 92-97).
15 P.E. Enterline, American Review of Respiratory Diseases, 1978, vol 118, 975-978; P.E. Enterline in Epidemiology and Health Risk Assessment, Ed. L. Gordis (Oxford University Press, 1988).
16 W.E. Smith et al, Annals of the New York Academy of Sciences, 1965, vol 132, 456-488.
17 D.M. De Marini et al, in Benchmarks: Alternative Methods in Toxicology, Ed. M.A. Mehlman (Princeton Scientific Publishing Co. Inc., 1989).
18 N. Touchette, Journal of NIH Research, 1993, vol 5, 33-35.
19 D.R. Laurence et al (Eds.) Safety Testing of New Drugs (Academic Press, 1984).
20 R. Allison, Journal of the American Medical Association, 1990, April 4, 1766.
21 J. Patterson, reported in P. Brown, New Scientist, 1992, February 29, 11.
22 H. Florey, Conquest, January, 1953.
23 O. Roe, Pharmacological Reviews, 1955, vol 7, 399-412.
24 Editorial, British Medical Journal, 1953, January 17, 144-146.
25 W.K.C. Morgan in Occupational Lung Diseases, Eds. W.K.C. Morgan and A. Seaton (Saunders, 1982); see also ref.24.
26 J.R. Paul, A History of Poliomyelitis (Yale University Press, 1971).
27 M.F. Dowling, Fighting Infection (Harvard University Press, 1977).
28 New Scientist, 1988, March 3, 34.
29 Reported in S. Peller, Quantitative Research in Human Biology (Wright and Sons, 1967).
30 E. Northrup, Science Looks at Smoking (Conrad-McCann, 1957).
31 Editorial, Lancet, 1977, June 25, 1348-1349; F.T. Gross et al, Health Physics, 1989, vol 56, 256.
32 E.B. Harvey et al, New England Journal of Medicine, 1985, February 28, 541-545.
33 A. Stewart Truswell, British Medical Journal, 1985, July 6, 34-37.
34 J. Pemberton, Upgrading in Epidemiology, Report to Director General for Science, Research and Development Committee of the European Communities, 1985.
35 R. Kolberg, Journal of NIH Research, 1990, vol 2, 82-84; A. Pihl, International Journal of Cancer, 1986, vol 37, 1-5.
36 S. Ismaiel et al, Journal of Pharmacy and Pharmacology, 1990, vol 43, 207-209.
37 E. Muller-Schweinitzer, Trends in Pharmacological Sciences, 1988, vol 9, 221-223.
38 For treating heart failure: W. Sneader, Drug Discovery: the Evolution of Modern Medicine (Wiley, 1985); for treating atrial fibrillation: T. Lewis, Clinical Science (Shaw and Sons Ltd., 1934).
39 E.M. Vaughan Williams, Antiarrhythmic Action and the Puzzle of Perhexiline (Academic Press, 1980); S. Bellet, Clinical Disorders of the Heart Beat (Lea and Febiger, 1971).
40 E.S. Snell, Pharmacy International, 1986, February, 33-37.
41 J. Cairns, Scientific American, 1985, November, 31-39.
42 J.M. Frazier et al, Toxicology and Applied Pharmacology, 1989, vol 97, 387-397; see also ref. 43.
43 G.M.L. Gyte and J.R.B. Williams, ATLA, 1985, vol 13, 38-47.
44 J.W. Lash and L. Saxen, Nature, 1971, August 27, 634-635.
45 R.D. Mann, Modern Drug Use, an Enquiry on Historical Principles (MTP Press Ltd., 1984).
46 H.M. Clink, British Journal of Clinical Pharmacology, 1983, vol 15, 291S-293S.
47 P. Roberts et al, Drug Metabolism and Disposition, 1991, vol 19, 841-843.
48 R.A. Good et al, Annals of the New York Academy of Sciences, 1957, vol 64, 882-928.
49 Science, 1990, May 18, 812-814.
50 D. Salsburg, Fundamental and Applied Toxicology, 1983, vol 3, 63-67.
51 L. Magos, British Journal of Industrial Medicine, 1988, vol 45, 721-726.
52 N.E. Garrett et al, Mutation Research, 1984, vol 134, 89-111.
53 M.D. Shelby, Mutation Research, 1988, vol 204, 3-15.
54 M. Stephens in Animal Experimentation: the Consensus Changes, Ed. G. Langley (Macmillan Press, 1989).
55 C.J. Dickinson et al, ATLA, 1985, vol 13, 107-116.
56 B. Ballantyne and D.W. Swanston in Current Approaches in Toxicology, Ed. B. Ballantyne (Wright and Sons, 1977).
57 F.E. Freeberg et al, Fundamental and Applied Toxicology, 1986, vol 7, 626-634.
58 C.G. Shayne in Benchmarks: Alternative Methods in Toxicology, Ed. M. A. Mehlman (Princeton Scientific Publishing Co. Ltd., 1989).
59 L.H. Bruner et al, Fundamental and Applied Toxicology, 1991, vol 17, 136-149.
60 F.T. Perkins, Developments in Biological Standardisation, 1980, vol 46, 3-13.
61 G. Langley (Ed.), Animal Experimentation: the Consensus Changes (Macmillan Press, 1989).
62 J.B. Robbins, Journal of Infection, 1979, vol 1, Supplement 2, 61-72.
63 H. Hewitt, British Medical Journal, 1990, March 24, 811-812.