An Animal Breeder’s Approach to Producing High-Quality Working Dogs


Genetic improvement in a set of traits occurs when dogs selected to become parents of the next
generation are superior in performance compared to the population from which they were
selected.

The amount of genetic change to expect from one generation of selection is governed
by both the heritability of the trait(s) being improved and by the degree to which average
performance of newly selected parents exceeds the average performance of the population
from which they were selected (Falconer and Mackay, 1996).
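
The relationship described above is commonly written as the breeder's equation, R = h²S, where h² is the heritability and S is the selection differential (mean of selected parents minus the population mean). The following is a minimal sketch of that arithmetic. The function name and the numbers are hypothetical and are not taken from The Seeing Eye data.

```python
def expected_response(heritability, parent_mean, population_mean):
    """Expected genetic change from one generation of selection: R = h^2 * S."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    # S: how far the selected parents exceed the population they came from.
    selection_differential = parent_mean - population_mean
    return heritability * selection_differential


# Hypothetical illustration: heritability of 0.3, selected parents averaging
# 7.0 on a trait where the whole population averages 6.0.
response = expected_response(0.3, 7.0, 6.0)
print(f"expected gain per generation: {response:.2f}")
```

With these made-up inputs, only 30% of the one-point parental superiority is expected to be passed on to the next generation.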

In 1980, these principles of genetic selection were implemented at The Seeing Eye in the
successful production of several thousand German Shepherd Dogs, Labrador Retrievers,
Golden Retrievers and some Labrador by Golden Retriever crossbred dogs. The methodical
approach to choosing replacement breeders and deciding how those breeders were mated
resulted in improved hip quality and an improved ability to be trained to guide blind people.
The purpose of this paper is to describe the breeder selection process used by The Seeing Eye
to obtain genetic improvement in both hip quality and trainability. It will also document the
amount of genetic improvement actually realized for these two traits in each of the three pure
breeds.

Materials and Methods

An organized breeding plan must address four key components:

  1. Define goals to be obtained
  2. Decide whether to use pure breeding or crossbreeding as the primary method for producing new
    offspring
  3. Define and implement a record keeping system with accurate phenotypic measurements capable of
    supporting the breeding plan
  4. Define and implement objective criteria for selecting replacement breeders

The Seeing Eye addressed each of these four components in 1980 and began implementing the
current breeding plan in that year. The objectives were clear: dramatically improve hip quality
while simultaneously improving the ability of dogs to be trained for work as guides. Purebred
production was chosen as the primary method for producing new offspring. A record keeping
system was developed using computer equipment that pre-dated introduction of the IBM
personal computer by two years. Over the intervening years, three additional record keeping
systems were developed, with data collected by each preceding system being imported into its
successor. The current record keeping system contains information on over 18,000 dogs, some
of which were born in the 1970’s. Many of those early dogs define the foundation generation in
pedigrees of puppies currently being born, now up to nine generations later.

Phenotypic measurements for hip quality included both an extended view and a distraction view
radiograph. Each extended view radiograph was scored by a single veterinary radiologist using
a 9-point scale (Leighton, 1997), where 9 is the most desirable.

All films were scored by the same radiologist. To aid in summarizing the change in hip quality over time, 
extended view hip scores 1-5 were classified as dysplastic while scores greater than 5 were classified as 
normal. All distraction view radiographs were submitted to PennHip (Smith, ???) for scoring. Because
the PennHip procedure was developed in the mid- to late-1980’s and perfected into the 1990’s,
hip quality of few dogs from the earlier years was assessed by this technique. Beginning in the
early-1990’s, hip quality of all dogs was assessed by both an extended view hip score and a
PennHip score. Radiographic films for both techniques were taken when dogs returned to The
Seeing Eye to begin training. Most dogs were at least 14 months old, but some were as old as
18-20 months of age.
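
The dysplastic/normal summary rule for extended view scores can be sketched as follows. This is only an illustration of the cutoff stated above (scores 1-5 dysplastic, scores above 5 normal). The function names and the example scores are hypothetical.

```python
def classify_extended_view(score):
    """Classify a 9-point extended view hip score (9 is most desirable)."""
    if not isinstance(score, int) or not 1 <= score <= 9:
        raise ValueError("extended view hip score must be an integer from 1 to 9")
    return "dysplastic" if score <= 5 else "normal"


def percent_dysplastic(scores):
    """Percentage of dogs in a group whose score falls in the dysplastic range."""
    if not scores:
        raise ValueError("no scores supplied")
    affected = sum(1 for s in scores if classify_extended_view(s) == "dysplastic")
    return 100.0 * affected / len(scores)


# Hypothetical scores for one group of dogs.
example_scores = [8, 8, 5, 9, 3, 7, 8, 6]
print(f"{percent_dysplastic(example_scores):.1f}% dysplastic")
```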

The trainability score is the phenotypic measure assessing a dog’s ability to be trained to guide
blind people. Methodology for the trainability score was developed by The Seeing Eye in the
early 1980’s. Since then, only four different people have assigned the score. At its most
elementary level, the trainability score is a comparison rating that places each dog into one of 9
classes, with 9 being the most trainable dogs and 1 being the least trainable. Each dog is given
a score that reflects its ability to be trained as a guide relative to all other dogs of that breed
scored in the last 1-3 months. As the quality of dogs improves over time, the standard a dog must
meet to earn a given score rises with it. A best-to-worst ranking scale therefore continues to
differentiate among dogs, but the meaning of a score is not constant over time. This change in
score definition is accounted for in the statistical model by including a term for contemporary
groups. A contemporary group is defined as all dogs of a given breed passing through the system
in the same calendar quarter.
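
As a rough sketch of the contemporary group definition above, dogs can be keyed by breed, year, and calendar quarter. Which date marks a dog as "passing through the system" is not stated in the source. The sketch assumes the date the trainability score was assigned, and the record layout and dog identifiers are hypothetical.

```python
from collections import defaultdict
from datetime import date


def contemporary_group(breed, scoring_date):
    """Return a (breed, year, quarter) key for one dog."""
    quarter = (scoring_date.month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
    return (breed, scoring_date.year, quarter)


def group_dogs(records):
    """Collect dog identifiers by contemporary group.

    records: iterable of (dog_id, breed, scoring_date) tuples (assumed layout).
    """
    groups = defaultdict(list)
    for dog_id, breed, scoring_date in records:
        groups[contemporary_group(breed, scoring_date)].append(dog_id)
    return dict(groups)


# Hypothetical records.
records = [
    ("dog_1", "Labrador Retriever", date(2001, 5, 14)),
    ("dog_2", "Labrador Retriever", date(2001, 6, 2)),
    ("dog_3", "Labrador Retriever", date(2001, 7, 9)),
    ("dog_4", "Golden Retriever", date(2001, 5, 20)),
]
for key, dogs in group_dogs(records).items():
    print(key, dogs)
```

In a statistical model, each such key would correspond to one level of the contemporary group term, so that scores are compared only within a group.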

Estimated breeding values (EBV’s) (Mrode, 2005) provided a means to use the phenotypic
measures to make objective selection decisions among available candidates. EBV’s are
statistical calculations utilizing all of the data on the individual and its relatives to provide an
estimate of how well each dog would transmit desired traits to its offspring compared to other
dogs in the population. Breeders were selected as the dogs with the best EBV’s. EBV’s were
calculated nightly by the Seeing Eye record-keeping system, since EBV’s change as new
phenotypic data are entered into the database each day.

EBV’s for the PennHip score and the trainability score were combined into one overall index
value, with trainability being weighted with twice the importance of hip quality. This single
number formed the basis for deciding which litters were most likely to contain superior breeding
candidates. Based only on their pedigree information, litters of 9-month-old dogs with the
highest overall index were marked as possible breeder candidates. When dogs returned to The
Seeing Eye from their puppy development homes at 14 months of age, they were thoroughly
scrutinized, both from a health perspective and for trainability. Any possible breeder
candidates found to have health problems such as megaesophagus, elbow dysplasia, or
inherited ophthalmic conditions were eliminated as breeder candidates. Remaining breeder
candidates completed a one-month long compressed training regimen, wherein they
demonstrated their ability and willingness to be trained for work as guides.
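
The 2:1 weighting of trainability over hip quality and the pedigree-based ranking of litters might be sketched as below. Several points are assumptions, not details from the source: both EBVs are taken to be on comparable (for example, standardized) scales and oriented so that larger values are more desirable, and a litter's pedigree-based index is taken as the average of its parents' indexes. All names and numbers are hypothetical.

```python
TRAINABILITY_WEIGHT = 2.0  # trainability counts twice as much as hip quality
HIP_WEIGHT = 1.0


def overall_index(trainability_ebv, hip_ebv):
    """Weighted sum of the two EBVs (assumed: larger is better for both)."""
    return TRAINABILITY_WEIGHT * trainability_ebv + HIP_WEIGHT * hip_ebv


def litter_index(sire_ebvs, dam_ebvs):
    """Pedigree-based index for a litter, assumed here to be the parent average."""
    return 0.5 * (overall_index(*sire_ebvs) + overall_index(*dam_ebvs))


# Hypothetical litters: ((sire trainability, sire hip), (dam trainability, dam hip)).
litters = {
    "litter_A": ((1.2, 0.4), (0.8, 0.1)),
    "litter_B": ((0.3, 1.5), (0.5, 0.9)),
}
ranked = sorted(litters, key=lambda name: litter_index(*litters[name]), reverse=True)
for name in ranked:
    print(name, round(litter_index(*litters[name]), 2))
```

With these made-up values, litter_A ranks first even though litter_B has the better hip EBVs, which is the practical effect of weighting trainability more heavily.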

To track genetic change over time, a generation coefficient was calculated for each litter by
adding 1 to the average generation coefficients of the sire and dam (Pattie, 1965). Foundation
animals for which parents were unknown were assigned a generation coefficient of zero. A
mating between two zero generation parents produced first generation offspring, while a mating
between a first generation sire and a zero generation dam, for example, produced offspring with
a generation coefficient of 1.5. To summarize the data, generation coefficients were classified
by rounding them to the nearest whole number.
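
The generation coefficient rule can be sketched as a short recursive calculation over a pedigree. The pedigree layout and dog identifiers are hypothetical. Two details are assumptions because the source does not specify them: a single unknown parent is treated as generation zero, and ties such as 1.5 are rounded up when assigning the generation class.

```python
import math


def generation_coefficient(dog, pedigree, cache=None):
    """1 + average of the parents' coefficients; foundation dogs are 0.

    pedigree maps dog id -> (sire id, dam id); a dog that is missing from
    the mapping, or whose parents are both None, is a foundation animal.
    """
    cache = {} if cache is None else cache
    if dog is None or dog not in pedigree:
        return 0.0
    if dog not in cache:
        sire, dam = pedigree[dog]
        if sire is None and dam is None:
            cache[dog] = 0.0
        else:
            sire_gen = generation_coefficient(sire, pedigree, cache)
            dam_gen = generation_coefficient(dam, pedigree, cache)
            cache[dog] = 1.0 + 0.5 * (sire_gen + dam_gen)
    return cache[dog]


def generation_class(coefficient):
    """Nearest whole number, with halves rounded up (an assumed tie rule)."""
    return math.floor(coefficient + 0.5)


# Hypothetical pedigree: F1, F2 and F3 are foundation animals.
pedigree = {
    "F1": (None, None),
    "F2": (None, None),
    "A": ("F1", "F2"),  # two zero-generation parents -> 1.0
    "B": ("A", "F3"),   # first-generation sire x zero-generation dam -> 1.5
}
for dog in ("A", "B"):
    coefficient = generation_coefficient(dog, pedigree)
    print(dog, coefficient, generation_class(coefficient))
```

Python's built-in `round` rounds halves to the nearest even integer, so an explicit tie rule is used here to keep the classification predictable.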

All breeders were housed in a separate breeding center, where they remained for their
approximately 2-year breeding career. Males were kept in breeding until they completed 8
matings or until their replacement was found. Females were kept in breeding until 48 months of
age or until their replacement was identified. In recent years, the goal was to produce about 600
puppies per year, of which approximately 55% were eventually chosen for breeding or trained
for work as guides. Many dogs not chosen for use by The Seeing Eye were chosen for work by
police agencies or by U.S. Government agencies. The remaining unused dogs became pets.


By breed, descriptive statistics are shown in Table 1 for two measures of hip quality and the
trainability score for all dogs born into generation 1 or later after fiscal year 1980.

As assessed by the extended view hip score, genetic change in hip quality is shown in Table 2
for each breed. Based on this criterion, hip dysplasia dropped below 5% of dogs affected by the
6th generation in Labrador Retrievers and by the 7th generation in German Shepherd Dogs and
Golden Retrievers. Generation class means are significantly different (P<0.01) for all three
breeds.

Hip quality assessed by the PennHip score is summarized across generations in Table 3. For all
three breeds, sex of the dog and generation class explained a significant (P<0.001) part of the
variation observed in PennHip score.

The trainability score is a comparison ranking of dogs into 1 of 9 score groups with dogs with
high scores being superior in their ability to be trained for work as guides compared with lower
scored dogs. The criterion for assignment of the score can change from one calendar quarter to
the next, but the scoring protocol always ranks superior dogs higher than
dogs with lesser trainability. Even though the mean value of the trainability score has changed only
slightly over multiple generations of selection, genetic quality of the dogs has steadily improved
because the dogs superior in trainability are kept for breeding.

The genetic model used for calculating trainability score estimated breeding values included
pedigree information and a term for contemporary groups, which were the calendar quarters
across time when dogs were evaluated. To view the trend in genetic change of the trainability
score, trainability EBV’s across generations of selection are summarized by generation class in
Table 4. They are expressed in genetic standard deviation units to make it easier to interpret the

In 6 generations of selection in German Shepherds and 8 in Labrador Retrievers, the trainability
score improved by more than two genetic standard deviations. In Golden Retrievers, essentially
no genetic improvement was realized over 7 generations, which might be a reflection of the
smaller population size for this breed. It is well known (Falconer and Mackay, 1996) that chance
plays a much larger role than selection in producing genetic change as population size
decreases. Over the years, there was also an influx of numerous Golden Retrievers into the
population, some of which came from non-guide dog sources.
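
Expressing EBVs in genetic standard deviation units, as done for the trainability trend above, amounts to dividing each EBV by the square root of the additive genetic variance of the trait. The sketch below averages such standardized EBVs by generation class. The variance and the records are hypothetical and are not the values behind Table 4.

```python
import math
from collections import defaultdict


def mean_standardized_ebv_by_generation(records, additive_variance):
    """Average EBV per generation class, in genetic standard deviation units.

    records: iterable of (generation_class, ebv) pairs (assumed layout).
    """
    if additive_variance <= 0:
        raise ValueError("additive genetic variance must be positive")
    genetic_sd = math.sqrt(additive_variance)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for generation, ebv in records:
        sums[generation] += ebv / genetic_sd
        counts[generation] += 1
    return {g: sums[g] / counts[g] for g in sorted(sums)}


# Hypothetical EBVs and an assumed additive genetic variance of 0.25.
records = [(0, 0.0), (0, 0.1), (1, 0.3), (1, 0.5)]
trend = mean_standardized_ebv_by_generation(records, additive_variance=0.25)
for generation, mean_ebv in trend.items():
    print(generation, round(mean_ebv, 2))
```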


For phenotypic measurements to be useful as a selection tool, they must differentiate among
individual breeding candidates. By the 6th generation of selection using extended view hip
scores, hip quality had improved to the point in both German Shepherd Dogs and Labrador
Retrievers that almost all dogs received a score of 8. From about the 7th generation onward in
these two breeds, few dogs were lost from either training or field service due to hip dysplasia, so
selection had worked to produce higher-quality hips.
When hip quality was assessed in more recent generations using the PennHip score, however,
it was clear that a wide range in laxity still exists among dogs receiving an extended view hip
score of 8. Clearly, there remains a need to continue placing some selection pressure on hip
quality using the PennHip score, if for no other reason than to guard against allowing hip quality
to begin degenerating. This degeneration, if it occurred, would result from choosing dogs for
breeding based on a high extended view score only, which would be equivalent to choosing
dogs at random with respect to laxity as measured by PennHip.

Phenotypic measures need to be validated to show that they actually measure the trait
intended. Validation involves verifying that the measure is predictive and demonstrating that it is
heritable. The Seeing Eye developed a novel scoring system, the trainability score,
because no measure of trainability was available in 1980 that could be used to
genetically improve the ability of dogs to be trained for work as guides. This system has
continued to be effective in improving trainability because the definition of score quality was
relative to the contemporary group of dogs being assessed. This allowed the scale to remain
effective in defining best to worst and supported The Seeing Eye’s goal to make genetic
change over time. In contrast, the usefulness of the fixed 9-point extended view hip quality scale
degenerated over time because as quality improved, almost all dogs had the same high score.
Another key factor in the ability of the trainability score to work over decades of selection was
consistency by the person doing the scoring, because only four highly-skilled, very senior dog
trainers have assigned the scores.

Genetic improvement in any heritable trait can be directed with sound methods of selective
breeding. For this to work, the process must include accurate phenotypic measures and a
consistent application of selection methods over time. Use of these methods by The Seeing Eye
has yielded a steady supply of high-quality purpose-bred dogs.


  • Falconer DS and Mackay TFC. 1996. Introduction to Quantitative Genetics, 4th Ed. Addison Wesley
    Longman Ltd, Essex, England.
  • Leighton EA. 1997. Genetics of canine hip dysplasia. J Am Vet Med Assoc 210: 1474-1479.
  • Mrode RA. 2005. Linear models for the prediction of animal breeding values, 2nd Ed. CABI Publishing,
    Cambridge, MA 02139 USA.
  • Pattie WA. 1965. Selection for weaning weight in Merino sheep, 1: Direct response to selection. Aust
    J Exp Ag 5: 353-360.
  • Smith GK, Biery DN and Gregor TP. New concepts of coxofemoral joint stability and development of
    a clinical stress-radiographic method for quantitating hip joint laxity in the dog. J Am Vet Med Assoc
    196: 59-70.