Wednesday, April 24, 2024

The genome effect

In an industry where selective breeding has contributed to huge productivity gains, unlocking genomic information – effectively hacking into biology – offers major potential advantages.

The human and bovine genomes are each about three billion base pairs long. The human genome was mapped in 2000 and, piggy-backing on the technology that enabled that breakthrough, the first bovine genome followed in 2006. LIC chief scientist Dr Richard Spelman said the pace of development since then had been rapid.

“The first human genome sequence back in 2000 cost somewhere in the vicinity of a couple of billion dollars and probably took 3-4 years to be sequenced. The first bovine was sequenced in 2006, and that cost about US$50 million and probably took 6-12 months.

“Now we can sequence a bull for somewhere around the $2000-$3000 mark, and it probably takes around 3-4 days. It’s absolutely remarkable.”

The biology factory

The aim of genomic selection is to find out which variations in those three billion bases actually affect the trait – or phenotype – of interest. For dairy herds, Spelman said the traits of most interest were those that contributed to the Breeding Worth (BW) index – milk volume, fat and protein, fertility, somatic cell count, longevity, and liveweight.

“We know there’s about 3-5% of the genome that actually codes for protein (genes) – it’s a very small component. The other 95-97% is what used to be known as ‘junk DNA’. Even up to the last 5-10 years, people just thought it really had no use at all, but now as we’ve got more knowledge – and there’s still a lot more knowledge to be gained – we know that variations in that 95-97% can regulate how much of that protein is actually expressed. The variations that occur in the 3-5% can actually change the function of the proteins.”

Progress in genomic research means the focus can expand from variations observed in the 3-5% of the DNA sequence that codes for proteins to encompass the whole three billion base pairs of the genome.

“The challenge is to work out which variations actually affect the traits of interest.”

That requires combining two datasets. One measures the phenotype: for milk production traits, for example, this comes from progeny-tested bulls or cows that have been herd tested. The other records the genotype, taken from individual animals' blood or tissue samples.

Divergence

For the last six or so years, Spelman said they had focused on measuring about 50,000 variations through the genome.

“These variations may not be the variation itself that causes the change in, say, milk fat production, but it might sit quite close to a variation that actually does cause that. You can get a signal from that, even though it doesn’t actually cause it, since it’s really close to it. It gives us a good indication.

“So we go across those 50,000 markers and we identify the ones that give us a positive or negative signal for an effect on protein or fat or whatever trait we have. Then we pretty much build an equation, taking into account all the markers – we know what the predictive ability is for each one. We can go through and we know that Marker A has a positive effect, Marker B has a negative effect.”

From that they can sum up all the marker effects – generated by the 50,000 variations measured – and estimate the animal’s genetic potential for the trait of interest. Because so many variations are measured, a large sample size is needed to attain a reasonable level of accuracy: the more samples, the more powerful the data.
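The summing of marker effects described above can be sketched in a few lines of Python. This is a minimal illustration only, not LIC's actual model: the marker names, effect sizes and the function `genomic_estimate` are all hypothetical, and it assumes each marker simply contributes its effect once for every copy (0, 1 or 2) of the tracked variant the animal carries.

```python
# Hypothetical marker effects for one trait (units are arbitrary here).
# A positive value raises the trait, a negative value lowers it.
marker_effects = {"marker_a": 0.8, "marker_b": -0.5, "marker_c": 0.2}


def genomic_estimate(genotype, effects):
    """Add up each marker's effect multiplied by the number of copies
    (0, 1 or 2) of the tracked variant the animal carries."""
    return sum(effect * genotype.get(marker, 0)
               for marker, effect in effects.items())


# Hypothetical young bull: two copies at marker A, one at marker B, none at C.
bull = {"marker_a": 2, "marker_b": 1, "marker_c": 0}
estimate = genomic_estimate(bull, marker_effects)
print(f"Estimated genetic potential: {estimate:.2f}")
```

In practice the sum would run over all 50,000 markers rather than three, which is why a large reference population is needed to estimate each effect well.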

The next step

But it is the dawning age of big data. Why measure 50,000 variations when you can measure the whole shebang?

Spelman said for the past 18 months, LIC’s focus had shifted to sequencing the whole genome.

“So we’re pretty much measuring all three billion base pairs for an animal. Each individual animal has about 20-25 million variations out of the three billion. What we expect to do through statistical methods probably early- to mid-2015 is that instead of measuring and trying to estimate 50,000 effects, we’ll be trying to do it for probably 20-25 million effects.”

“Instead of having these 50,000 markers that are close to the variations that cause the different expression, we hope now that in our dataset we’ll have all the variations that actually cause that.”

Spelman said the hope was that having this expanded dataset would improve the accuracy of predictions.

Changing technologies have made the cost proposition of sequencing entire genomes more viable.

Progress so far

When LIC started pursuing genomic selection six years ago, Spelman said they had the genotypes of 5000-6000 bulls, from which they estimated trait effects based on their daughters’ herd test information.

“Now we’ve probably genotyped about 115,000 animals in total.”

A lot of those 115,000 animals are cows, and Spelman estimated that about nine cows had to be genotyped to give the same information as one bull. This was because a cow had just her own lactation performance – her herd-testing history – while a progeny-tested bull had 80-odd daughters in his proof.

How it’s used

Current LIC practice is to genomically screen all potential sire bulls before they are four months old.

“We progeny test just over 200 bulls a year, and those 200 animals are selected from a wider pool of 2000 animals based on genomic testing. We have bought in a more elite group of sires than without the technology,” Spelman said.

They also make the bulls’ semen available two years earlier than they would if they were waiting for progeny testing. That shortens the generation interval, which lifts the rate of genetic gain.

“The accuracy of the genomic proofs is certainly less than what it is with progeny testing. So we have a reliability of about 80-85% for a bull that has been progeny tested. For the genomics they sit around 55%.”

This means that while the genomic-only bulls should be genetically superior, their estimated merit is less certain and more likely to change.
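The trade-off between a less reliable proof and a shorter generation interval can be sketched with the standard breeder's equation, in which annual genetic gain is selection intensity times accuracy times genetic standard deviation, divided by the generation interval. The sketch below is illustrative only and is not LIC's calculation. It assumes accuracy is the square root of reliability, uses the reliability figures quoted above (82.5% as the midpoint of 80-85%, and 55%), and uses hypothetical generation intervals of six and four years to stand in for the two-year head start. The function name, the intervals and the unit values for intensity and standard deviation are all assumptions.

```python
import math


def annual_gain(intensity, reliability, genetic_sd, generation_interval):
    """Breeder's equation: expected genetic gain per year.
    Accuracy is taken as the square root of reliability (an assumption)."""
    accuracy = math.sqrt(reliability)
    return intensity * accuracy * genetic_sd / generation_interval


# Intensity and genetic SD are set to 1.0 so only reliability and
# generation interval differ between the two scenarios.
progeny_tested = annual_gain(1.0, 0.825, 1.0, generation_interval=6.0)  # hypothetical interval
genomic_only = annual_gain(1.0, 0.55, 1.0, generation_interval=4.0)     # hypothetical interval

print(f"Progeny-tested: {progeny_tested:.3f} per year")
print(f"Genomic-only:   {genomic_only:.3f} per year")
```

Under these assumed intervals the shorter generation interval more than offsets the lower reliability, though the result depends entirely on the interval values chosen.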

Mapping biology

Deoxyribonucleic acid (DNA) is the molecule that holds each species’ biological map. Each organism has a unique bundle of DNA molecules – its chromosomes – located in the nucleus of each of its cells.

DNA is made up of chemical building blocks called nucleotides that are linked into chains to form a strand of DNA. These building blocks are made up of three parts – a phosphate group, a sugar group, and one of four different nitrogen bases.

The four types of nitrogen bases are: adenine (A), thymine (T), guanine (G), and cytosine (C). The order of the bases – the sequence – determines the instructions held in each DNA strand. Pairs of the bases make up the “rungs” of the double helix ladder. Chemical properties mean A always pairs with T, and C always pairs with G.

During cell division, this structure enables the molecule to copy itself. The helix splits down the middle, becoming two single strands which effectively act as templates for two new, double-stranded DNA molecules.
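The pairing rule and the template idea can be shown with a short Python sketch. It is a simplification under stated assumptions: the function name `complement` and the example sequence are illustrative, and the sketch ignores strand direction and the cellular machinery that does the copying.

```python
# Base-pairing rule: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Build the partner strand by pairing each base with its match."""
    return "".join(PAIRS[base] for base in strand)


# An arbitrary example strand, chosen only for illustration.
template = "ATCGTT"
partner = complement(template)

# Using the partner as a template in turn recovers the original strand,
# which is why each half of a split helix can rebuild the whole molecule.
rebuilt = complement(partner)
print(template, partner, rebuilt)
```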

A gene is a specific DNA sequence that contains instructions to make a specific protein. As a simplified illustration, a sequence such as ATCGTT might carry the instruction for blue eyes. Real genes are far longer, ranging from about 1000 bases up to one million bases in humans.

The complete DNA map, or genome, for a human contains about three billion bases, and about 20,000 genes on 23 pairs of chromosomes.
