Experimental determination of translation elongation rates

This repository contains scripts for determining per-gene translation elongation rates from ribosome profiling runoff experiments. To date, these techniques have not been applied successfully at the per-gene level, because ribosome profiling is agnostic to the arrangement of ribosomes on a single transcript. It is therefore hard to extract information about ribosome stalling on an individual transcript from ribosome profiling data. Our starting point was Ingolia et al.: Ribosome Profiling of Mouse Embryonic Stem Cells Reveals the Complexity and Dynamics of Mammalian Proteomes, Cell (2011), which introduces the so-called SL method. This method is applied to single genes in Dana, Tuller: Determinants of Translation Elongation Speed and Ribosomal Profiling Biases in Mouse Embryonic Stem Cells, PLoS Comp. Biol. (2012). Finally, we also undertook a modeling effort of our own, which is discussed in the last methods section.

Ribosome profiling runoff experiments

The general philosophy behind all determinations of translation elongation speed via ribosome profiling is the following: translation initiation is blocked ¹ with harringtonine, and cycloheximide is applied later to stop all ribosome movement. This can be done for different time separations between the two treatments. The crucial step is then to construct an observable from the ribosome profiling data that changes at the speed of runoff. The following two sections describe the SL method, which uses the starting location as an observable that moves at the speed of translation elongation, and the CR method, which uses the riboseq coverage of the first half of the transcript divided by the coverage of the second half. The latter quantity should decay at the speed of translation elongation.

SL method

This is the original method for computing global estimates of translation elongation rates from riboseq runoff experiments, from Ingolia et al. 2011. “Ribosome Profiling of Mouse Embryonic Stem Cells Reveals the Complexity and Dynamics of Mammalian Proteomes.” Cell 147 (4). Here, it is applied at the single-transcript level (as previously done by Dana, Tuller. 2012. "Determinants of Translation Elongation Speed and Ribosomal Profiling Biases in Mouse Embryonic Stem Cells". PLoS Comp. Biol.). The script that implements this method is differential_translation_speed.py.

Input

Required input files are:

a .tsv file with the transcript ID, corresponding gene ID, and CDS coordinates on the transcript
a dictionary with time separations, .bam/.sam mapping files at the corresponding time separation, and the corresponding .json file with the p-site-offsets
column names indicating the method of reference location determination (SL_location in our case) and the corresponding error column
the size of bins for depletion curves (15 nt in our case)
the number of jackknife bins (for error estimation, usually 10-20)
a cutoff value for chi^2_reduced (transcripts with fits that produce worse values are discarded)
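
As a rough illustration of how the bin size and a recovery threshold could enter, the sketch below bins a p-site profile and a reference (t = 0) profile, forms their ratio, and reports the centre of the first bin where the ratio reaches the threshold. The function name `sl_location`, its arguments, and the bin-centre convention are assumptions made for illustration; they are not taken from differential_translation_speed.py.

```python
import numpy as np


def sl_location(profile, reference, bin_size=15, threshold=0.5):
    """Hypothetical helper: position (nt from the start of the profile) where
    the binned profile first recovers to `threshold` times the reference."""
    n_bins = min(len(profile), len(reference)) // bin_size
    usable = n_bins * bin_size
    # Sum p-site counts within consecutive bins of `bin_size` nucleotides.
    prof = np.asarray(profile[:usable], dtype=float).reshape(n_bins, bin_size).sum(axis=1)
    ref = np.asarray(reference[:usable], dtype=float).reshape(n_bins, bin_size).sum(axis=1)
    # Depletion curve: ratio to the reference; bins with an empty reference stay NaN.
    ratio = np.divide(prof, ref, out=np.full(n_bins, np.nan), where=ref > 0)
    hits = np.flatnonzero(ratio >= threshold)
    if hits.size == 0:
        return None  # the curve never recovers within the profile
    # Report the centre of the first bin that reaches the threshold.
    return (hits[0] + 0.5) * bin_size


# Toy example: ribosomes have run off the first 90 nt of a 300 nt profile.
reference = np.ones(300)
runoff = np.concatenate([np.zeros(90), np.ones(210)])
location = sl_location(runoff, reference)
```

In this toy example the first six bins are empty, so the reported location is the centre of the seventh bin. A real profile is noisy, which is one reason the choice of threshold matters.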

Procedure

The script processes the input as follows:

reading the CDS coordinates, transcript ID, and gene ID into internal dictionaries
creating profiles from the mapping (and offset) files
computing the (SL) locations for each transcript at each jackknife bin
computing the error (from jackknife bins) for each transcript and time point
performing the fit for each transcript
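
The error step can be pictured with a standard delete-one-group jackknife. The sketch below assumes that reads are assigned at random to the jackknife bins and that the observable is recomputed with one bin left out at a time; how the scripts actually form their bins is not stated here, so `leave_one_out_estimates`, `jackknife_error`, and the random grouping are illustrative assumptions.

```python
import numpy as np


def leave_one_out_estimates(read_positions, n_bins, statistic, seed=0):
    """Hypothetical helper: recompute `statistic` with each jackknife bin removed."""
    rng = np.random.default_rng(seed)
    pos = np.asarray(read_positions)
    # Assumption: every read is assigned to one of n_bins groups at random.
    groups = rng.integers(0, n_bins, size=pos.size)
    return [statistic(pos[groups != g]) for g in range(n_bins)]


def jackknife_error(estimates):
    """Standard delete-one jackknife standard error of the estimates."""
    est = np.asarray(estimates, dtype=float)
    n = est.size
    return float(np.sqrt((n - 1) / n * np.sum((est - est.mean()) ** 2)))


# Toy example: error of the mean read position using 10 jackknife bins.
positions = np.arange(100)
estimates = leave_one_out_estimates(positions, n_bins=10, statistic=np.mean)
error = jackknife_error(estimates)
```

The same two functions apply to any per-transcript observable, for example a starting location or a coverage ratio, by passing a different `statistic`.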

Output

The following output is created:

the SL curves in file p-sites.tsv
the (SL) locations in file locations.tsv
the locations with errors in locations_with_errors.tsv
the elongation speeds in elongation_speeds.tsv

CR method

The goal of the CR method is to overcome some of the shortcomings of the SL method:

The SL method uses one particular curve as reference (t = 0). This introduces additional noise.
The recovery threshold of the SL method (0.5) is an arbitrary choice. Different choices have a high impact on the result.
Genes that have a high ribosome flux (speed times coverage) dominate the global estimate. The average of single-gene SL estimates is about 3 times smaller than the global estimate as described above.
It can only be applied to sufficiently long genes (length of the CDS > 3000 nt).
It requires sufficient coverage of a transcript.

The alternative observable that varies with the speed of translation elongation is the ratio of the coverage of the first half of the transcript to the coverage of the second half. This quantity should fall linearly in time, with the translation elongation rate given by the negative slope.
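
A minimal sketch of this observable, assuming the input is an array of p-site counts per nucleotide over the CDS. The function name `coverage_ratio` and the option to drop the first codon (motivated by the harringtonine caveat in the footnote) are illustrative assumptions, not the interface of compute_elong_speed_coverage_ratio.py.

```python
import numpy as np


def coverage_ratio(psite_counts, skip_first_codon=True):
    """Hypothetical helper: coverage of the first half of the CDS divided by
    the coverage of the second half."""
    counts = np.asarray(psite_counts, dtype=float)
    if skip_first_codon:
        # Harringtonine holds ribosomes on the start codon, so drop it.
        counts = counts[3:]
    half = counts.size // 2
    first, second = counts[:half].sum(), counts[half:].sum()
    # An empty second half leaves the ratio undefined.
    return float(first / second) if second > 0 else float("nan")


# Toy example: a start-codon peak, then a depleted first half.
counts = [5, 5, 5] + [1] * 10 + [2] * 10
ratio = coverage_ratio(counts)
```

Here the first half carries 10 counts and the second half 20, so the ratio is 0.5. Repeating this at each time separation gives the curve that is fitted.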

The method is implemented in compute_elong_speed_coverage_ratio.py.

Input

Required input files are:

a .tsv file with the transcript ID, corresponding gene ID, and CDS coordinates on the transcript
a dictionary with time separations, .bam/.sam mapping files at the corresponding time separation, and the corresponding .json file with the p-site-offsets
the number of jackknife bins (for error estimation, usually 10-20)
a cutoff value for chi^2_reduced (transcripts with fits that produce worse values are discarded)

Procedure

The script processes the input as follows:

reading the CDS coordinates, transcript ID, and gene ID into internal dictionaries
computing the coverage ratios from .sam and .json files
computing the error (from jackknife bins) for each transcript and time point
performing the fit for each transcript
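
The fitting and filtering steps can be sketched as a weighted straight-line fit followed by a reduced chi-square check. This is a generic closed-form weighted least-squares fit under the assumption that the jackknife errors are used as weights; the function name `weighted_line_fit` and the cutoff value below are hypothetical and not taken from the scripts.

```python
import numpy as np


def weighted_line_fit(t, y, sigma):
    """Hypothetical helper: fit y = slope * t + intercept with weights 1 / sigma**2.
    Returns (slope, intercept, chi2_reduced)."""
    t, y, sigma = (np.asarray(a, dtype=float) for a in (t, y, sigma))
    w = 1.0 / sigma**2
    # Closed-form solution of the weighted normal equations.
    s, st, sy = w.sum(), (w * t).sum(), (w * y).sum()
    stt, sty = (w * t * t).sum(), (w * t * y).sum()
    delta = s * stt - st**2
    slope = (s * sty - st * sy) / delta
    intercept = (stt * sy - st * sty) / delta
    # Reduced chi-square: weighted squared residuals per degree of freedom.
    dof = t.size - 2
    resid = y - (slope * t + intercept)
    chi2_red = float((w * resid**2).sum() / dof) if dof > 0 else float("nan")
    return float(slope), float(intercept), chi2_red


# Toy example: four time separations (in seconds) with unit errors.
times = [0, 60, 120, 180]
values = [10, 130, 250, 370]
slope, intercept, chi2_red = weighted_line_fit(times, values, [1, 1, 1, 1])

CHI2_CUTOFF = 5.0  # hypothetical cutoff; the actual value is a user input
keep_transcript = chi2_red <= CHI2_CUTOFF
```

For the coverage ratio the fitted slope is expected to be negative, and its sign is flipped to report the elongation rate.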

Output

The following output is created:

the coverage ratios in file p-sites.tsv
the coverage ratios with errors in locations_with_errors.tsv
the elongation speeds in elongation_speeds.tsv

License

The code is published under the MIT license.

¹ There is one important caveat to this: harringtonine blocks translation initiation by holding the ribosome on the start codon and preventing it from making its first elongation step. The start codons are therefore occupied, and no new ribosomes can enter the downstream parts of the CDS. However, this means that the first codon has to be discarded from all runoff analyses.

zavolanlab/experimental-determination-elongation-rates
