Bioinformatics Tools

Wednesday, April 30, 2014

How to make a protein soluble?

Cloning, expression and purification of difficult to clone, express and purify proteins in E. coli 


I have received some emails about expressing proteins that are difficult to purify, so I thought of putting together a short list of do's and don'ts. Pure bioinformatics people, please bear with me for a couple of posts.

First of all, it is important to know your protein: gather as much information about it as you can. All those small pieces of information help a lot when designing the strategy for cloning, expression and purification. Also know the source of the protein, whether eukaryotic, prokaryotic or something else. Basic parameters such as the size of the protein, pI and amino acid composition play a vital role in designing the strategy. I have compiled some tools for finding such information on this blog before: http://bioinformatictools.blogspot.in/2014/04/functional-annotation-of-hypothetical.html and http://bioinformatictools.blogspot.in/2011/11/in-silico-characterization-of-proteins.html. Look for other sources too. The main theme is to find out as much about the protein as you can.

I am not a big fan of purifying proteins under denaturing conditions. If the protein has to be refolded from denaturing conditions, there are lots of questions that are difficult to answer, such as whether the protein has folded properly and whether this is its native fold rather than some random refolding. These are hard to demonstrate experimentally unless you already have an assay in mind. Since I have tried that route too, I will end by sharing what I have learned about it.
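As a rough illustration of the basic parameters mentioned above, here is a minimal Python sketch that computes the amino acid composition and an approximate molecular weight from a sequence. The function names and the example sequence are made up for illustration, and the residue masses are standard average values; a dedicated tool will give more careful numbers.

```python
from collections import Counter

# Average residue masses in daltons (standard textbook values).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def composition(seq):
    """Return the fraction of each amino acid present in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in sorted(counts.items())}


def molecular_weight(seq):
    """Approximate average molecular weight in Da.

    Raises KeyError for any letter that is not one of the 20 standard residues.
    """
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER_MASS


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical sequence, for illustration only
    print(f"length: {len(example)} aa")
    print(f"approx. MW: {molecular_weight(example) / 1000:.2f} kDa")
    for aa, frac in composition(example).items():
        print(f"{aa}: {frac:.1%}")
```

Swap in your own sequence for `example`; the numbers are only meant as a quick first look before using the tools linked above.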

Downstream experimental procedures: Before designing the strategy for cloning, expression and purification, it is wise to decide which downstream experiments you are going to perform, because the strategy mainly depends on them. At times it is possible to purify a very small amount of soluble protein from a very large culture. That is fine if the downstream experiments need only a small amount of protein, and in that case you need not go through all the standardization trials with different vectors and host cells. However, if a large amount of protein is required (for example, for crystallization experiments), it is advisable to optimize the whole purification process.

Read as much as you can: There are various resources with suggestions for cloning, expression and purification of proteins in the soluble fraction (e.g. the QIAexpress handbook). But please keep in mind that it is easy to make suggestions for wet lab work, whereas actually performing the experiments the way one wishes takes a lot of time and energy. So try what you think is logical and, more importantly, what is easily available to you (do-able).

Membrane or membrane-associated protein: Check whether the selected protein is a membrane or membrane-associated protein. This can be done using subcellular localization prediction tools, some of which are listed here: http://bioinformatictools.blogspot.in/2007/09/predicting-subcellular-localization-of.html. Also check whether the protein has a transmembrane domain (TMHMM http://www.cbs.dtu.dk/services/TMHMM/) or a signal peptide (SignalP http://www.cbs.dtu.dk/services/SignalP/). These are hydrophobic regions. Membrane proteins are a bit tough to get in soluble form until one removes the transmembrane or signal peptide part. It is logical to remove the (normally N-terminal) transmembrane or signal peptide part to get the functional domain or domains in soluble form. (I had a similar problem with a protein I was working on: removing the signal peptide and transmembrane domain solved everything. The protein went into the soluble fraction, purified like a charm, and even crystallized.)
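To get a quick feel for where the hydrophobic stretches sit before running the dedicated predictors, a sliding-window hydropathy scan can be sketched in a few lines of Python. This is not what TMHMM or SignalP do internally; it is a much cruder Kyte-Doolittle average, and the window size (19) and cutoff (1.6) are commonly quoted defaults that I am assuming here. The function names are made up for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Return 1-based (start, end) ranges covered by hydrophobic windows.

    A window counts as hydrophobic when its mean Kyte-Doolittle score is at
    least `threshold`. Unknown letters are scored as 0.0 (an assumption).
    """
    seq = seq.upper()
    scores = [KD.get(aa, 0.0) for aa in seq]
    flagged = [False] * len(seq)
    for i in range(len(seq) - window + 1):
        if sum(scores[i:i + window]) / window >= threshold:
            for j in range(i, i + window):
                flagged[j] = True

    # Merge consecutive flagged residues into segments.
    segments = []
    start = None
    for pos, is_hydrophobic in enumerate(flagged):
        if is_hydrophobic and start is None:
            start = pos
        elif not is_hydrophobic and start is not None:
            segments.append((start + 1, pos))
            start = None
    if start is not None:
        segments.append((start + 1, len(seq)))
    return segments


if __name__ == "__main__":
    # Artificial sequence: a leucine-rich stretch between charged flanks.
    toy = "D" * 30 + "L" * 25 + "K" * 30
    print(hydrophobic_segments(toy))
```

Any segment flagged near the N-terminus is a candidate for the truncation described above, but confirm it with the proper prediction servers before designing primers.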

Check for functional domains in the protein, if any: This will help in determining the probable function of the protein. It will also point to other proteins with a similar domain and how they behave with respect to cloning, expression and purification in E. coli. If you can find a protein with a similar domain, try its cloning, expression and purification protocol for your target protein. Also, for some proteins the sequence-based properties change when a tag is added, for example the pI, so keep this in mind too.
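The effect of a tag on the pI can be estimated with a small calculation. The sketch below finds the pH at which the net charge is zero using simple Henderson-Hasselbalch terms and bisection. The pKa values are one commonly used set (an assumption; other published sets give slightly different answers), and both the tag and the target sequence are hypothetical.

```python
# Side-chain and terminal pKa values: one commonly used set (assumption).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_NTERM = 8.6
PKA_CTERM = 3.6


def net_charge(seq, ph):
    """Approximate net charge of the protein at the given pH."""
    seq = seq.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))   # free N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))  # free C-terminus
    for aa, pka in PKA_POSITIVE.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEGATIVE.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge


def estimate_pi(seq, tol=0.001):
    """Estimate the pI by bisection on pH 0-14 (net charge falls as pH rises)."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    his_tag = "MGSSHHHHHH"             # hypothetical N-terminal His tag
    target = "MDEELLKDAEEYVSDLDEGKTA"  # hypothetical acidic protein fragment
    print(f"pI without tag: {estimate_pi(target):.2f}")
    print(f"pI with tag:    {estimate_pi(his_tag + target):.2f}")
```

Comparing the two numbers for your own construct shows whether the tag moves the pI close to your buffer pH, which is worth knowing before choosing buffers.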

Optimize the temperature: Try different temperatures for growth and induction. The induction temperature is the more crucial of the two.
  1. Grow cells at 37 °C and induce at 37 °C.
  2. Grow cells at 37 °C and induce at 25 °C for a longer time.
  3. Grow cells at 37 °C and induce at 16 °C for a longer time.
  4. Grow cells at 25 °C and induce at 16 °C for a longer time.
  5. Grow cells at 37 °C, then chill at 16 °C for at least one hour before induction.

Low temperature decreases the rate of protein synthesis, and usually more soluble protein is obtained. Also, if the temperature is reduced before the cells are induced, the protein is more likely to end up in the soluble fraction; it somehow diverts the protein from the pathway into inclusion bodies (sorry, I do not know how).

Optimize the IPTG concentration: It is a good idea to test a gradient of IPTG at small scale (0.1, 0.2, 0.3 … mM) to find the amount required for an optimal expression level. Normally a very low level of IPTG is enough for optimal expression; a higher concentration is costly and does not improve the expression level much.
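For setting up the small-scale gradient, the volume of stock to add follows from C1V1 = C2V2. The sketch below assumes a 1 M IPTG stock and 10 mL test cultures (both are my assumptions, so adjust them to your lab), and it ignores the tiny volume added by the stock itself.

```python
def iptg_stock_volume_ul(final_mm, culture_ml, stock_m=1.0):
    """Microlitres of IPTG stock needed for a given final concentration.

    final_mm   : desired final concentration in mM
    culture_ml : culture volume in mL
    stock_m    : stock concentration in M (1 M assumed by default)

    mM x mL gives micromoles, and a 1 M stock holds 1 micromole per microlitre,
    so the result comes out directly in microlitres.
    """
    if stock_m <= 0:
        raise ValueError("stock concentration must be positive")
    return final_mm * culture_ml / stock_m


if __name__ == "__main__":
    culture_ml = 10.0  # assumed small-scale culture volume
    for final_mm in (0.1, 0.2, 0.3):
        volume = iptg_stock_volume_ul(final_mm, culture_ml)
        print(f"{final_mm} mM in {culture_ml} mL: add {volume:.1f} uL of 1 M stock")
```

For volumes this small it is usually easier to pipette from a diluted working stock (for example 100 mM), which only means passing `stock_m=0.1`.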

Use a large tag, but make sure you have an arrangement to remove it once you have the protein: Larger tags such as the intein tag, His-SUMO, GST and MBP (maltose binding protein) are known to increase the solubility of proteins. Use them if the corresponding vectors are easily available to you.

Change the vector: Using a weaker promoter (e.g. trc instead of T7) and a lower copy number plasmid normally increases the chance of getting the protein in the soluble fraction. Also, whether the tag is at the N- and/or C-terminus (in various vectors) affects the solubility of the protein, especially for proteins whose folding depends on one of these termini.

Change the host cells: Some E. coli strains handle toxic or membrane proteins better than others. I had a very good experience with the C41 and C43 strains, which I came to know through this paper: http://www.ncbi.nlm.nih.gov/pubmed/15294299. There are also pLysS versions of these strains; I did not try them, but you can read up and try. Other strains such as Rosetta might also be worth trying, depending on which strains you can get your hands on (so beg, borrow or steal ;)). For a new protein I usually test as many changes as I can, one by one, at small scale and then move the best ones to large scale. Also check whether your protein uses codons that are rarely used in E. coli. You can check rare codon usage with the various software tools available.
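A rare codon check is simple enough to sketch by hand. The Python below walks through a coding sequence codon by codon and reports the ones in a short "rare in E. coli" list. The list is the set most often quoted in this context (AGG, AGA, AUA, CUA, CCC, GGA, CGG); treat it as an assumption, since different tools use different lists and frequency cutoffs. The function name and the example sequence are made up.

```python
# Codons frequently cited as rare in E. coli (assumed list; tools differ).
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA", "CGG"}


def find_rare_codons(cds):
    """Return (codon_number, codon) for every rare codon in an in-frame CDS.

    Codon numbers are 1-based. RNA input is accepted (U is read as T).
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    hits = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon in RARE_CODONS:
            hits.append((i // 3 + 1, codon))
    return hits


if __name__ == "__main__":
    example_cds = "ATGAGGAGACTGCCCGGTTAA"  # hypothetical CDS, for illustration
    hits = find_rare_codons(example_cds)
    total = len(example_cds) // 3
    print(f"{len(hits)} rare codons out of {total}")
    for number, codon in hits:
        print(f"codon {number}: {codon}")
```

Clusters of rare codons, especially near the start of the gene, are a hint that a codon-supplemented host strain or a codon-optimized gene might be worth considering.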

Change the culture media: After changing and optimizing as many parameters as I could, I was still getting a low level of protein in the soluble fraction in LB media. I read somewhere that someone had a good yield with Terrific Broth, so I tried it, and it gave far more protein in the soluble fraction. I was happy to use it thereafter for any protein I had to purify.

Use auto-induction media: It is worthwhile trying auto-induction. The idea is that instead of adding an inducing agent like IPTG, one relies on the cell's native lac regulation. If you grow the cells in media containing both glucose and lactose, they use the glucose first. As the glucose is depleted, the cells start taking up lactose in its place, which switches on the lac-controlled promoters and, in turn, expression from your T7-based vector. This leads to a much more gradual expression than with IPTG.
