SUPPLEMENTARY INFORMATION for

OptSSeq: High-throughput sequencing readout of growth enrichment defines optimal gene expression elements for homoethanologenesis

Indro Neil Ghosh and Robert Landick

Supplementary Tables
Table S1    Strains and plasmids used in study
Table S2    Primers used in study
Table S3    Sequences and predicted TIRs of RBSs tested
Table S4    Sequences, CSS and predicted RNAP binding strengths of promoters tested
Table S5    Apparent OD600 and ethanol yields of strain libraries during passagings
Table S6    Normalized rates of RFP production from a set of promoters
Table S7    Enzyme levels in strains driving Pdc from strong predicted RBSs
Table S8    σ70 level measurements in E. coli strain lysates

Supplementary Figures
Figure S1   End product predictions and measurements for homoethanologens
Figure S2   Average growth rates of libraries obtained at the end of each passage of growth enrichment
Figure S3   Primer binding sites and sections of homoethanologenic cassettes amplified to enable high throughput sequencing of promoters and RBSs
Figure S4   Promoter sequences present in library populations after each passage of growth enrichment
Figure S5   RBS sequences present in library populations after each passage of growth enrichment
Figure S6   Potential compensatory co-selection of promoters and first-gene RBSs
Figure S7   Selection of adhB and adhA RBSs in ΔadhE and adhE+ strains
Figure S8   Degree of enrichment (Ex) for each gene expression element at each stage of library preparation and growth enrichment
Figure S9   Quantitation of protein levels by western blot
Figure S10  Aggregate protein levels in E. coli strains containing the ethanologenic plasmids before and after optimization (including σ70 levels) compared to levels produced by Z. mobilis
SUPPLEMENTARY TABLES TABLE S: Strains and plasmids used in study Name Strains MG655 RL3000 RL308 RL309 Description E. coli K-2 F λ ilvg rfb-50 rph- crl::ins-i yche::ins5-u glpr gatc flhdp::insab-5 MG655 ilvg + rph + rfb-50 ycii::82bp nudf(g02v) Δ(glcByghO) crl + (Δins-I) ybhj(l54i) yche + (Δins5-U) mntp(g25d) yecd(n86h) gatc + glpr + flhd + ( insab-5) RL3000 acka frda ldha RL3000 acka frda ldha adhe Source or Note Blattner et al, 997 DH0B E. coli K-2 F λ enda deor + reca, gale5, galk6, nupg, rpsl35(str R ) Δ(lac)X74 φ80laczδm, arad39 Δ(ara,leu)7697 mcra - Δ(mrr-hsdRMS-mcrBC) Durfee et al, 2008 2 Z. mobilis Z. mobilis ZM4 ATCC 382 ATCC s a pbbr - MCS-5 ppbwt ppbasyn prh52 prs002 ppba00 derived from the plasmid pbbr with the cryptic pbbr ori encoding gentamycin resistance; aacc(gn R )-Plac-lacZα-MCS-[mob-rep](pBBRori) derived from pbbr-mcs5 and ploi295 containing in order pbbrori-aacc(gn R )-Plac-RBSZM_pdc-Z. mobilis pdc- RBSZM_adhB-Z. mob. adhb encoding codon-optimized pdc, adhb and adha; pbbrori- Pdps00-PydfZ-RBSZM_pdc-pdc- RBSZM_adhB-adhB RBSZM_adhA-adhA-evoglow-flp-aphA(Kn R )-flp-tp22-aacc(gn R ) derived from pbbr-mcs-5 encoding spectinomycin resistance gene aada and codon optimized laci pbbrori-aada(sp R )-Plac-lacI-TBsu glna-ptrc-laczα-gfp-tp22 derived from prh52 with a MCS flanked by terminators. MCS contains EcoRV site. pbbrori-aada(sp R )- TBsu glna-mcs-gfp-tp22 derived from prs002 and ppbasyn encoding library of PBA cassettes containing in order: pbbrori-aada(sp R )- TBsu glna-pprom_lib-rbspdc_lib-pdc-rbsadhb_lib-adhb-rbsadha_libadha-gfp-tp22 Kovach, et al 995 3 Gardner and Keating, 200 4
pabp00 ppb00 ppa00 pgr-bba- B000 ping00 derived from prs002 and ppbasyn encoding library of ABP cassettes containing in order: pbbrori-aada(sp R )- TBsu glna-pprom_lib-rbsadha_lib-adha-rbsadhb_lib-adhb-rbspdc_libpdc-gfp-tp22 derived from prs002 and ppbasyn encoding library of PB cassettes containing in order: pbbrori-aada(sp R )- TBsu glna- Pprom_lib-RBSpdc_lib-pdc-RBSadhB_lib-adhB-gfp-TP22 derived from prs002 and ppbasyn encoding library of PA cassettes containing in order: pbbrori-aada(sp R )- TBsu glna- Pprom_lib-RBSpdc_lib-pdc-RBSadhA_lib-adhA-gfp-TP22 pbr322 based plasmid encoding ampicillin (Ap) resistance and arabinose inducible promoter. Developed to test termination efficiency of rrnbt terminator placed between gfp and rfp. pbr322ori-bla(ap R )-arac-pbad-gfp-trrnbt-rfp-trrnbt derived from prh52 and pgr-bba-b000 designed to test transcription from promoter#. pbbrori-aada(sp R )-PT7A-gfp-TP22-TBsu glna-ppromoter#-rfp-trrnbt Chen et al, 203 5 ping002 derived from ping00 encoding promoter#2 ping003 derived from ping00 encoding promoter#3 ping009 derived from ping00 encoding promoter#9 ping03 derived from ping00 encoding promoter#3 ping05 derived from ping00 encoding promoter#5 ping025 derived from ping00 encoding promoter#25 ping027 derived from ping00 encoding promoter#27 ping03 derived from ping00 encoding promoter#3 ping033 derived from ping00 encoding promoter#33 ping049 derived from ping00 encoding promoter#49 ping05 derived from ping00 encoding promoter#5 pl7a04f pl7a2e pl7a03e PBA plasmid that encodes promoter ID#37, pdc RBS ID# pbbrori-aada(sp R )-TBsu glna-ppromoter#37-rbspdc#-pdc- RBSadhB#4-adhB RBSadhA#2-adhA-gfp-TP22 PBA plasmid that encodes promoter ID#37, pdc RBS ID#2 pbbrori-aada(sp R )-TBsu glna-ppromoter#37-rbspdc#2-pdc- RBSadhB#5-adhB RBSadhA#6-adhA-gfp-TP22 PBA plasmid that encodes promoter ID#37, pdc RBS ID#3 pbbrori-aada(sp R )-TBsu glna-ppromoter#37-rbspdc#3-pdc-
pla08b prm630 ping004 ping04 ping042 RBSadhB#5-adhB RBSadhA#3-adhA-gfp-TP22 PBA plasmid that encodes promoter ID#37, pdc RBS ID#4 pbbrori-aada(sp R )-TBsu glna-ppromoter#37-rbspdc#4-pdc- RBSadhB#9-adhB RBSadhA#4-adhA-gfp-TP22 Expression plasmid derived from pet28 encoding 0 his tagged NusG ColEori-lacI-PT7-lacO-0 his-ppx-nusg-tt7-m3oriapha(kn R ) derived from prm630 to express 0 his tagged Pdc ColEori-lacI-PT7-lacO-0 his-ppx-pdc-tt7-m3ori-apha(kn R ) derived from prm630 to express 0 his tagged AdhB ColEori-lacI-PT7-lacO-0 his-ppx-adhb-tt7-m3oriapha(kn R ) derived from prm630 to express 0 his tagged AdhA ColEori-lacI-PT7-lacO-0 his-ppx-adhb-tt7-m3oriapha(kn R ) Sevostyano va et al, 20 6 a Gene names with asterisks (e.g., pdc) indicate codon-optimized genes
TABLE S2: Primers used in study (a)

Primer name        Primer sequence (5′-3′)

To generate promoter libraries
Prom_1             TATGTTAATACACCATCACAGAATTGTGAGCGCTCACAATCTAGGTCTATGAGTGGT
Prom_2             ATVCGAGCCGGATGATTAATTRTMARGAGGTCCAGCAACCACTCATAGACCTAGATTGTG
Prom_3             AATTAATCATCCGGCTCGBATAATGBGTGGAATTGGTAGAGTATTTTTATTGCGCGGTCA
Prom_PBA_4         TTAGATGATATATGGCGGGTGACCGCGCAATAAAAATACT
Prom_ABP_4         AAGATAGTGGATTTAAGGTGACCGCGCAATAAAAATACT

To generate fragment libraries to construct plasmids
PBA_pdc_fw         GAGTATTTTTATTGCGCGGTCACCTTAAATCCACTATCTTMAGGABRTGTTACATGTCCTATACTGTC
PBA_pdc_rev        CATACTAGATTGCAAAAATTACAGCAACTTATTGAC
PBA_adhb_fw        GCCGGTCAATAAGTTGCTGTAATTTTTGCAATCTAGTATGSCCTTAWGKGKGATAGCTATGGCCTCGAGCAC
PBA_adhb_rev       TTAGATGATATATGGCGTTAGAATGCGC
PBA_adha_fw        AGCTGTTCCTGAGCGCATTCTAACGCCATATATCATCTAARGABGWTCACCATGAAAGCTGCAGTTATC
PBA_adha_rev       TTTCTAGAACTAGGGATCCCCCGGGCTGCAGGAATTCGATGTTAATGATGCGTGAAGTCGAC
ABP_adha_fw        GAGTATTTTTATTGCGCGGTCACCCGCCATATATCATCTAARGABGWTCACCATGAAAGCTGCAGTTATC
ABP_adha_rev       CATACTAGATTGCAAAAAGTTAATGATGCGTGAAGTCGAC
ABP_adhb_fw        GTCGACTTCACGCATCATTAACTTTTTGCAATCTAGTATGSCCTTAWGKGKGATAGCTATGGCCTCGAGCAC
ABP_adhb_rev       AAGATAGTGGATTTAATTAGAATGCGCTCAGGAACA
ABP_pdc_fw         GAGCTGTTCCTGAGCGCATTCTAATTAAATCCACTATCTTMAGGABRTGTTACATGTCCTATACTGTC
ABP_pdc_rev        TTTCTAGAACTAGGGATCCCCCGGGCTGCAGGAATTCGATTTACAGCAACTTATTGACCG
PB_adhb_rev        TTTCTAGAACTAGGGATCCCCCGGGCTGCAGGAATTCGATTTAGAATGCGCTCAGGAACA
PA_adha_fw         GCCGGTCAATAAGTTGCTGTAATTTTTGCAATCTAGTATGRGABGWTCACCATGAAAGCTGCAGTTATC

To amplify sections of plasmid libraries for sequencing
prom_up_fw         AATCTTCGGTAGTCCAGCGGGTCTATGAGTGGTTGCTGGA
pdc_end_fw         AATCTTCGGTAGTCCAGCGGCCGGTCAATAAGTTGCTGT
adhb_end_fw        AATCTTCGGTAGTCCAGCGCGCATTCTAACGCCATATATCA
adha_end_fw        AATCTTCGGTAGTCCAGCGTGTCGACTTCACGCATCATT
adhb2_end_fw       AATCTTCGGTAGTCCAGCGTGTTCCTGAGCGCATTCTAA
pdc2_end_fw        AATCTTCGGTAGTCCAGCGGCCGGTCAATAAGTTGCTGT
pdc_st_rev         TGTAGGCTGGAGCTGCTTCGCAGACGCTCCGCTAAATAGG
adhb_st_rev        TGTAGGCTGGAGCTGCTTCGACGTGCTCGAGGCCATAG
adha_st_rev        TGTAGGCTGGAGCTGCTTCGGGCGTAATTTGGTGTCTTTCA
5′ Stem + Index    AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNAATCTTCGGTAGTCCAGCG
3′ Stem            CAAGCAGAAGACGGCATACGAGATCTTCCGATCTTGTAGGCTGGAGCTGCTTCG

a Annealing sequences are in bold, and variable sequences are in red.
TABLE S3: Sequences and predicted TIRs of RBSs tested

Cassette   pdc RBS region encoding 12 RBSs (b,c)
PBA        TATTGCGCGGTCACCTTAAATCCACTATCTTMAGGABRTGTTACATG
ABP        CTGAGCGCATTCTAATTAAATCCACTATCTTMAGGABRTGTTACATG
PB         TATTGCGCGGTCACCTTAAATCCACTATCTTMAGGABRTGTTACATG
PA         TATTGCGCGGTCACCTTAAATCCACTATCTTMAGGABRTGTTACATG
ppbasyn    TGCGCGGTCACCTTAAAAAATCCACTTAAGAAGGTAGGTGTTACATG
ppbwt      TCATCCTGATTCAGACATAGTGTTTTGAATATATGGAGTAAGCAATG

                          Predicted TIR (a)
RBS ID     Sequence (b)   PBA, PB, PA   ABP
1          TAAGGAGGT      9467          5566
2          TCAGGAGGT      22427         4299
3          TAAGGAGAT      48            6960
4          TCAGGAGAT      4242          4056
5          TAAGGATGT      3706          2259
6          TCAGGATGT      377           37
7          TAAGGATAT      44            878
8          TAAGGACGT      203           733
9          TCAGGATAT      535           52
10         TCAGGACGT      447           427
11         TAAGGACAT      63            373
12         TCAGGACAT      228           28
ppbasyn    See above      3707
ppbwt      See above      4232

Cassette   adhB RBS region encoding 16 RBSs (b,c)
PBA        TGCTGTAATTTTTGCAATCTAGTATGSCCTTAWGKGKGATAGCTATG
ABP        TCATTAACTTTTTGCAATCTAGTATGSCCTTAWGKGKGATAGCTATG
PB         TGCTGTAATTTTTGCAATCTAGTATGSCCTTAWGKGKGATAGCTATG
ppbasyn    TCCAGCTCGGTACCCAATCTAGTATGTAGGGTGAGGTTATAGCTATG
ppbwt      TCGAGCTCGGTACCCAAACTAGTATGTAGGGTGAGGTTATAGCTATG

RBS ID     Sequence (b)    Predicted TIR (a) in PBA, ABP, PB, PA
1          GCCTTAAGGGGGA   23459
2          CCCTTAAGGGGGA   5557
3          GCCTTATGGGGGA   260
4          CCCTTATGGGGGA   00
5          GCCTTAAGTGGGA   839
6          GCCTTAAGGGTGA   733
7          CCCTTAAGGGTGA   70
8          CCCTTAAGTGGGA   64
9          CCCTTAAGTGTGA   32
10         GCCTTATGGGTGA   276
11         CCCTTATGGGTGA   252
12         GCCTTATGTGGGA   39
13         CCCTTATGTGGGA   27
14         GCCTTAAGTGTGA   8
15         CCCTTATGTGTGA   38
16         GCCTTATGTGTGA   8
ppbasyn    See above       396
ppbwt      See above       526

Cassette   adhA RBS region encoding 12 RBSs (b,c)
PBA        CCTGAGCGCATTCTAACGCCATATATCATCTAARGABGWTCACCATG
ABP        TTATTGCGCGGTCACCCGCCATATATCATCTAARGABGWTCACCATG
PA         AATAAGTTGCTGTAATTTTTGCAATCTAGTTAARGABGWTCACCATG
ppbasyn    CCTGAGCGCATTCTAACGCCATATATCAACAAAAGGTAGTCACCATG

                          Predicted TIR (a)
RBS ID     Sequence (b)   PBA       ABP       PA
1          TAAGGAGGT      287767    252523    2493
2          TAAGGAGGA      222625    9093      583
3          TAAGGACGT      2439      34859     670
4          TAAGGATGT      33626     33626     670
5          TAAAGAGGT      69087     3375      203
6          TAAGGACGA      720       6440      409
7          TAAAGAGGA      3069      3069      802
8          TAAGGATGA      4642      4642      409
9          TAAAGACGT      4056      4056      372
10         TAAAGATGT      974       974       356
11         TAAAGACGA      98        98        372
12         TAAAGATGA      249       249       284
ppbasyn    See above      5067

a Predicted TIRs calculated using RBS calculator (7).
b RBS is bold.
c Variable sequences are red.
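The RBS libraries in Table S3 are written with IUPAC degenerate bases (M, B, R, S, W, K), so each degenerate core stands for a fixed set of concrete sequences. The following is a minimal sketch of that expansion using the standard IUPAC nucleotide codes; the function name `expand_degenerate` and the choice of which bases to treat as the RBS "core" are illustrative assumptions, not part of the original methods.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(seq):
    """Return every concrete sequence encoded by a degenerate IUPAC string."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Degenerate RBS cores as they appear inside the Table S3 regions
# (core boundaries chosen here to match the listed RBS sequences).
pdc_rbs = expand_degenerate("TMAGGABRT")       # M x B x R = 2 x 3 x 2
adhb_rbs = expand_degenerate("SCCTTAWGKGKGA")  # S x W x K x K = 2 x 2 x 2 x 2
adha_rbs = expand_degenerate("TAARGABGW")      # R x B x W = 2 x 3 x 2

print(len(pdc_rbs), len(adhb_rbs), len(adha_rbs))
```

The library sizes follow directly from the product of the number of bases allowed at each degenerate position, which is why the pdc and adhA regions each encode 12 RBSs and the adhB region encodes 16.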
TABLE S4: Sequences, CSS and predicted RNAP binding strengths of promoter tested Promoter sequence gtggttgctggacctcytkayaattaatcatccggctcgbataatgbgtggaattg ID Promoter Sequence (-34 to -4) CSS a Binding b TTGACAattaatcatccggctcgTATAATGTG.0000 00 2 TTGACAattaatcatccggctcgTATAATGGG 0.9999 69 3 TTGACAattaatcatccggctcgTATAATGCG 0.9995 88 4 TTGATAattaatcatccggctcgTATAATGTG 0.9565 5 TTGATAattaatcatccggctcgTATAATGGG 0.9564 76 6 TTGATAattaatcatccggctcgTATAATGCG 0.9560 97 7 TTGACAattaatcatccggctcgCATAATGTG 0.923 3 8 TTGACAattaatcatccggctcgCATAATGGG 0.922 2 9 TTGACAattaatcatccggctcgCATAATGCG 0.9209 27 0 TTGATAattaatcatccggctcgCATAATGTG 0.8778 35 TTGATAattaatcatccggctcgCATAATGGG 0.8777 24 2 TTGATAattaatcatccggctcgCATAATGCG 0.8774 3 3 TTGACAattaatcatccggctcgGATAATGTG 0.8689 23 4 TTGACAattaatcatccggctcgGATAATGGG 0.8688 5 5 TTGACAattaatcatccggctcgGATAATGCG 0.8685 20 6 TTTACAattaatcatccggctcgTATAATGTG 0.8578 37 7 TTTACAattaatcatccggctcgTATAATGGG 0.8577 25 8 TTTACAattaatcatccggctcgTATAATGCG 0.8573 32 9 TTGATAattaatcatccggctcgGATAATGTG 0.8254 25 20 TTGATAattaatcatccggctcgGATAATGGG 0.8253 7 2 TTGATAattaatcatccggctcgGATAATGCG 0.8249 22 22 TTTATAattaatcatccggctcgTATAATGTG 0.842 4 23 TTTATAattaatcatccggctcgTATAATGGG 0.84 28 24 TTTATAattaatcatccggctcgTATAATGCG 0.838 36 25 TTTACAattaatcatccggctcgCATAATGTG 0.779 26 TTTACAattaatcatccggctcgCATAATGGG 0.7790 6.7 27 TTTACAattaatcatccggctcgCATAATGCG 0.7786 9.
28 TTTATAattaatcatccggctcgCATAATGTG 0.7356 2 29 TTTATAattaatcatccggctcgCATAATGGG 0.7355 7.6 30 TTTATAattaatcatccggctcgCATAATGCG 0.735 0 3 TTTACAattaatcatccggctcgGATAATGTG 0.7267 7.2 32 TTTACAattaatcatccggctcgGATAATGGG 0.7266 4.3 33 TTTACAattaatcatccggctcgGATAATGCG 0.7262 6. 34 TTTATAattaatcatccggctcgGATAATGTG 0.683 8.2 35 TTTATAattaatcatccggctcgGATAATGGG 0.6830 5.0 36 TTTATAattaatcatccggctcgGATAATGCG 0.6827 7.0 37 CTGACAattaatcatccggctcgTATAATGTG 0.48 35 38 CTGACAattaatcatccggctcgTATAATGGG 0.480 24 39 CTGACAattaatcatccggctcgTATAATGCG 0.476 3 40 CTGATAattaatcatccggctcgTATAATGTG 0.3746 39 4 CTGATAattaatcatccggctcgTATAATGGG 0.3744 27 42 CTGATAattaatcatccggctcgTATAATGCG 0.374 34 43 CTGACAattaatcatccggctcgCATAATGTG 0.3394 0 44 CTGACAattaatcatccggctcgCATAATGGG 0.3393 6.2 45 CTGACAattaatcatccggctcgCATAATGCG 0.3390 8.5 46 CTGATAattaatcatccggctcgCATAATGTG 0.2959 47 CTGATAattaatcatccggctcgCATAATGGG 0.2958 7.2 48 CTGATAattaatcatccggctcgCATAATGCG 0.2954 0 49 CTGACAattaatcatccggctcgGATAATGTG 0.2870 6.8 50 CTGACAattaatcatccggctcgGATAATGGG 0.2869 4.0 5 CTGACAattaatcatccggctcgGATAATGCG 0.2865 5.7 52 CTTACAattaatcatccggctcgTATAATGTG 0.2758 2 53 CTTACAattaatcatccggctcgTATAATGGG 0.2757 7.8 54 CTTACAattaatcatccggctcgTATAATGCG 0.2754 0 55 CTGATAattaatcatccggctcgGATAATGTG 0.2435 7.7 56 CTGATAattaatcatccggctcgGATAATGGG 0.2434 4.6 57 CTGATAattaatcatccggctcgGATAATGCG 0.2430 6.5 58 CTTATAattaatcatccggctcgTATAATGTG 0.2323 4
59 CTTATAattaatcatccggctcgTATAATGGG 0.2322 8.8 60 CTTATAattaatcatccggctcgTATAATGCG 0.238 2 6 CTTACAattaatcatccggctcgCATAATGTG 0.972 2.4 62 CTTACAattaatcatccggctcgCATAATGGG 0.97 0.88 63 CTTACAattaatcatccggctcgCATAATGCG 0.967.8 64 CTTATAattaatcatccggctcgCATAATGTG 0.537 2.9 65 CTTATAattaatcatccggctcgCATAATGGG 0.536.2 66 CTTATAattaatcatccggctcgCATAATGCG 0.532 2.2 67 CTTACAattaatcatccggctcgGATAATGTG 0.448. 68 CTTACAattaatcatccggctcgGATAATGGG 0.447 69 CTTACAattaatcatccggctcgGATAATGCG 0.443 0.7 70 CTTATAattaatcatccggctcgGATAATGTG 0.02.5 7 CTTATAattaatcatccggctcgGATAATGGG 0.0 0.25 72 CTTATAattaatcatccggctcgGATAATGCG 0.008.0 a CSS - Consensus similarity score calculated utilizing dataset published by Oliphant et al. 8 See methods. b Binding - RNA Polymerase σ 70 - promoter binding strength predictions made using thermodynamic calculations by Brewster et al. 9 See methods.
TABLE S5: Apparent OD600 and ethanol yields of strain libraries during passagings

Library            Growth enrichment passage   Initial OD   Final OD   Ethanol yield (a)
PBA replicate 1    1                           0.058        0.45       58%
                   2                           0.052        0.53       75%
                   3                           0.045        0.57       78%
PBA replicate 2    1                           0.058        0.542      5%
                   2                           0.042        0.522      80%
                   3                           0.056        0.497      82%
ABP replicate 1    1                           0.035        0.455      73%
                   2                           0.033        0.520      82%
                   3                           0.045        0.542      80%
ABP replicate 2    1                           0.040        0.459      66%
                   2                           0.037        0.530      77%
                   3                           0.057        0.452      74%
PB replicate 1     1                           0.046        0.494      54%
                   2                           0.047        0.564      76%
                   3                           0.065        0.524      78%
PB replicate 2     1                           0.049        0.448      52%
                   2                           0.03         0.437      78%
                   3                           0.044        0.385      76%
PA replicate 1     1                           0.05         0.490      65%
                   2                           0.035        0.427      79%
                   3                           0.058        0.44       76%
PA replicate 2     1                           0.052        0.483      62%
                   2                           0.044        0.546      77%
                   3                           0.047        0.424      75%

a Calculated from glucose converted to ethanol by the end of each library passage. See Figure S2 for growth rates of each of the library passages.
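The footnote to Table S5 states that yields were calculated from glucose converted to ethanol. One plausible reading, sketched below, is that the yield is expressed as a percentage of the theoretical maximum of 2 mol ethanol per mol glucose fermented. That interpretation, the function name, and the example concentrations are assumptions made for illustration; the source does not give the exact formula.

```python
def ethanol_yield_percent(glucose_consumed_mM, ethanol_produced_mM):
    """Ethanol yield as a percentage of the theoretical maximum.

    Assumes the stoichiometric ceiling of 2 mol ethanol per mol glucose
    (glucose -> 2 ethanol + 2 CO2). Inputs are molar amounts in the same unit.
    """
    if glucose_consumed_mM <= 0:
        raise ValueError("glucose consumed must be positive")
    theoretical_ethanol = 2.0 * glucose_consumed_mM
    return 100.0 * ethanol_produced_mM / theoretical_ethanol

# Hypothetical example: 10 mM glucose consumed, 15 mM ethanol produced.
print(ethanol_yield_percent(10.0, 15.0))
```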
TABLE S6: Normalized rates of RFP production from a set of promoters Promoter# a CSS b Predicted Binding c Measured Strength (AU) d.0000 00 00 ± 4. 2 0.9999 69 49 ± 3.2 3 0.9995 88 4 ± 0.7 9 0.922 27 29 ± 2.0 3 0.8689 23 29 ± 4. 5 0.8685 20 24 ± 2.2 25 0.779 6.6 ± 5.4 27 0.7786 9. 2.4 ± 3.7 3 0.7267 7.2 BDL 33 0.7262 6. BDL 49 0.2870 6.8 BDL 5 0.2865 5.7 BDL a See Table S4 for sequences. b CSS - Consensus similarity score calculated utilizing dataset published by Oliphant et al. 8 See methods. c Predicted Binding - RNA Polymerase σ 70 - promoter binding strength predictions made using thermodynamic calculations by Brewster et al. 9 See methods. d Measured Strength Rate of RFP production in DH0B from a ping00 derivative normalized to GFP produced from a constant control promoter on the same plasmid. See methods. AU Arbitrary units. BDL Below detectable levels.
TABLE S7: Enzyme levels in strains driving Pdc from strong predicted RBSs pdc Pdc copy number per Predicted TIR b RBS# a cell 0 4 9467.3 ± 0.2 2 22427.6 ± 0.4 3 48 2.6 ± 0.3 4 4242 2. ± 0.3 a s pl7a04f (pdc RBS#), pl7a2e (pdc RBS#2), pl7a03e (pdc RBS#3), pla08b (pdc RBS#4) (see Table S) transformed into RL309. b Translation initiation rate calculated using RBS calculator. 7 TABLE S8: σ 70 level measurements in E. coli strain lysates Strain fmol σ 70 μg - TCP σ 70 molecules per cell a 0 3 ppba 40 ± 4 3.5 ±.0 ppba2 28 ± 2.4 ± 0.7 pabp 27 ± 2 2.3 ± 0.7 pabp2 25 ± 5 2. ± 0.7 ppb 23 ± 5.9 ± 0.7 ppb2 2 ± 3.8 ± 0.6 ppa 25 ± 5 2.2 ± 0.7 ppa2 2 ± 5.8 ± 0.6 ppbasyn 22 ± 6.9 ± 0.7 ppbwt 22 ± 4.9 ± 0.6 RL3000 2 ± 3.8 ± 0.4 RL3000 + O2 29 ± 2 2.9 ± 0.7 RL309 + O2 8 ± 4.7 ± 0.4 a Molecules per cell calculated from measurements of protein mass measured per cell: Anaerobic ethanologenic E. coli 40 ± 40 fg, Anaerobic RL3000 E. coli 40 ± 30 fg, Aerobic RL3000 E. coli 60 ± 40 fg, Aerobic RL309 E. coli 60 ± 20 fg.
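Footnote a of Table S8 says that molecules per cell were calculated from the measured protein mass per cell. A minimal sketch of that unit conversion follows: fmol of protein per μg total cell protein (TCP), multiplied by μg TCP per cell and by Avogadro's number. The function name and the input values in the example are hypothetical and are not taken from the table.

```python
AVOGADRO = 6.02214076e23  # molecules per mol

def molecules_per_cell(fmol_per_ug_tcp, protein_fg_per_cell):
    """Convert fmol protein per ug TCP into molecules per cell.

    fmol_per_ug_tcp:      target protein abundance, fmol per ug total cell protein
    protein_fg_per_cell:  total protein mass per cell, in femtograms
    """
    mol_per_ug = fmol_per_ug_tcp * 1e-15     # fmol -> mol
    ug_per_cell = protein_fg_per_cell * 1e-9  # fg -> ug
    return mol_per_ug * ug_per_cell * AVOGADRO

# Hypothetical inputs: 25 fmol per ug TCP and 150 fg protein per cell.
print(round(molecules_per_cell(25.0, 150.0)))
```

With these illustrative inputs the result is on the order of 10^3 molecules per cell, the same scale as the values reported in the table.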
[Figure S1: three bar-chart panels (A-C) showing metabolite flux (pmol h⁻¹ μg⁻¹ TCP), as consumption or production, for glucose, pyruvate, acetaldehyde, succinate, lactate, formate, acetate, and ethanol. Panel titles: (A) Computational prediction - Max biomass; iJR904 ferm. def. + Pdc + Adh. (B) Experimental measurement - RL309 + ppbasyn. (C) Experimental measurement - RL309 + ppb. BDL, below detectable levels.]

Figure S1. End product predictions and measurements for homoethanologens. (A) Predicted end-product profile for an ethanologenic E. coli strain (pdc+ adh+ ΔackA, ΔldhA, ΔfrdA, ΔadhE), assuming no alanine or valine secretion and a glucose consumption rate of 7.0 pmol h⁻¹ μg⁻¹ TCP. (B-C) End product profiles during exponential growth of RL309 ppbasyn (B) and RL309 ppb (C).
[Figure S2: four panels (PBA, ABP, PB, and PA libraries) plotting apparent growth rate (h⁻¹) against passage number (0-3).]

Figure S2. Average growth rates of libraries obtained at the end of each passage of growth enrichment.
[Figure S3: schematic of primer binding sites on each cassette. Primers labeled, in order along the cassette:
(A) PBA (pdc-adhB-adhA): prom_up_fw, pdc_st_rev, pdc_end_fw, adhb_st_rev, adhb_end_fw, adha_st_rev. PCR yields adapter-flanked amplicons encoding the library of genetic expression element variants.
(B) ABP (adhA-adhB-pdc): prom_up_fw, adha_st_rev, adha_end_fw, adhb_st_rev, adhb2_end_fw, pdc_st_rev.
(C) PB (pdc-adhB): prom_up_fw, pdc_st_rev, pdc2_end_fw, adhb_st_rev.
(D) PA (pdc-adhA): prom_up_fw, pdc_st_rev, pdc_end_fw, adha_st_rev.
(E) 5′ Stem + Index and 3′ Stem primers; Illumina 250 bp reads.]

Figure S3. Primer binding sites and sections of homoethanologenic cassettes amplified for high throughput sequencing of promoters and RBSs in (A) PBA, (B) ABP, (C) PB, and (D) PA library populations. Sequences of the primers depicted are provided in Table S2. All amplicons have identical 5′ and 3′ adapters encoded in the primers. (E) PCR-based attachment of 5′ and 3′ Illumina stem sequences and 5′ multiplexing barcode sequences to amplicons. Primers used to attach these stem and barcode sequences are provided in Table S2.
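Because the "5′ Stem + Index" primer in Table S2 places a 10-nt index (the run of N) directly upstream of the common 5′ adapter AATCTTCGGTAGTCCAGCG, a read can be split into its multiplexing index and the amplified insert. The sketch below assumes that the read begins at the first base of the index; that assumption, the function name `split_read`, and the example index are illustrative and are not a description of the authors' pipeline.

```python
ADAPTER_5 = "AATCTTCGGTAGTCCAGCG"  # common 5' amplicon adapter (Table S2)
INDEX_LEN = 10                     # length of the N run in "5' Stem + Index"

def split_read(read):
    """Split a read into (index, insert).

    Returns None when the 5' adapter is not found immediately after the index.
    Assumes the read starts at the first base of the index (an assumption).
    """
    read = read.upper()
    adapter_start = INDEX_LEN
    adapter_end = adapter_start + len(ADAPTER_5)
    if read[adapter_start:adapter_end] != ADAPTER_5:
        return None
    return read[:INDEX_LEN], read[adapter_end:]

# Hypothetical read: made-up index + adapter + the annealing part of prom_up_fw.
example = "ACGTACGTAC" + ADAPTER_5 + "GGTCTATGAGTGGTTGCTGGA"
print(split_read(example))
```

Reads that fail the adapter check return `None`, which keeps mis-primed or truncated reads out of downstream variant counts.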
[Figure S4: (A) heat map of promoter variant fractional representation in each library, with promoter ID (1-72) on the vertical axis and, for each cassette (PBA, ABP, PB, PA), columns for passages 0-3 of replicate 1 and passages 1-3 of replicate 2. (B) For replicates 1 and 2 of each cassette: sequence logos of the -35 and -10 elements, consensus similarity score, degree of enrichment (E_promoter), and average predicted promoter-RNAP binding strength.]

Figure S4. Promoter sequences present in library populations after each passage of growth enrichment. (A) Extent of promoter sequence enrichment for each gene cassette library before (0) and after each passage of growth enrichment (1-3). "Library" refers to sequences present in the plasmid pool prior to transformation into RL309. Increasingly dark shades of blue correspond to increasingly large fractional representation of a particular sequence in the library, from 0% (white) to 50% (dark blue); magenta, >50% fractional representation. Asterisks (*) indicate the promoter sequences present in isolates chosen for further characterization. (B) Sequence logo representation, consensus similarity score, E_promoter, and average promoter-RNAP binding strength for promoters present after the third passage of growth enrichment.
[Figure S5: heat maps of RBS variant fractional representation in each library, with RBS ID on the vertical axis and passage number on the horizontal axis. (A) pdc RBSs in the PBA, ABP, PB, and PA libraries. (B) adhB RBSs in the PBA, ABP, and PB libraries. (C) adhA RBSs in the PBA, ABP, and PA libraries.]

Figure S5. RBS sequences present in library populations after each passage of growth enrichment. (A) Extent of pdc RBS sequence enrichment for each gene cassette library before (0) and after each passage of growth enrichment (1-3). "Library" refers to sequences present in the plasmid pool prior to transformation into RL309. Increasingly dark shades of blue correspond to increasingly large fractional representation of a particular sequence in the library, from 0% (white) to 50% (dark blue); magenta, >50% fractional representation. Asterisks (*) indicate the RBS sequences present in isolates chosen for further characterization. (B) Extent of adhB RBS sequence enrichment for each gene cassette library, as described in panel A for pdc RBSs. (C) Extent of adhA RBS sequence enrichment for each gene cassette library, as described in panel A for pdc RBSs.
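The heat maps in Figures S4 and S5 report the fractional representation of each variant in a library. As a minimal sketch, assuming each sequencing read has already been assigned to a variant ID, the fraction for a variant is its read count divided by the total number of assigned reads. The function name and the example variant labels are hypothetical.

```python
from collections import Counter

def fractional_representation(variant_calls):
    """Map each variant ID to its fraction of all assigned reads."""
    counts = Counter(variant_calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {variant: n / total for variant, n in counts.items()}

# Hypothetical calls: three reads assigned to one RBS variant, one to another.
calls = ["pdc_RBS_2", "pdc_RBS_2", "pdc_RBS_5", "pdc_RBS_2"]
print(fractional_representation(calls))
```

Computing these fractions separately for each passage gives one column of the heat map per passage.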
[Figure S6: scatter plots of average predicted promoter-σ70 binding strength (vertical axis) against the predicted TIR of the first-gene RBS (horizontal axis), for replicates 1 and 2 of each cassette. (A) PBA, pdc RBS TIR. (B) ABP, adhA RBS TIR. (C) PB, pdc RBS TIR. (D) PA, pdc RBS TIR.]

Figure S6. Potential compensatory co-selection of promoters and first-gene RBSs. Average predicted promoter strengths (grey squares) are plotted versus predicted TIRs of coselected pdc or adhA RBSs for cassettes PBA (A), ABP (B), PB (C), and PA (D). Red dashed lines depict the relationships expected between predicted promoter strength and RBS TIR if promoters and RBSs were coselected for a single aggregate expression level. The lack of correlation between promoter strengths and RBS TIRs indicates that stronger promoters were not selected to compensate for weaker first-gene RBSs (or that weaker promoters were not selected to compensate for stronger first-gene RBSs).
[Figure S7: heat maps of enriched adhB RBSs (PB library) and enriched adhA RBSs (PA library), ordered by predicted strength from strong (top) to weak (bottom), with columns for -AdhE Rep 1, -AdhE Rep 2, +AdhE Rep 1, and +AdhE Rep 2; the color scale runs from low to high fractional representation. Degrees of enrichment listed beneath the columns, in that order: E_adhBRBS 0.92, 0.96, 0.87, 0.87; E_adhARBS 0.62, 0.63, 0.37, 0.23.]

Figure S7. Selection of adhB and adhA RBSs in ΔadhE and adhE+ strains. The fractional representation of adhB or adhA RBS sequences, ordered from strongest (top) to weakest (bottom), is shown using a color scale of 0% (white) to 50% (dark blue) and >50% (magenta) fractional representation. RBS sequences were obtained after 3 steps of growth enrichment of the PB and PA libraries in RL309 (ΔadhE) or RL308 (adhE+).
[Figure S8: for each library (PBA, ABP, PB, and PA), plots of the degree of enrichment of each genetic expression element present in that cassette (E_promoter, E_pdcRBS, E_adhBRBS, E_adhARBS) across the stages of library preparation and growth enrichment. Axis title: Degrees of enrichments.]

Figure S8. Degree of enrichment (Ex) for each genetic expression element at each stage of library preparation and growth enrichment.
[Figure S9: four panels, (A) α-Pdc, (B) α-AdhB, (C) α-AdhA, (D) α-σ70. Each panel shows a western blot with lanes for strain lysates (ppba, ppba2, pabp, pabp2, ppb, ppb2, ppa, ppa2, ppbasyn, ppbwt, and Z. mobilis in A-C; RL3000, RL3000 + O2, and RL309 + O2 in place of Z. mobilis in D) and for increasing amounts (ng) of the purified His-tagged standard, with the measured molecules per cell listed for each strain. Alongside each blot is a graph of signal versus standard loaded (ng) with a nonlinear fit; the fit equation, R², and fitted parameters (y0, A, k) are annotated on each graph.]

Figure S9. Quantitation of protein levels by western blot. (Right) Representative western blots are shown for Pdc (A), AdhB (B), AdhA (C), and σ70 (D). Total cell protein loaded in each lane was 0.7 µg (A-C) or 5 µg (D). Purified His10-Pdc (A), His10-AdhB (B), His10-AdhA (C), or His10-σ70 (D) was mixed with equivalent amounts of total cell protein from RL3000 at the levels shown below each lane. (Left) Graphs of the quantified signal from varying levels of protein standards and the nonlinear fits used to estimate protein levels in cell lysates (see Methods).
[Figure S10: stacked horizontal bar chart of Pdc, AdhB, AdhA, and σ70 levels for strains ppba, ppba2, pabp, pabp2, ppb, ppb2, ppa, ppa2, ppbasyn, ppbwt, and ZM4. Horizontal axis: ethanologenic enzymes as a fraction of TCP (0% to 50%).]

Figure S10. Aggregate protein levels in E. coli strains containing the ethanologenic plasmids before and after optimization (including σ70 levels) compared to levels produced by Z. mobilis. Pdc (cyan), AdhB (blue), AdhA (purple), σ70 (red). Colored error bars are the standard deviations in levels of each enzyme from triplicate measurements. The light grey error bar is the total error for the sum of the ethanologenic enzymes (Pdc, AdhB, and AdhA).
REFERENCES

(1) Blattner, F. R., Plunkett, G., 3rd, Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., and Shao, Y. (1997) The complete genome sequence of Escherichia coli K-12, Science 277, 1453-1462.
(2) Durfee, T., Nelson, R., Baldwin, S., Plunkett, G., 3rd, Burland, V., Mau, B., Petrosino, J. F., Qin, X., Muzny, D. M., Ayele, M., Gibbs, R. A., Csorgo, B., Posfai, G., Weinstock, G. M., and Blattner, F. R. (2008) The complete genome sequence of Escherichia coli DH10B: insights into the biology of a laboratory workhorse, J Bacteriol 190, 2597-2606.
(3) Kovach, M. E., Elzer, P. H., Hill, D. S., Robertson, G. T., Farris, M. A., Roop, R. M., 2nd, and Peterson, K. M. (1995) Four new derivatives of the broad-host-range cloning vector pBBR1MCS, carrying different antibiotic-resistance cassettes, Gene 166, 175-176.
(4) Gardner, J. G., and Keating, D. H. (2010) Requirement of the type II secretion system for utilization of cellulosic substrates by Cellvibrio japonicus, Appl Environ Microbiol 76, 5079-5087.
(5) Chen, Y. J., Liu, P., Nielsen, A. A., Brophy, J. A., Clancy, K., Peterson, T., and Voigt, C. A. (2013) Characterization of 582 natural and synthetic terminators and quantification of their design constraints, Nat Methods 10, 659-664.
(6) Sevostyanova, A., Belogurov, G. A., Mooney, R. A., Landick, R., and Artsimovitch, I. (2011) The beta subunit gate loop is required for RNA polymerase modification by RfaH and NusG, Mol Cell 43, 253-262.
(7) Salis, H. M. (2011) The ribosome binding site calculator, Methods Enzymol 498, 19-42.
(8) Oliphant, A. R., and Struhl, K. (1988) Defining the consensus sequences of E.coli promoter elements by random selection, Nucleic Acids Res 16, 7673-7683.
(9) Brewster, R. C., Jones, D. L., and Phillips, R. (2012) Tuning promoter strength through RNA polymerase binding site design in Escherichia coli, PLoS Comput Biol 8, e1002811.