GenColors

Gene list

Applied filters:

Organism: Chlorobium chlorochromatii CaD3, CaD3
Gene type: CDS

Number of genes found: 2002

# Chlorobium chlorochromatii CaD3, CaD3

>Cag_1486 toluene transport protein, putative
MKKTTSRFAMACCALLCAPYSAFATNGMNLEGYGAISHALGGTGSAYNTG
NSGVSNNPATLALRKSKSTQLGFGLRGLHPDVSLQANGISQSSAGDAYYM
PSLSWMHKGSAVTWGVAMLAQGGMGTEYGKGSPLFSMGKPLSGVGMVPMS
GEEIRTEVGVGRVMFPIAWNVSENTTIGASFDVVWAGMDLMMDMDGAHFA
SMMGSGNVNGTMATTLGTMMAPGGGVTDVNYVRYNFSNNNAFWGEASGYG
TGFKLGITHRLSKVVTVGGSFQSKTAMSDLKATKAELSFAGVDGTGSTFT
QKVNGTIKVRNFEWPTTIAAGVALYPSDRWMVAADVKHLGWASVMRSFST
SFEADNTAANGGFAGQELEVAMKQDWDDQTVFGFGVQYRASDRLVLRSGA
SFSSNPVPNAYLNPMFPATTENHYTAGFGYRLSDAATIALAGAWVPKVSA
RNGDGVEVHHSQTNWSLNYTQAL
>Cag_2020 hypothetical protein
MNRVVFIVDGFNLYHSLKAAQAVQRPASTKWLDLCSLFSSQLYHFGKDAV
LHKVFYISALAVHLEASNPNLTKRHQAYIKCLQANGVITLLNRFKRKDVF
CKLCKRSFYKYEEKETDVMLATTLFEQLATDNCDTVVLVTGDTDLAPAIR
SGQKLFPHKLILFAFPYGTKTTELKTLAPASFTLSSAAYAKHQFPNPFIL
SDGTTIPKPIKW
>Cag_1492 iron-sulfur cluster-binding protein
MQRQIIAIDEKKCTGCGDCIPACPEGALQVIDGKARLISDLFCDGLGACI
GCCPTGAMTVETREAEPYNERRVMIESIVPAGKNVIAAHLRHLADHGQQA
FLEEALQCLKELGIENSLQQPASAASHAHSGHHHGGGCPGSAMRSFGSKS
NQPLAASSDAQQPSALRQWPIQLHLVSPEAPYYHGSDLLLAADCAAFAVG
NFHTAFMQGKSLAIACPKLDSGMEVYVEKLAIMIERSRINTITVALMEVP
CCGGLMTLVAEALRRTSRKVPVKRIVIGLEGNVLSEEWVSL
>Cag_0213 WD-40 repeat
MGFLSNIFGKKEVELKRPQVKEDENLIKTMEGHLDRVLCVKYSSDGKKLV
SGSFDETAMLWDVASGKPLHTMKGHSTWVECVDYSRDSKLLASGSTDSTV
RIWDAATGQCLHLCKGHDTAVRMVAFSPDSKVLASCSRDTTIRLWDVANG
KQLAVLNGHTSYIECVAYSRDGKRLASCGEETVIRIWDVASGKNIANYDT
GDRLSHAVQFSPDDKLIAFGGRDAMVKILDAESGNMVKVMKGHGDAVRSV
CFTPDGRKVVSAANDETVRVWDVQSGNELHMYRGHVLEVQSVDVSPDGTV
IASGSDDRKIKLWRLL
>Cag_0271 hypothetical protein
MYPSVKKVVPNENYNLTIDFNNGESGTLDMKPFLNFGVFKRLKDMNHFRQ
VRVSFDTIEWPSGIDMDPEFVYSKCKKSTPPQVEPADA
>Cag_0575 conserved hypothetical protein
MATRLIWSPEALEDIELIASYIERDSLWYAKVVASKIFAVAETIAKNPKA
GRVVPEIANPCIRERLVYNYRIIYRIDVELILVVAVIHGARLLPSLTHRF
EFEQ
>Cag_0690 conserved hypothetical protein
MKLILKEYLSSLRERGELDAIFPDLLSQLGLNVYSRPGRGTRQDGVDVGA
VGRIDGGLEKVYLFSIKPGDLTRKDWDGDSVQSLRPSLNEILDAYIPNRL
PAEHRGKDIVICIGIGGDVQEQVRPQLTGFITKNTTTKITFEEWNGDKIA
SFIQSCFLREDLLPKGARSCLRKSLALLDESESSYRYFAELISSLSAGAD
ELKNSERITAIRQMGICLWILFAWSREAENMESAYLSSELTLLHGWDIIK
RYAEKTGKTAQAVETAFFSIFSAYQQISSEFLLKNVLPHVGKLHGLSSAV
HSSCAFDINLKLFDLLGRLATHGIWAYWITSRFSDEQAEVKKKSLEETLK
LMKSIKELISNNPVLLLPAKDDQAIDIFIATSLLAFNKENYNYINEWFAE
ILGRASFAYQTHGNYPCILNSYTELLSHPKSGDDEYRKTVTSGSVLYPVI
ALWTALLGNEEMYNNVAQFKQAHLSHCNFQFWYPDEYSEAHFWKNSDSHG
AVLSHVPVDRPKEEFLGQVFGECDQSPHYKDLSAIKFGWWPLVIVACRHY
RLPLPLQLLEGLWKT
>Cag_0404 peptide ABC transporter, periplasmic peptide-binding protein, putative
MEKKPPFALIVALLLVVSTLYGCHQQRAEEQHQQVVTAISADFDYLNPLL
IQFAMSREMCKLLYPSLVRPAYDSAQGTIRFLPNAAKQWKFSPDGKGVTF
HLNRNARWEDGVPVTSHDFKFSYALYKNPAVASTRQHYLNDLPLLPDGSP
DVERAVETPNDSTLVIHFSNAMAEEIVLDHLNDLMPVARHRFKDFRPEEI
RQRAAELPLLSAGPFRVKEWKRQERLVLEPNPTSALPHPATLKSMVFMVV
PEYTTRLAMLKAGQIDAMLSAGGINPKDVPELHRSAPNVVIRPVQHRYFD
SIVWLNIDGECYRTRKQIAPHPLFGDKRVRQALTLAIDRQSIIDGFMGPD
HATIVNTSLSPAYRTLADSSLDAYAYNPRRASMLLRQAGWLPGANGILQK
NGKPFSFELAAPTGNPRRNYAATIIQQNLRAIGIDCRLRFDEGLIFNRNQ
NEFRYDAALSGMAAETLPFQLIIWGSNFAERPFNSAAFQNRELDEVIAQL
SKPNSPTRKRAFWQRYQQILHEEQPRTFLYYYDELEGFNKRIRNAEVNML
STLYNLHEWQTE
>Cag_0215 conserved hypothetical protein
MLATNEHKLVEFLLQCQPGQPRTRGTWEVDHRGNPFILPAIGGITLNLQV
GDSAFGWAGDHIEPGVSCTADTHKPYEHPNVGLQIFSCAGNSATVATGDA
KGAEGTVIGHHGGSEHVIVDFPRDVKEKLLYSDTLIIRGVGQGLRLLEYP
DIALFNLSPRLLHAMKVREMGNGMLEVPVTTIVPAVCMGSGIGSAHVAKG
DYDIMTSDAETVREYGLDKIRFGDIVALLDHDNRYGRAYRKGAISIGIVV
HSDCLEAGHGPGVTTLMTCGTSLIRPIIDPTANIADMFGIGTQLTT
>Cag_0077 Biotin--acetyl-CoA-carboxylase ligase
MNQLSEQVLHHLSVAAPHFVSGSELSQLLNVSATAIWKHITQLRHEGYAI
EAVTGRGYRLLAMSERAVAAEVQPLLTTASFGRFFHYYDEVTSTNLLARE
LAQSGTPEGTVVVADTQRAGRGRMGRSWVSPSGVNLYFSLVLRPQVPLFR
VPQVTLLVAAAVHQAIKKLAPDVALGIKWPNDLLVQGKKVAGILCEMFSE
PEQVHFVIVGIGINLNQKEFAPELEDRATSLLCETALFFSRPQLLAEVLN
AFEPLYQQWLGENDLSFILPYLEEHASLTAKEVVIEQGNRTISGTVVGLA
QGGELRLITSDGESLLVASGEAHLRQQ
>Cag_0440 two component transcriptional regulator, winged helix family
MRLLIIEDEPGIAAFLKDGLEEEYFAVDLAFDGKSGLAMALSNDYDVLIV
DWMIPALSGIEVCRQVRKAGCLVPIIFLTARDTLEDVLFGLEAGANDYIK
KPFAFEELLARIRVQLRSRTQGEESLSAGGVTLNPATHQVTYNGSELILT
PKEFALLEYLLRNKDRVCTRSRIIEHVWDIHFDADTSVIDVYITFLRRKL
EAVGCNNFIQTIRGIGYIIREATAS
>Cag_1944 polysulfide reductase, subunit C, putative
MIEKALKGSPTYWLWILLLLALMGFGYSSYATQYAHGLAVAGLGGSAIWG
LKVMQFATMATFAASSVLVVALAYLQASKPATSLLVTSQFIGVSAASTAL
VSLVADCGKPELLFELVRSASLSSPYFLSILSLKLYIITSLVSSWATLGA
EAKGVPAASWVKSVALLSLPFGLLTPLFVALMIADAVAILQLLAASIAGG
MSLLLLVVALLKQRATFSVDANAPKMVATLSLYGGVAHLLLTAVAWFRAS
SDNSGMVAAAFVAALLAVVLFALSLKKGNTTMPALAMVISVVLAVAGTGS
FVTYAPSGVELSIVAGVFSCGLLVATLLLKTATAIRREA
>Cag_0032 hypothetical protein
MRLLPVIQNYLKTIVELKTMKQITAADALGLSIPERIQLVEDIWDTIAMK
SDELEFTVEEKRIIDKRLNAYHRNPEVGSPWEDVYKRILSKQ
>Cag_0678 conserved hypothetical protein
MLAKITRKNQLTLPKSIVTSLPKTDYFQVEVVSGRIMLTPVRMQQADAVR
AKLDVLGINDQDILDAIEWAREG
>Cag_0385 conserved hypothetical protein
MTMRDHTPDFRMHELSQENKALIGSTVKQLLEKLAVDGRLCSEALLEFWV
EVAGAQRPRGTYRNGCLMPDSFIYIRDYFRASESGTLLAGESYVKDGTHD
LESAWDDMLDELFYQIEIFTSPVSTGKGITLELWAGCRQRPEGDWVYAVD
TKVELE
>Cag_1393 Maf-like protein
MNLPYPIILASQSPRRRELLALTLLPFETMSVNTPETLNPTLSPEENVLA
IAHEKADAVATILAHTKRQAIVLTADTMVAQGRHIFGKPSGFDEAFSMLQ
HLQGKTHQVHTGFTLRTPTINHSEYVTTHVTLNAMSSEAIAHYLHQQQPY
DKAGSYGIQDPLMACHISSINGCYYNVVGLPLSRVWLALQAIIAQQ
>Cag_0135 hypothetical protein
MKQILPSCSASISELKKNPTTLLNEADSAPIAIFNHTIIIFQPLILSLLK
PMNG
>Cag_1139 Adenylosuccinate lyase
MAAIWTEEAKFQRWLQLEIAAVEARMEAGVVPPEALATIREKATFNVPEI
LEIEKETKHDVIAFLTNVAQYVGNESRYIHEGLTSSDVVDTCLAMQMRDA
GKILVADIERLIEVLAKRAVEFKYTLEMGRTHGIHAEPTTFGLKLLLWHQ
EMLRNLERMKRAVATVSVGKISGAVGTYQHLSPAIEAAVCQKLGLTPSTI
STQILQRDRHAEFATTLAIVASSIEKFSVELRHLQRTEVREAEEFFSKGQ
KGSSAMPHKRNPITFERLTGLARVVRSNSIAAMENVALWHERDISHSSVE
RIIMPDSTVALCYMLRTFAESVETLLVYPDRMEENFNSSYGLTTSQTVLL
ALTAKGLTREVAYKLVQRNAMQSWSTKVHLRDLMLQDAELLQHISADELT
RLFSPETILGKLRESVDTIFSRNGL
>Cag_0423 exopolyphosphatase, putative
MNKEALRVAAIDLGTNSFHMIIVEGSRDKGIVEIDRVKDMIGIGHGSIAT
KMLTEEAMQAGIATLKKFLVLAAQHGVQFEHILAFATSAIREAKNRLDFI
NRVRAETGLKVKIISGKEEAEFIYYGVRNAVSVGKTADLIFDIGGGSVEF
VLVNHKGVQLLESRKIGVARMHERFVSSDPIAANDVKMLEQFFAAEMVSA
VDKATTMKVRRAVASSGTAETIARMIHAMQGRDSDGALNNSCFTRSEFQQ
LYHTVLLMNSAERKKMSGLDEKRVDLIVPGLILVDMIFKLFRLEEIVIAD
SALREGMVLHYLQQQGSVLKKRGHQESLDIRRESVNELGFRCHWDRGHSE
YIARLCLQLFDKLAPLHQLEENYRELLEYSALLHNIGAFISISSHHKHSQ
YIVMNGELRGFSPSEIAILGHVVRYHRKSPPSEKHTPYNALKLPHKRAVD
VLSGILRIANGLERGHRQNVQNVDVQVKGKSITMALTCCFEPDIEIWAAD
QLKAWLETVLQKTIHFQRA
>Cag_0183 ClpX, ATPase regulatory subunit
MKTQEPGKGRTRTNKNDTANVFCSFCGRPAYEVGSMVAGPNVFICDRCIA
AANDVLLHQTGNAPQPQQPEPSAPISNQPFFPRLVSPKAIVDVLDQYVVG
QERAKKSLAVAVYNHYKRIESHNTSAALDEVVIEKSNILLLGPTGTGKTL
LAQTLANLLDVPFSIVDATSLTEAGYVGDDVETILARLLHASDFDLERTE
RGIIYVDEIDKIARKSANVSITRDVSGEGVQQGLLKILEGAVVGVPPKGG
RKHPEQPLINVNTKNILFICGGAFEGIDKIIARRVSKSSMGFGAKVKSSS
MDYDPDILRQVMQDDLHEYGLIPEFIGRLPVISTLDMLDEKALHNILVEP
KNAITKQYKKLFDMDGVELEFTDEALEKVVNIAKERGTGARALRSVLESV
MIDIMYEIPSMKETHKCIVTAETIEQKTPPEYISEANTTKKSA
>Cag_0042 hypothetical protein
MLQKLIIKRFRGFSTLEVDIPKVLLLMGPNSSGKTTALHAIRISCQAAWI
AVTNNIAWKVEDTVIIFKDFIIRDISQLMPIADWQALFVNQIVGEHTHFS
IEIIFEKTDALSSILIEGKYARNENLKITATIGAETLINNLKNISNRSSQ
YKNIAFEFFQKHLPKAILIPPFYGVIRDEEYRAKAVVDAMVGSADQSHVV
RNMISRLSTTQLEQLNAFIKDMVGATLVQRTQGDDIEKISPLRVTFRDTN
GELELSAAGAGLINLIALYSSLARWESETIDRQIIFLLDEPEAHLHPRLQ
GYTADRLATIITNDFNAQLIMATHSIEIINKIGERDDATIFRTDRLNKEK
GGQQLIGQTPLLDDLSQWADLTPFSIINFLASKRILFYEGKSDGIILTKC
AEILFRNNPDKKKKFEKWTLIQLEGSGNKNIAQLLAHLIDSSTFASVAEK
KDFKIVVQLDKDYNDEVEQLKLITNRDISTFYNIWSKHSIESLFCESATL
YQWLKPKYPDIQEETIEKAIIAANQDNELNQYAREQRQATLLKPLQKISE
NITATNRQADNDIAATPEIWQRGKDRSKVILHHIKTALSTSANSLSTSLT
KVIEKADVNLFPAGNRAVVPSEIKQLLDWMVTNA
>Cag_1668 Glucose-6-phosphate dehydrogenase
MQPKPDNCTIVIFGASGDLAARKLFPSLFDLALWGHLSEEHHIVGVGRAP
LTQEQFRAMVASKMEELFPERTAQHEAFQRFLANLHYSVVDLAQPEDYIK
LRQHIEQLPNGGGISNNLLFYLAIPPTLAPQIVQSLHTAGLGEADGCPHG
WRKIMVEKPYGTNLESARELNQVIGNVFREKQVYRIDHYLAKETVQNIMV
FRFSNGMFEPLWNRQHIANVTITIAENFGIRDRGAFYEDAGLLRDIMQNH
ALQLLAAVAMEPPVDFSADAVRDEKAKVLRSIRPFTAEQVAKTVVLGQYE
GYRQEKNVAPNSTVETFAAIKFFIDNWRWKDVPFYIRAGKNLAQTLTEIV
ITFQCPPQNYFGPTGSEVCTANQVILRIQPEETIAVRFGAKRPGESVITD
PVSMTFDYKTSFVESGLTPYHRLLLDAMAGEQMNFIRQDSVEYSWGIIDA
IRQAVAGQQPESYPVHSYGPAGISRLYE
>Cag_0376 hypothetical protein
MPLNLLKKYPELLEIAHMSEADRNSSLYAIFNRDFVQNDNLYFQGKKVRP
IKGEDGAIAMDVLFQHLTTSSDKEKSSLNRNARTFEMARSCRLHWIRYHI
DKSAGKGVKIFSCEERDQKRRKDVIRTYIFDVDQKYVIILEPQRSGNDYY
LLTAYYLDRDDGKKQIEKKFKQRMKEVL
>Cag_0876 DNA photolyase, class 2
MLHSPVDPRRVRLLNHHIDGNGVVIYWMSRDQRVRHNWALLFARWKAAML
QQPLMVVFTLAPSFLGAPLRHYDVLFNGLQEVETELRALNIPFMVLQGEP
SEELPRYAMHHNASMVVADYSPLHLTRCWKNQVAEALSVPLYEVDAHNIV
PCRVASPKQEYAARTIRPKINKLLGEFLTPFPELEALPQPLTEPPVNWQK
LRSHFHADASVAPVGWLTAGEAGAHATLQCFVQQKLNGYATQRNDPSLEA
TSRLSPYLHFGQISTQFVALQVKAAHAPQEDKDAFLEELIVRRELSDNYC
HYNASYDRLSGIPAWAQETLARHATDPRDYIYSHEAFEQAKTHDPLWNAA
QHELLQSGIIHGYMRMYWAKKILEWSRTPEEAFEIALWLNDRYALDGREP
NGYVGVAWSIGGVHDRPWRERPVYGTIRYMNANGCARKFDVKRYIHNVTS
RRPQQVGLF
>Cag_0629 Hit family protein
MQSFKEESYLCQSGKSIFADLAPEEDEQHFVLHRAKKCFIIMNLYPYNCG
HLMVIPYLQTPEFSDLDRETWLEVMELTDLSIRALKKVMRPHGFNTGANL
GRIAGGSVDNHIHFHIVPRWDGDTNFMPVLADVKVLSNDMISTYKNLKAA
ITELLAQEAQ
>Cag_0577 hypothetical protein
MILLDLGKPYPLDDVFKAAARKHQSHYRATELNVGYSDKYGTKLNEDDAK
KLLNYYDSLNVREELQNRFQKGDEYSFSLKRDGDLLRSEHIPFNLFAPLC
ADTKLAQNLIKNVFGLDCAKNLSIKFEYAPKPKGKYLDDATAFDAFFKFD
DNNGKRIGIGAEVKYTEKSYPIGKKEKKYVHDPKSCYWKVSCKSGAFLEP
SYSPSSALITDELRQIWRNHLLGLAMCQQNELDDFYSITLHPAGNHHFQR
VKPNQGVIPEYQAQLTDSYRSKVFGRTYEEYIAAIDGDSEILKWKQYLHD
RYIVKNTDDQPQ
>Cag_0734 hypothetical protein
MKLTFNDYTLETVYLHISDAQRLEVMDLWQTENAITSAAERERRSYEVVV
MVRHVSGAVVGVSTVTVRTASNGKRYYYYRMFIHSSHRVPYLMRAVTNAS
RDFLATFRHPDGEVESFVVVTENPKLMRQGMRQLFERHGYTYKGKTSQGL
DCWEYSFLQ
>Cag_1178 hypothetical protein
MMTDEIITEVRVIKDEIAAQYNYSFQDRFIAIKKGEAELAAKGMRLIYPP
NNAVMTPATALQRGKLKTHKIFQKLSASQ
>Cag_1111 conserved hypothetical protein
MDHTKINLLVRLQYLDNQIENIVSLQKGLPEEIEALEDDLAFTTRQIESR
KKIADEQLRQRTRLNEQINECNNKINSFKEKQTLARNNKEYDALSKQIEY
EEKEIANANMQLQDIAQTTQRLQELQKKGAQLITENRYDEITEEMMPDDV
LLQQLKDLTAQVAQKREELESIVIETAADVATLEEKVVAQRALITKEAKR
LIDKYDHLRSGTLRNAVVKLNRNACSGCNTRVPTNRHTMIVQGGFYLCES
CGRIVVHERLFEEAKD
>Cag_1932 CBS
MEIFLLFVLILVNGAFAMSEIALVTAKRSRLSRLADDGDKSATTAMKLGE
DSTSFLSTIQIGITSIGILNGIVGEGALAVPFSLFIHSATGIELETAQLI
ATVVVVLGITYVTIVVGELVPKRLGQLNPEQIACLVARPMQILATITRPF
GRLLSFSTNTLLRLMGVKPQITPSVTEEEIHAMLEEGSEAGVIEQQERDM
VRNVFRLDDRQLGSLMVPRADIVFLDVTQPLEENICRVTESEHSRFPVCN
GNLQSLLGVVNAKQLLLKTLRGGLTEFATLLQPCVYVPETLTGMELLDHF
RTSGTQMVFVVDEYGEIQGLVTLQDLLEAVTGEFVPRNLEDSWAVERADG
SWLLDGLIPVPELKDTLKLKEVPDEDKGLYHTLSGMIMWLLGRMPHTGDV
LVWEEWNLEIVDLDGQRIDKVLASPLNNAPKASQKEEKPPVKSDDNAALC
SIRPTP
>Cag_0442 Histidine biosynthesis protein HisF
MLAKRIIPCLDVRDGRVVKGINFEGLRDAGSILEQARFYNNELADELVFL
DISASLESRRTTLEEVMKVSEEVFIPLTVGGGISSVERAREVFLHGADKV
SVNTAAVNDPYLISRIAEKYGSQAVVVAIDIKKVGNHYMVHTHSGKQITQ
YEALEWALKVQELGAGEILLTSMDRDGTKEGYENNSLRMISTAVHIPVIA
SGGAGNLEHLYDGFSKGCADAALAASIFHFRHYSIRQAKEYLHKRGVAVR
F
>Cag_1634 CBS
MQQYPTLTLTTETPLLQLRHLARTIRPTIMAKLEYMNSAYSHHFRAAQAI
VQAAEEQEQIHPGMTLVDWSLGSSAIALAMVAVNRGYKLLLAVPDTIANE
QQNMLRALGAELIMTPADALPDEARSCMMVAHSLEQSIPHAFFVGMYDNP
LSLRIHSDITAPELLLQCNNAVTHVVVPLGSGALAFGMSRALKAANPSLH
IIGVEPKGSIYASLFQRGELSAPERWDVEEMGARQPSPFWERSLLDDVVQ
ISDHDAFNCARELLRTEAIFAGGASGAAMAAALHLAKQCNEDDSIAVMLT
DFGGYSLSRLYCDDWMRKKGFYRKVKSSLEQITAEDILQRKARRNLIFAH
PEHTLAEVFEMMKQNDVSQMPIVSYNAPIGSISENRILSILIEHDDAMNA
KVIGFMEKPFPVCSPDATISELSARLQQHASGVLVNMSDGKLQLLTKSDL
IDALTHK
>Cag_1164 penicillin-binding protein 2
MQGIKALVIAVFALLFVRLLYLQVINFQEPGSVSASNSVRRIWIHSPRGR
MIDRNGSILVDNQPLYSVRIIPAEFNEAKLGYLAWLLELPKEEVAAALAK
GRSYSRFAAATITRNVNEITIARLSENLWQLPGVIVTADNKRLYSDSLRG
AHLFGYLRSIPKEKMEELSEKGYSQDDKIGFSGLEKIYEERLKGQKGARY
EMVTPLGKYAGKYDKGNSDIAAVRGDDLYLSIDGGLQMLAEKLLTRTGKS
GAVVALDPNDGGVLALASAPDYSLNTFNGFTDPKGWREIITSPQKPLFNR
TVQAVYPPGSVYKMVLAMAGMEEKLVDPERHIYDGGVFIYGGRRFLSHGG
RGHGSVNMRNAIAVSSNVYFYTIIFNVGFENWTKYGDMFGFGRKTDIDLP
GERKGLLPSAEYYDKRYGKKGWTKGYLVSLSIGQGELGTTPVQLAAYTAA
IANKGTLFQPHIVNGYRDTQTGRYIPVPYQQTKLPISPETFALMHDGMRG
VVLQGTGTLANVPGVAVAGKTGTAQNTHGKDHAWFIAFAPVEQPKIALAV
LVENAGFGGSISAPIARELIHYYINLRNRRPTATRGDVKVNISNDAAADS
SSAAASTNGEPQDEPSPNSEIDKDKGNSTSPSVQKSANDTQSESLPE
>Cag_1220 shikimate kinase
MQKEPSLIFLTGFSGSGKSTIGPLLANSLGYEFVDLDALIERECGKSINQ
IFAEAGEAAFRQHEEQTLETLLMRRKTVVSLGGGVLEQPRAFELVRQAGT
VVYLKSPVKTLARRLSNKTDRPLLKGEQGEKLSQEEIEQKIAELLARREP
RYECADITVETDAKRIGSTVEELTRKIERYMRTLALSSEE
>Cag_1731 RNA methyltransferase TrmH, group 3
MDGVDSNNIVYGRNAVLELLQHKPESIEKIYFQFNTSHPKLKEIVITARR
LKLVSGKARLEKLSEIAGTTKHQGVCALISSVTYYSLEEVLAQPRNSAPL
LVVLQGLDDPHNLGAIIRTAEAVAADAVVVVEGKGSPINAAVHKAAAGAL
SHLRVCKVKSLVRTLEFLHQQKVQVFAADMDAERNYTDVDMRQPTALVLG
MEGSGLAPEAMQFCDTVVRLPMAGCVESLNVSVTAGVLLYEAMRQRLV
>Cag_1544 phenylalanyl-tRNA synthetase, beta subunit
MKISISWLREFLPNFSCETVSLVERLTFLGFEVEGVEESASLDRRIVVGR
VLETEPHPNAERLTLCLVDVGREEPLRIVCGAPNVRAGMVVPVATEKAKL
QFPDGQTLTIKPSKIRGERSQGMICAADELGLSNDHSGVMELESSWEIGK
PFADYLESDVVLDIAVTPNRPDVLSHLGIARELADGAPLQYPSQQSLTYQ
PAGERIAINDAVACPYYTGVIIRGVTIRESPEWLRKRLQAIGLNPKNNIV
DITNYMLHALGQPMHAFDCAKLAGERIAVRSDCQAEVVALNNLTYKVEGG
MPVICDGSGAIAAIAGVMGGMASAVTESTTDIFLESALFHPSMVRRTAKK
LALASDSSYRFERGVDSRMVQQASATAVALILELAGGTVECAMEQGSVAA
DLQLLALRPERTNKLLGTALSGEQMVELLERIGFRCVEQTTEQLLFAVPS
FRVDVTAEIDLIEEVARLYGYNAIESSRQMATIYPTKRQHPAYFPDFLRG
ELITLGFREILTNPLIKRNDAALASEQLVDVLNPISEGLEVLRPSLLPGL
LKVISHNIRHGNRDQKLFEVAHVFEAKPQVQQTQQPLEGYCEQERLVMAI
TGSRYLRRWNHPTDMVDFYDLSGAVEMLLEQLNILDKSVVNIYTPSALSI
DVFLTEKGKRTTHRLGIMQPVNAAWLKHFDIEQEVYCAELDVALLERCYQ
PTSAYEPPSRFPVVERDISFIIPEGVSAQSLVELVQSSNPLIKTVTVFDR
FERNHESGKECSIALSLTIADAKATLQDEKINDILATISRNAESKLGAVI
RQV
>Cag_1767 Holliday junction DNA helicase RuvB
MRIELLNTPPDAAESRFEEQIRPIRMEDFAGQQRLTDNLKVFISAAKMRG
DALDHVLLSGPPGLGKTTLAYIIASEMGSSIKATSGPLLDKAGNLAGLLT
GLQKGDILFIDEIHRMPPMVEEYLYSAMEDFRIDIMLDSGPSARAVQLRI
EPFTLVGATTRSGLLTSPLRARFGINSRFDYYEPELLTRIIIRASSILGI
GIEPDAAAEIAGRSRGTPRIANRLLRRARDFAQVDGISTITRTIAMKTLE
CLEIDEEGLDEMDKKIMDTIVNKFSGGPVGIASLAVSVGEERDTIEEVYE
PYLIQAGYLARTTRGRVATRKAFSRFADHTLLGGNFGGHKGSLPLFDESE
AD
>Cag_1718 O-succinylbenzoate-CoA synthase
MELELSLYRYAIPFTEPITVKQQRLVMRDGLVLALTSATEAVTAFGEIAP
LFGLHSETLEQAEAQLVTLFATPHTTLARIMEAQFYPSVRTGLEMAIHNF
TAARSGALPRFGTTPHNAEQVPVNALLFGASNEVVERAERLFTAGYRTFK
LKVNAANSANAIASIQALHSRFGNAIALRLDANQSFTLNDAIAFGKALPD
GSIAYIEEPLQNAERIGDFYAATAIPAALDETLWQNPALLDIIPTKALAA
LVLKPNRLGGITVAEKFARYAAERGLLAVVSSAFESGISLGFYAWLAASC
NAEPAACGLDTSRYLAFDVLTTPFPTNTPLLDAAQCYRNSLQVNLATLKP
CAMPQSFRILMEN
>Cag_0110 hypothetical protein
MGKANNQPNKSKPKVSVSIGRLLMVLIGANFIMLLAPNFGMLNSFLYIPD
LYTWPAFVLGFVLIFFGFKGLYKKP
>Cag_0028 peptidyl-prolyl cis-trans isomerase SurA
MKKNSLKHIVLAGTLLLVAAPHVLRSEVVDRVVAVVGNDAIFQSDINRRA
AMLRAQYPDLAKEQSVPRNILQGMVEQKLLVTKARLDSVTVESAQVDATT
LERLRQIVARFPTKQAMERQLGMTLPALRESIREELTNQQLADKLRRKKT
ALASVTYDEVMAFYRDNRGQIAPADEQVSVSQIIKYAAVTPESRKEAAAV
MQSIQQELQAGADFGELARKYSQDPGSATSGGDLGFVRKGQLVARFEQVA
FALKEGEVSEVVETRYGLHLIQMLNRDDNSIHVRHILRRMERSTDDFKEA
NSTLASAYESVVSGKESFADAAKRLSDDDVSAARGGQLVAAGASRQTVPL
NKLFPQMRSLLGALKVGDVSRPQLIEPPQGEPFYAIFKLNERIPAHPVDP
QKDYSVVEELALDAKKEQLFSEWMESLYKEVYVHRVNM
>Cag_1017 HhH-GPD
MLKEFLITHNKELEIEKSLFSGQSFLWKKHQSNLDSFVTVMDKRLVIISQ
LSPYTIRVHCDSEVLYGQKISAFISHYFTLDVPFQKIFSSSFKSNYSEVW
RLLDGYKSIALLRQHPFETLISFMCAQGIGMRLIRQQINRLCERYGEFYE
AEMEGEMLCFSGFPAPEQLACLNAEELSYCTNNNRERAANIIAVARKVVE
GRLDLSSLSYPNMAFEEVQARLTQERGIGLKIADCVALFGLGYFEAFPID
THVHQFMAQWFKVPAASRSLTPATYRQLTLEAREILGSHYTGYAAHLLFH
CWRCEVKKLCWF
>Cag_0400 Chlorophyllide reductase iron protein subunit X
MKARTIAIYGKGGIGKSFTTTNLSATFARMNKRVLQLGCDPKHDSTTSLF
GGISLPTVTDVFAAKNAKNEQVAISDIVFRRDIEGFPQPIYGIELGGPQV
GRGCGGRGIISGFDVLEKLGMFQWDIDIILMDFLGDVVCGGFATPLARSL
SEEVILVTSNDRQAIFTANNICQANNYFRTIGGESHLLGMIINRDDGSGV
AENYAQAAGINVLMKVPYNMEARDRDDSFDFAIKLPELRDKFQKLATDIL
EKRIAPSNATGLDFNDFVRLFGDVKNEAPRPAKADELFASQPAGNNASTT
THSTQESDQQKMERCIACLEPIQQQLYRLAELEKKSLTDIASLTNLDETT
ISETLTRARKQLKRMFFEG
>Cag_0885 methyltransferase, putative
MMSSSYSLEKFNREAATWDEKPRRRLVAKAVAHAIIAHAKPQPTMRALEF
GCGSGLVTMPIAPLVGSLVAVDTSPEMVKMVQQKAEEAALTTLTTLVDDL
FAEAEAYREPFDLIFSSMTLHHIADTATVLQRVAQLLVTGGVLALADLEL
EDGFFHDDPHEEVHPGFERSALEAALSAAGLQVRSYHIAHTIHKCNRAGV
DAAYPIFLLVAEKV
>Cag_1089 Peptidyl-tRNA hydrolase
MLHCRHACKQQKRVKIILPLTSFFSIVMKLIIGLGNPGTEYDGTRHNIGF
AVADALAAHHNASFSKEKGRYLSAKIKLGSESVGIIKPMTYMNHSGHAVV
AAMNFYKVARTEIVVICDDLNLPVGTMRLRPKGSAGGQNGLKHIIECFGS
NEFARLRIGIGSQNMPKGGFSSFVLGKFSEQEKSEIAIMTVSARDCAIDF
ALHGLGHAMNHFNTSKL
>Cag_1871 Small GTP-binding protein domain
MKFVDSASVFVQAGDGGRGCVSFRREKFVPKGGPDGGDGGRGGHVWLETN
SHLTTLLDFKYKNKYIAERGVHGQGARKTGKDGVEVVIQVPCGTIVRNAA
TGEVIADLTEDAQKILIARGGRGGRGNQHFATSTHQAPRHAEPGQKGEEF
TLDLELKLMADVGLVGFPNAGKSTLISVVSAARPKIADYPFTTLVPNLGI
VRYDDYKSFVMADIPGIIEGAAEGRGLGLQFLRHIERTKVLAILIAVDSP
DIEAEYQTILGELEKFSATLLQKPRIVVITKMDVTDEPLALQLAGEQTPI
FAISAVAGQGLKELKDALWRIIVAERAVPTNQVPQGGE
>Cag_1697 hypothetical protein
MRDAQIGHLDVGVGSLMLNYGFWIIVRATLAVAPNKTIHTKQEFAQFFCV
GADSISALSSVRAKMDFAPTPYAPEIWINRF
>Cag_1829 Translation initiation factor IF-1
MAKEESIEIEGVIVDALPNAQFKVKLENGLEVLAHVSGKIRMHYIRILPG
DKVKVQISPYDITKGRITYRYK
>Cag_1141 conserved hypothetical protein
MVASPAFGQMRYTALRLTLVALTFGTVQELNAAESSNFRQQMSMADQELW
KARYPQADSLYNVLLRQNPSNTEVNWKLARLQISLGESLPPSQQPVRLRH
YRQAENYARTAIAIDSTEAPAHIWLAASLGLMADKIGPQEKLKRAEEIKR
ALDTAVRLNPDDATAHSLLGTYYYEASKIGWFRRMIGNTFVGTMPQGNKE
LAEKEFRRSIALDPRMIRNYHDLAKLYLDMGRKAEAITLLKTALNKPILV
ESDKRRLEQIRELLRKHNGDGE
>Cag_0431 polyphosphate kinase
MELRPFSDPSLYVNRELSWLHFNRRVLEEALRADLHPLLERVKFLAIFSS
NLDEYFMIRVAGIENQLEAHITDRTIDGHTPAEQLAAIRAIVQEQLALRH
NCFTNDLLPALKAEGIEFVRVSDLTAAEQHRVSLYFRKEIYPLLTPLAFD
TGHPFPFMSNQSLNLAIELEDEEKNQLKFARVKVPALLPRILHVEQMTGS
HRISSTTRLVWIEDVVASNLAQLFPEMRIVQAHLFRLIRNADIEIEEDEA
GDLLKTIEEGIRSRRYGNVVRLDIAPTMPQAMQELLINNLGIQAHNVYAS
DSPPGLDCLMELLALDRPDLKDEPFTPANPLTGNPRQHNIFNAIKKKDRL
LYHPYDSFQPVVELVEQAAADPNVLSIKQTLYRVGSNSPVVRALMKAAEA
GKQVAVLVELKARFDEGNNIVWARALEDVGVHVAYGLPGYKIHAKLTLVV
RREQHKLKRYLHLGTGNYNPSTGKLYTDYSFFTANEELANDVAELFNALT
GYSKYSGYQKLLVSPLNTRSKIIEMIEREIAWQEQKGNGRIVMKMNSLVD
PKTIRALYNASCKGVRIELIVRGICCLKPGIANVSENIRVVSIIGRFLEH
SRAYYFFNGANEELYLGSADLMPRNLDDRVETLFPVLEPTLIQRVLGDLE
LMLADNVKAWQVQPDGSATLIQSNEAPINSQATFMAKATPQRSRHL
>Cag_0150 hypothetical protein
MEREQASTLRTIFSMPVIVAALGYFVDIYDLVLFSIVRVPSLKSLGLSGQ
ELIDYGVYLLNMQMIGMLLGGFLWGWLGDKKGRLKIMFASILMYSLANIA
NGFVTTLPMYAALRFIAGVGLAGELGAGITLVAEILPTKIRGYGTMLVAS
IGVSGAILANYVATTFEWHNAFFIGGALGLLLLAARFKVSESGMFQAMAD
HRGTNRGNMAALFTDRSRFLRYLNSIMIGVPIWFVVGVLITFSPEFGEKL
SISAPVSAGNAVMYCYLGLVFGDLSSGLLSQLLKSRKKVVLLFMVLTVAG
VALYFTQHGQTPQFFYMVCAFLGFASGYWAIFVTVAAEQFGTNLRATVAT
TVPNLVRGMVVPITMLFQYFRGMFGMELGALVVGVICIVGGFLSLMALSE
TFHKDLDFYEEFL
>Cag_1240 conserved hypothetical protein
MNHGSRRDVAIIIQNQKFKIHNHYQMARTLISFDWAMKRLLRSKANFDIL
EGFLSELLGEDITILDILESESNKENKSGKFNRVDLKVKNSKGEFIIIEL
QYDREYDYLQRMLYGTSRVITEHQKESESYDTVVKVISVNILYFELGQGT
DYVYHGKTSFVGIHDHDTLQLDSRQKERFGKLQIHSLYPEYYIIRINNFD
SIARNTLDEWIYFLKNSEIPDNFKAKGIQKAKESFSVLRMSVEEYQAYQA
FQDELRDEASYVETKRIDALYQGREEGFAEGMEKGKEIGVLEGMEKGKEI
GVLEGKLEIARRLLASGISKKEVAAITGITVDLL
>Cag_1224 Uncharacterized low-complexity proteins-like
MTTPDPQVKPSTLPNDPLTLLEVANHSSERLAVQHTAFIAACVYVLIIVF
GTTDLDLLIGKGVRLPFVDVEVPIVGFFAFVPFILVLVHFNLLLQLQLLS
RKLFAFDATVPQDDGIGGLRDRLHIFAFTYYLAGNPSRLVKPFLAIMVSI
TLVLLPLFVLFAMQLQFLAYQDEVITWMQRFALWLDIALINIFLPTMLHP
KDDWKSYWRNVIACYVPHRRVWLSFLLLYVGTNICLFASKKEILLIGIAL
LVLSLLLLPILRGWKATHKVQKIIIPILIIVTFAIIALLFLVEVRDWIEI
TITSFISTETIREKVFPLSFILYALIIVLTVLWQQSAPRGSFALVVTLFL
GTLFPLAFMVDGEHLEKIIAKGENATFLSNVLQDKRRLNLSEQHLFAKAL
KPEIITLISDGKWKEALPQIEPINLQGRHLRHAELNQAMLLGADLRFADL
QGAYLSDADLQGAYLSDADLQGAHLRQAELQGAHLRQANLQGAYLRQADL
QDANLSYTNLQGADFIGADLQGADLRFAHLQGANLFGAHLQGAYLFVAHL
QGAYLSGAHLQGADLSAAHLQGADLFGANLYAADIRRANTTLVDAQNIRL
EPLSEKEATELRTTLKPLIKDNEDYNEVAERIKKATAPHGEIPYFESILA
EKNTPLRYKKCYNAENSAERRAFTKQLHPYLVSLASQSPEIARGIIQQIP
ISEPNTSSRKGLAAELAKHLNDPKCKGLYELRDDEKEELRNWKEE
>Cag_0455 hypothetical protein
MKFTIEVDQEIDGRWIAEILEIPGVLKYGNSQHEAIAQAEALALRVLAER
IEEGEQLVEPISITFAA
>Cag_1372 hypothetical protein
MFINLLKFMPSRSVQAFRDNIVDVDRLIVSHAQLRDGSPGKKGLGHITRS
GIVMLCAAWELYLELICVEAAKYFCLKCQSPDQLPIRVQKELSKMAKESK
HELKPLEFAGNGWKNVFITHVEDLCNTINTPKAGPINELFNRSIGLELIS
DSWSCGKDQINNFVSIRGDIAHRGRHADYIKISLLQDYRALIYNATIETD
NTVSEYLALKTPGKHKPWRVTS
>Cag_0976 HAD-superfamily hydrolase subfamily IA
MNIKALIFDLDGTLLNTLEDIANTLNATLARHHFPTHSLDECRFLVGAGL
RELIRKALPSEAAADDNMVDKLLSEFIEMYRTSWNQLTRPYEGIIEMLAA
IAERNLPMAILSNKADHFTQQCAEELLPRPLFSVVLGHRDGMAHKPDPAG
ALFVAAELGVEPTSVVYVGDSSIDMLTATRAGMYAVGVCWGFRPESELRA
HGAQSLIHHPLELVTLIDTLREANA
>Cag_0515 dTDP-4-dehydrorhamnose reductase
MNILVTGSRGQLGSELQALSVRYPQHSFFFYDLPELDITNSEQINHICNA
HHIEVIINAAAYTAVDKAESDAETAFRVNSDGAALLATYAKENHALLLHI
STDYVFDGTSSVPYKESDPATPLGVYGRSKWEGEERIRAINPSHLIIRTS
WLYSMYGANFIKTMLRLGGERSEVRVVFDQVGTPTWAADLAEALLSMLSS
IYKGKHYSATYHYSNEGVASWYDVASAVMEMSNLSCKVLPIESHDYPVPA
PRPHYSVFNKAALKSDWNISISHWRTSLAAMLHSMKTSHHS
>Cag_0848 Histone-like DNA-binding protein
MGNTTTKIDLVTTIARNTGLTKYETEAVVNCLFESIIESLKAGRRIEIRG
FGSFNIRQKNVRKARNPRTGEKVMVESKQVPSFKISREFKLAVSESLKSS
EL
>Cag_0916 C-type cytochrome, putative
MKKMFLISIGMVTALPLLAADGKAIFERSCAACHSVMPPPKAAPPIMPLA
FHYQSTFKSKEEGVKHMAAFLKNPDKAKAIDQQAIQRFGIMPAQALSDEE
LKAVAEWVWDQYNPASGMGRMGGMRGQGQRWQQP
>Cag_1773 conserved hypothetical protein
MKPLRQQRWGATTQGIAIWLIALLWFLPIGMLQALTVPALTSRVNDYANM
ISPNVRAELEAKLAALETTDSTQLVILTVPSLEGDPIEDFSIRVAEAWKI
GQKGTDNGVLLIVSQADRKVRIEVGYGLEGKLTDLQAGRIIRNNIAPAFK
MGEYDLGFVQGTNSIIAAVRGEFIASDKKSQKSNKPSMPLLFVILFVFYI
LSQLMRGHRQSGPMAYGGPGGFYGGGFGGGGSGFGGGGFGGGGGGSFGGG
GSSGDW
>Cag_1054 probable activation/secretion signal peptide protein
MVPKIITSLVTGSIMLSASLQAAPYVPDAGSLQQQQRPAAVSKQKKQMVQ
DNKGKSEPSKPFVIKPSANAKVPVKRFTFSGYEGTVSRSELQDMVKPYVG
KNLSMEQLHAVSANITSELRAKGWLASATLPPQDVTAGTVHITINSGKTA
MTSITGDESVRICERPLRQIAEKTCPSGSPLNTDDQERAVLLMNDIPGIA
ATTSLSKGMLAGTTDVNYLIREGALLSGVLWGDNYGNRYTGTWTQNAVLN
INDPIHYGEQFSLNVGHSAGMWRGGVNYRVPMPFLFAGLTGHTGVSGMQY
ELLEDFEVLDYEGSSINVDAGLSYALLRSRKANLTSDVSYTYKGLKDSMG
NTDLRDGTIQSVTFGLSGNYRDDLFFGALTTADLSITNGSLEEKIRDISL
SNSEGGYTRLNMGLARYQRFSEPFVLDLAFSAQRALNNLDSSEKFFLGGP
QRVRAYPLGEAAGDHGALFKADFRHRISVPEEWGDMFVNAFYDAGHVTLN
KDRYASDSATITATGRNDYWLQGAGLGLRYDISENFTLQGCWAHTIGKNS
GRSVDGNNSDGKSDNNRFWVQGLYYF
>Cag_1259 hypothetical protein
MSQKHEAWQIDVALANQAAFRHFKKRHEREYISCFNNLNKIKRLLEEGKK
LSELHYHPSFFRHETDGIFRIGQSGVSGAKESRLYIYPDNQHRIIYILEI
GTKETQQADIAAAQKAIQQIFLR
>Cag_1396 conserved hypothetical protein
MQLFFLLLFLLPSIAFGKQPLQCGIDRLDASQFRELAGLRVGLITNAAAL
TQRGEQNYKAMQRCGVNLRFLMAPEHGLAANVEAGKKVGHSVSADSLPVY
SLYGATRKPEQRHLHDIDLLLFDLQDVGVRCYTYISTMKLAMEACAEAGI
PFMVLDRPNPIAPLQPQGFMLQAGYESFVGLVSVPFVHGMSVGEIALLLQ
RAHYPNLDVRIITMTGYQPHCFGDELEGFTFRSPSPNIRDVATALLYPAT
VLLEATTVSEGRGTQAPFQQFGAPFIKSNELLQALERYHLAGVKLRAVRF
TPTASKWKGEECKGIGITVTNRRSFSPFAMSVALLRELQRLYPAQLGLDI
KRNATFFDRLAGTPRLRELIVQQASLQDILAESEREVARFRAVRLYR
>Cag_0750 hypothetical protein
MNFGAWDGKHFSNCNHLIANQWKGCFIEGNIDRYRELVATYSENKDVVCL
NFFIKYQSRLLLIEFNPTIPNDVIFIQEKSNNVHQGSSLLALIILGKEKG
YELVCCTTCNAFFVKKELYSFFNLKSNSIYSLYQPLCDGRIFHGYDSKIF
VVGMSKLLWSNISIDSSDFQVLPKSMRYFNDAQ
>Cag_0580 glutamate synthase (NADPH) small chain
MTSISLTINNISVSVPQGSTILAAAEAAGVTIPTLCFLKELEERGACWMC
IVEIKDKNRFVPACNTAAAEGMVIETENPTLSAMRRQNLERIIVEHSGDC
NAPCELACPAGCNIPDFIAAIERGDNAKALEIIKEDIPLPAILGRICPAP
CEEACRRHGVDEPLSICALKRYAADRDSEQAERYLPPCEPSSGKHVAIVG
AGPAGLSAAWFLLRKGHKVTIFEAAPQAGGVMRYGIPRFRLPESVIESDV
APLLAMGLELRCNTRFGRDVTFDNIRTQYDALLLAAGTEEAASMGIAGEE
LEGVISGITFLRNVALGTQSTLGSKVIVTGGGNTAIDAARTALRLGAEHV
TILYRRSRADMPANASEIGEALAEGITLREWAAPLSIHAVNGALEMQAIA
MQAGELDASGRRKPVPIAGSKFTLQANTIISAIGQQLNPALAEAAALTTT
RNGLAVNPDTLQSTSDASLFACGDCVTGSDTAIRAVAQGKLAAHSIHSYL
TNQPVEAPTQPFNSSYGNREQAPKAFYAQKEAAPRVALPELPLSERQGNF
HEVAIGYNNELARTEAARCLRCKCNAINNCRLRDLATHYLFGKVEQHPEH
LGFYKAANSAISMEREKCVDCGICVRLLEEHNGNVEIAVMRQSCPTGALS
MPM
>Cag_0937 Phosphoribosylformylglycinamidine synthase I
MSAITVGVVVFPGSNCDHDTAYALASFAGVKPVMLWHNNHDLQGAQAIVL
PGGFSYGDYLRAGAIARFSPLMKEVVEFARKGYPVLGICNGFQVLLESGL
LEGALSRNRDKKFLCCQTTITPVNCSTRFTEDYHQGEVLRIPIAHGEGNY
FAPPEVLESLQEHNQIAFQYCNAQGEVTVEANPNGSLLNIAGIVNRAGNV
LGLMPHPERASDAALGSVDGRLLFESLFRSLTGV
>Cag_0878 conserved hypothetical protein
MRIKSLSYLDSHWELQNLELSKVNLIVAKNSTGKTRVLQTIDLLVKMITQ
KRDLNWGGQWDVTFENCKGDKIQYQFSTSYKKQGVTFEKMLVNGKLVLNR
VREFDSGKAHIKNVISKVPFDEVYPPDNKLVIHSNRDVKKYPYLEDIANW
ADQSYGFKFGNISPYSLLNQQEYDLLTAVEDIPTLFKSLKKEDKIEVISN
FNSIGYKITDVTLQDKGEISIISVKEKDVDKLIPHFKLSQGMFRALAVII
YIQYLISRKKPATIIIDDLCEGLDYERATKLGELIFTKCLDNDIQLIATS
NDSFLMEVVDLKYWNVLLRTGKKVQGISAISNPEIIENFKFTGLSNFDFF
SSNFLKSQSI
>Cag_0958 conserved hypothetical protein
MLTAYCGLDCEKCEAFLATQENDDAKRITIAQKWSAQYHADIKPEHINCN
GCKSDGVKFFYCTNMCEIRQCCISKGVDNCAKCSDYICDILSNFIKVAPE
AGVVLAKLRAS
>Cag_0800 conserved hypothetical protein
MQRRHPFIILALLVALALTGCSDFRSVQREELPSSETTLLAEHTELYREV
AAAVPIIHSLDGYADVWVKTPSEQHKVFCNIRLQRGGASRMVVSAGFIGL
PVADVFLTRDSLYVHDMLRNRLYVGSNSAASLEKMLNIKSSYQLLSESLL
GLVTLHEPPSAVTAVKQGGGMLLLTVKSNDSEKEVVIDPIARTLNGMMFK
EPNSDAVTEVRFRNFEAVMVEGQRALVPKEIEVARYSGVPNSTPTHTLVI
AYDERRFNTLQQPLRYTPPKKAKVLVIKG
>Cag_0146 Monofunctional biosynthetic peptidoglycan transglycosylase
MVLAAILLFLVIDIGRYAVYPNIGRLVDENPTKTAFMEYREAEWQREGLE
DKRIRQRWVPLKQVSPNLIKAVLIAEDDKFWKHEGFDYKAMEHALEKNIR
TKKISMGGSTISQQLAKNLFLSPSKNPIRKIKEAILTWRMENTLSKRRIL
ELYVNVAEWGDGIFGIGEASRHYYGVAPSQLSARQASRLAAVLPNPIRYS
PIGSARYVRNRSNIIHAIMVRRGIVLPDYNEVMELPMDSTAVDSVNIGIP
FNLLEQAINADSTLSATPSVPATEVVKQEEKSEVGASVEGNGKGVGAP
>Cag_0638 NADH dehydrogenase I chain H
MTVTALSQLSLPLFMGTTLNAWSDALAGFTPWGLPVGLLIIAAIPLVFIA
LYALTYGVYGERKISAFMQDRLGPMEVGKWGILQTLADILKLLQKEDIVP
LSADKFLFVIGPGVLFVGSFLAFAVLPFSPAFIGASLNVGLFYAVGIVAL
EVVGILAAGWGSNNKWSLYGAVRSVAQIVSYEIPASIALLCGAMMAGTLD
MQKITILQSGELGFAHFYLFQNPIAWLPFLIYFIASLAETNRAPFDIPEA
ESELVAGYFTEYSGMKFAVIFLAEYGRMFMVSAIISIVFLGGWNSPLPNI
GAFELNTWTSGAVWGAFWIIMKGFFFIFVQMWLRWTLPRLRVDQLMYLCW
KVLTPFAFVSFVLTALWEIYVP
>Cag_1966 conserved hypothetical protein
MLYEDTDNVTPNGGQECPPSFSPSCLPFLNPDCEIAMTHHRLPHWQQGDV
WVFVTWRLADSLPKVTLDEWTETRKIWLSLHPEPWDEKTEKEYHQRFSLQ
RDEWLDQGCGSCLLKDTVNAKIVVDALLHFNGLRYQLASFVVMPNHVHVL
FRPFGKYSLSEIVKSWKGFTAREINKRLGTKGVLWQDGYWDRLIRNERHF
FKVVAYIRHNPINAIQKEGGHSCPPFQCFVE
>Cag_1629 conserved hypothetical protein
MEKKISKFSEEQLNKIISEFKKESKLYEYRTKDYPFEVIHAKFGDEEDLT
KSLYVPAYQREFVWKPDKQSLFIESVLLGVPLSPFLVSDNINDNENAGRL
EIVDGSQRIRTLIAFYENKLRLIKLDKLKEINSAKFKDLPKTLQSYLYNR
DFRIIVVENAEIEIRKDLYKRINTTSEPLSDAEIRRGSYSGDFYDLVIEL
SKNVLFKEICPISEQKNDRGEGEELVLRFFAYIDNYLQFKHDVAFFLNDY
LDDKNEKGFDESFYKSAFLSMVNFVKENYPLGFRKEENSNSTPRVRFEAI
SVGTYLAIKENDSLQKPNMEWLYSSGFKIQTTSDASNNPDRLKNRIEFVR
DGLLGKLDTNRLQNG
>Cag_1535 conserved hypothetical protein
MEQPNSDNIVKTAVGVAGGSALLAPALPLALPALPLAMPVIHGLAGIALI
GAGVFAVVQAAGAISSLDNPFQPKKPK
>Cag_0525 Amidophosphoribosyl transferase
MCGVFGIYNSKTPAEDAFYGLYSLQHRGQEAAGIVVAEYNKVKKKTIFKQ
HKGMGLVSEVFKDESVFQKLSGYAAIGHNRYSTTGSSSAVSNIQPFSLIY
RSGSLAIAHNGNLTNSRTLRRELTELGVIFQASSDTEIVPHLAARSREQE
PVLQIRDALSQVQGAYSIVLLANNQLIAARDPFGFRPLALGKKVDPLTGE
LAYVVASETCAFDIIQAQYIRDIEPGEILLIDHLAVANEQPTSYFLPTTT
NKARCIFEYVYFSRPDSFIFGQSVDKVRRNLGKNLAYESTIRQEADEKEL
TVVSVPDSSNTAALGFVRESIKLGRPARFEHGLIRNHYVGRTFIQPGVQS
RDIKVRSKYNIVRGVLLNRPIILIDDSIVRGTTARMLIKLIREAGPKAIH
LHISSPPITNPCFYGMDFPSKRQLLTHLFETNEAGEQEIDKIRDYIGVDS
LKYLSMEGLLNSVPKFEGETCSYCTACFTGVYPIKEHDMTADKDEND
>Cag_0530 Hydrogenase formation HypD protein
MKYIDEYRDPALAQTLLEEIKRTVTRHWTIMEICGGQTHSIKRYGIDQLL
PKEIELVHGPGCPVCVTPLESIERALAIAARNEVIFTSFGDMLRVPGTSR
DLFMVRSAGGDVRIVASPLDALELARCNPQKEVVFFAVGFETTAPANAMA
VWQAAREGIENFSLLVSQVMVPPAMRVILSSPDNRVQAFLAAGHVCAVMG
YEEYEPIAQEFHLPIVPTGFEPIDLLAGILQTVTMLEGGESGIVNRYGRV
VSREGNPMAKQMLADVFRVVDRPWRGIGMIPASGLALRAEYVRYDAEQKF
DVSSICALESPLCMSGSVLQGLIKPDACPAFGQECTPEHPLGATMVSSEG
ACAAYYNYGLY
>Cag_0198 Acyl-(acyl-carrier-protein)--UDP-N-acetylglucosamine O-acyltransferase
MQSFIHPTALVGQGAQLGEGVTVGPYSVIEDDVVIGSGTTIQAHVHINAG
ARIGNNCKIFSGAVLAGEPQDLKFSGEKTLLIVGDRTVIRECVTLNRGTK
ASGQTVIGSDNLFMAYSHVGHDCVIGNHVVVANGVPFGGHCEVGDYVVVG
GLAGVHQFTRIARCAMIGGISRVSLDVPPFVMASGHESFRFEGLNLIGLK
RRGFTTDQITLIRNSYRIIFQSGLLLANAIEKVKAEVPQEPEVVEILEFF
TSGKHGRKFIRQFNQ
>Cag_0749 conserved hypothetical protein
MKKLIKLHIGGNEAKEGWEILNTVGKDNVDYIGDIRNLSQFESESCSVIY
GSHVLEHVSQREMLPTLNGIKRLLIPGGKLMISVPDMDILCKLFVHQNMN
IQGKFHVMRMMFGAQIDSFDFHYIGLNFDMICYYLSKSGFVRITRVKDFG
IFNDTSTYAPFGVPISLNVICYK
>Cag_1647 Elongation factor Ts
MSQISAADVKNLRDITGAGMMDCKKALDETGGDMQQAIDFLRKKGAALAA
KRADREAHEGMIQVKLANDCKRGVLLELNCETDFVARGNDFTSFTAALSE
LALSQCVASAEAMLPLALGAAYDGETVDSAMKTMTGKLGEKINLKRLAFF
DMPDGVVEGYIHPGAKLGAIVSLKTDKPELVGELAKDLAMQIAAAAPIVV
DRSGVPADYIAKEAEIYRQQALEQGKKEEFVDKIVTGRLEKYYQDVVLTE
QVFIKDDKLKVSAMLDQFRKKNQATLDVVGFVRYQLGE
>Cag_0012 Camphor resistance CrcB protein
MVDKAAHILLVGVGGFLGSVARYLVALWMAPITAVFPFATLTVNLLGSFL
IGFISELALSTSLISPSTRIFLVTGFCGGFTTFSSYMIEHSALLRDGEHL
YAALYLFGSLIGGFIALYLGIISARWMAG
>Cag_1014 TonB-dependent receptor-related protein
MNKKVFVVVMAGLLCSKGLMAADGQPMSVMGEEMVVTSSRFEEPKKNLTS
NITIIGKDEIAQSSAQDLGELLAEKNLGQVQKYPGTMTSVGIRGFRSETH
GNDLMGKVLVLIDGRRAGTGNTAKIMTENVERIEIIQGPAAVQYGSAAIG
GVVNVITKKGDNVPSFFVEQKGGSNEFVKTAAGVQGKIGKLDFASALSRS
EAGDYKTGSGKTYFNTGYDEQVLANLNVGYEIADGHRIGFGYHSFDVDKA
GSPSYLSLNDLQSYTSNNNHAIDVSYEGALTNKRWLWSTRYFSGMDSYQY
VDPSTSYTSSSDVEQQGAQAQISFTEKSLLITAGVDWLNYELTSTLAPKW
SKYNNPAVFLLAKYGVFEDRLLLSAGVRYDDYKVDLQPAEGTSRSTDNFA
HQVGAAWQVNDVVKLRASYAEGFRMPSARELAGNIVSFGKTYIGNPNLNP
EVSETWETGIDVVWREITSSLTWFSTDYTDMIETQLTAPKTYLYKNIGST
SLSGIEAEFAWKSSATTWNIEPYVNYSYLLEHKDNATGDDLLYTPEWNAS
TGVRLQHTNGLSAALNVTATGSSNVQDYESNSGKVITKGGFSVVNLSASK
KFTLDKQERRAITIKAEVDNLLDRDYQYVKGYPMPGRTFVIGLRADI
>Cag_1565 Sua5/YciO/YrdC/YwlC
MAVTLPFLMHTLLTDSPAEAAALLNGGKIVAFPTETVYGLGATINHSEAL
KQIFVAKGRPSDNPLILHIAHPDNIGKVAATVPRYAERFIEKFFPGPLTL
VLPKQPQVPELVTAGLQSVGVRCPAHPMAQALLALVNSPVAAPSANRSGK
PSATSWETVYDDLNGRIDGLLRSAPCTIGLESTIVDCTANTPLLLRAGAI
TLEQLQAVVPSLRVNQPTHNNEPPKSPGLKYPHYAPHATIELLAVGAVPL
TLPPNSAYIGIAPPPNGATLVMQCASIEAYGNALFQFFRTCDAHAIRTIY
CELPTSDGLGRAIGDRLLRAAGRM
>Cag_0002 DNA polymerase III, beta chain
MKILSSIRQLQEPVAKVAQAIPSKSVDGRYDNIHFTLEPNALTLFGTDGE
LSITAKIEVESTDSGHIGINARTLQDFLRSMYDTPVTLSIERQEISDHGM
VEVTTDKGRYKIVCLFESKPERYDKVYDITLDLPTSELLGLVQKTLFACS
IDGMRPAMMGVLFELEGTTITAVATDGHRLVRCRKESSLDIAEKQKIVLP
ARVLSILQKLAQHESITMCVSTDRRFVRFISGHMILDAALIVEPYPNYNA
VIPVEHDKNVVINRQSFYDSVRRVGRFSSIDDIRLILENDRLTVMAENTS
DGEAAQEELPCSYNGEPMTIGFNAKFVEAALAHLDDEEILIELKSPTTAV
IFTSSKIEDRDKLIILVMPVRINS
>Cag_1858 transcriptional regulator, XRE family
MSKRSNVKTLDQFVDEQYGKKGIVKREKFEKGYEAFKLGFLIQQARLEKG
LTQEELAEKCGTNKGYISKIENNIKEVRVSTLQKIVELGLGGHLELSIKL
>Cag_1394 tRNA isopentenyltransferase
MATSAMHHPPIVVLAGATASGKSALAMAIAKKIGAEIISADSRQIYWELT
IGAAKPSPKELQEVPHHFINEKHIGEPFTAGDFAVEAWQRITAIHQRDKR
VVVAGGSTLYVEGLLKGFANLPSANEAIRQRLETELQTLGSEALYQRLVK
LDPTQAATLDATKTQRLIRSLEIIESSGTSVTALKAAQQPPPSHFTFLPF
ALFLPRETLYQRINQRVDDMMANGLLHEAEALYQTYCDTWQERNLSALRT
VGYQELFAYFEGRHSLDEAINLIKQHTRNYAKRQITFFSNHLALRWIEQA
EMEEIVEQHGF
>Cag_0182 Fic family protein
MKSTYDPPLSITNTVIHLIADISAQIERYAIRMEQDDGLLWRKINRIKTI
QGSLAIEGNNLGTDQITALLEGKQVIGTMREIQEARNALKTYDAFSTFDP
YKQVDLLRAHGLLMEALVDNHGKYRRGNVGVFAGDSPIHVAPPAHIVPKL
MDDLFEWLTFNNNHLLIKSCVFHYEFEFIHPFMDGNGRIGRLWQSLLLAK
LHPIFEYLPVETMVFKNQQRYYQAINDSTNVADSAIFVEFMLGEILTTLK
NRQGEPLQCLPTNSMSGAVNGVVSGVVKTVYDFIEQNPGCRKPQIAKKTE
IPIKTLEKHITKLKAMNKIIFVGSPKSGGYYIKIGEQKTVWTTLAVAQNQ
TQ
>Cag_1933 succinate dehydrogenase, iron-sulfur protein
MNFTLHIWRQKNATAQGRMVTYTVSNISPDCSFFEMLDILNQQLILSGEE
PVAFDHDCREGICGTCSLYINGRPHGPLKGVTTCQLHMRSFRDGDAITVE
PWRAKAFPTVRDLVVDRSALDRLIQAGGYVSVNTGGVPDANSIPVAKTDS
DAAFDAAACIGCGACVAACHNASAMLFVGAKVSHFALLPQGRVEAERRVS
HMVACMDELGFGHCSNSYACEAECPKEIKVVHIARMNREFLVASLFGD
>Cag_1517 HemX protein, putative
MDNNLPLFVVNQLVTILYLATTFLYGAHFFMDMAIARQFKQPALITTVIV
HVFYLGFLTSTQGYQLSYSTLNLMTMVGFTLTSIYLLTEFTTKSDKSGFF
IISFAAGAQLLSSLLITQVEQSSQTFSGLSSGLHLLMAIFSFSSIAIAGI
YSLLYLLLFRQIQQNRFELFFQRLPNLEVLEMLCMHAVSLGFLFISLTLL
SGLWAQSNTNVAIAVMEPKFITMAVIWLIYGAGMFIKPFKGWDMKHMAYL
LIFLFLFVTLLIVLMTLSSPTFHSRLL
>Cag_1833 Ribosomal protein L30
MSDNDKTVTITQVRSVIGCTQKQKATIQALGLGRPNYSVVKPDNACTRGQ
IRVVQHLVKIEEN
>Cag_1299 dihydrolipoamide acetyltransferase, putative
MAKDHYFTWQLSAEISAKIRYQEYGHEHHGKTPILFLHGYGAMLEHWDLN
IPHFAEQHKMYAMDLIGFGKSQKPNVRYSLELFAQQIQTFLLYKKLESVI
IVGHSMGAASSLYFAHHQPEPIKALVMANPSGLFADTMDGVASMFFGLVA
SPVIGDVLFTAFANPMGVSQSLTPTYYNQNKVDDKLIRQFTQPLHDVGAQ
YSYMSPSKRPLDFRLDHLPKPCNYQGPAYLVWGADDMALPPQKIIPEFQQ
LIPHAGAFIIPKAAHCIHHDAHEAFNQRLAFILQELEG
>Cag_0394 cytochrome b6-f complex, iron-sulfur subunit
MAQQGNFKNPARLSALGDGASASSSGTVAGGKPRGDGLSGINFERRSFLG
KVVAGAGAAVAATTLYPIVKYIVPPTKVVVEENEKVVGKASEVPEKTGKI
YQFNKDKVIVINDNGKLTACSAICTHLGCLVSWKAGENLIFCPCHGAKYK
QTGEIISGPQPRPLTIYKVRVEGEDLIISKA
>Cag_0264 ATPase
MPNTILKLERIRKDLELSRSIRQTIIPNLSLSIERGEFVTITGPSGSGKS
TLLYLMGGLDKPTSGSVWLDGDELTAKNESEMNRIRNEKIGFIYQFHFLL
PEFTAVENVSMPMLINGRRSRKEIRDRAMMLLDLVELQDKYTNKPNQMSG
GQQQRVAIARALANEPKVLLGDEPTGNLDSRSANNVYQLFDRLNRELQQT
IIVVTHDEAFANRASRRIHLVDGQVAYDRQLRTEASVTNEAASLVS
>Cag_0085 conserved hypothetical protein, phage-related
MSYELEYFHPRIQKEIADWPKTIRMDYARLVELLLEFGADLKMPHSKAMG
DGLFELRAKGKEGIGRAFFCFMKGKRIVILHTFIKKTQTTPQRELDKARQ
RMKEVINAKE
>Cag_1648 Ribosomal protein S2
MLRAGVHFGHLARRWCPKMKPYIFMEKNGVHIIDLQKTAELANTALKALE
AIAQTGREIMFVGTKKQAKVIIAEQATRSNMPYVSERWLGGMLTNFQTIR
QSIRRMNSIDRMATDGTYDMITKKERLMLGREREKLMRILGGIATMNRLP
AALFVVDIKKEHLAIKEARTLGIPVFAMVDTNCDPELVDYVIPANDDAIR
SIQLMVKAVADTIINARELKVEQEVLATMDEAEVEGDKEDISE
>Cag_1675 ATPase
MKQELLVECKNVAVEFGGVRVLEDVTFSLNKGELLGIVGPNGGGKTTLLR
LLLGLEKAASGTISLFGKAPGMCPRRIGYVPQRLLFDRDFPLSIQELVLM
GRLSSKKMGERYNHHDYERAKAALLQVGLERHAKRWCGELSGGQLQRAFI
ARALAGDPELLLLDEPTASIDPQMKTTLYDLIESLKAHLTMILVSHDTEA
ISHHVTRMVSLNVTLTPLQQPFPACSHPSACYL
>Cag_1507 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_1741 Anthranilate phosphoribosyl transferase
MESKQLLQKLLAGEHCSKEEMQDCMNSIMDGEFSDSVIAALLALLQKKGV
VANELAGAHASLMAHATTVALSTHAVDTCGTGGDHGGTYNISTTASLIAC
SAGVRVAKHGNRSVTSSCGSADVLEALGFTLELPPEATISLFKKTGFAFL
FAPLYHPSMKRVAHIRRELGIRTLFNMLGPLLNPAQVKRQLVGVFSEELS
ELYADVLLQTGARHALIVHASTEEGVILDEPSLNGTTFVTEIEKGVVRKH
TLRPEEFGIAPAPLAALQGGDKEHNARIIQSIADGSASAAQRDAALYSSA
MACYVGGKCACLNDGFIVAKEALESGKTQAKLKEIIAYNQALVTEYHVAK
S
>Cag_1022 hypothetical protein
MDNTFLSNGGIVTTDFGGSDSGSSIALQTDGKIIVAGESSGGGDGGFTVV
RYNADGSLDITFDGDGKVTTDFGGLEYATSVALQGDGKIVVAGYKGISSS
GGGDFALVRYNADGSLDITFDGDGKVTTDFGGWDEAESVTIDSNGKIVVV
GYTGISSSGGGDFAVVRYNVDGSLDATFGAGGKVITNVGGEEYAHSVIVQ
SDNKIVVIGDTGISSSGASDFALVRYNNDGSLDTTFGVSGKVTTNLDFGD
FVGGATMQSDGKIVVVGESYSLADMAVSGDQDFVLVRYNTDGSLDTSFAD
DGTLVADFGGGESATSVAVQADGKIVVTGDSFPAGGSGDSNVIVVRYHTD
GTLDTTFSENGFVKTVVGESEGNSVVVQSDGRILVAGQSNGDFSLVRYNS
NGSMGLGIDFDGTPIGTPSSYESFADLYFDETQATIAGYIGSALVTLAPP
GVVHCFDEDHDGFADHFTRTWVDGNGTQSISGTTVWLDNNIFKSSGSALI
GGIPYTVNQYGRAAYDADGDVVGMYFLTVNPVFTLTADTTVGNELVATFT
IPNQAETWSLLDSDLNGVVDHVNRWNSWIDQNSVTQIRNFTYLLTWSDIT
HFTARQVGVITGTSFDALGRPLGITFSNSTPPAHILPITWLDTKGDDNVV
ATFDIPSTIFGQLLDTNDDNLPDQAVFIETSSSGQKDTATATIQGWSSWS
DISTQQVTMEIQTSTAPWNFFTGTINGTSSNPTTVIMPSYFMGNNVVVPE
TTFPSMTNNSLTFDLGTTGASLASSGSITLWSSATGTITIPVTSLTFDGS
HLTIPLSGTDVATNQPYHLYPNSVDNFRVQIPAGVVIGEPTIDKAWFVGE
WNNIGYELSPMMIVYNGDGTTDADWVLGTSGNDSVAAGAGDDLMKWSAGN
DTIDAGDGYDKLYMPKAVPSANYITKTDSQGVLHIGEVNAATNTIIADAY
RITRLAAGSFQIQKMDSTGTTVTQTMLLNNAEVLHIGPPSNYTSVALTIN
YANEFIYGTPWRDTIYLNASNISTLSQIWAYSSTDTLAIDVGAGYSKIEV
VREGSTSLLKGTLIADGTVVDLGSFSKALPSQYNYTATMSIGTGESAHSF
TINNIEAYRFTSGDVILTVDPIPPTVISFTPSDNATAIAVGSNIVLTFSE
TVQAGTGNIVITDGTDVRTIAMTDSSQVSIAGNTLTINPTADLAKGMHYF
VMLDAGSIEDLAGNDYAGTTSYDFTTIVGGVITTDFGGDSFGCGVTIQAD
GKILVVGGGSNGDIALARYNMDGSLDTSFGNETGKLTTDFGYEDAALSTI
IQSDGKILVIGESVINGGSYGKCIIARYNIDGSLDTSFDGNGKVITDFLD
GLNVDGFYTTEAILQSDGKILIVGGGYQSGNSLVTLFRYNSNGSLDSSFG
NNGMVITPSISSLNSSDFPYGVVQQVDEKILVAVRSDNNAAITLIRYNSN
GTVDTSFNADSMQITGLKENDMDGGLVLQVDGKILVSGSSNGNIILVRYH
SDGTLDSTFNGNGNIVTDLGGNDGVGAITLQPDEKILVSGYSNNELALLR
YNTDGTPDTHFCNNGVVLTNIGSDSFHEITFAGWGITVQADGKILVTGQS
NGDFALVRYNTDGSLDTSFDGVSEPTPPTHDLSGHITFWKTGNAISNVQA
TLATMPMQPASDDVAFRNLQHQSNGGYTVELWATTTQTELQSIQLAMQFS
DNVTAQWQQSSAVPTGWLSVINNTQAGHLEIGTIGQATMQGDEEIMLGTL
TFSAPDNPNNFTLAVTSGWLGDNSIAPTSILCTATDSEGNYSFETIADGW
YQITGESNTAKLADAVTAQDALAALRMAVALNPDGGNNLDEVSPFQYLAA
DINRDGKVRANDALNILKMAVGIESAPADEWIFVAESAASKTMDRSHVDW
SFAEQPIDVYGDMELDLVGVVKGDVDGSWGMVG
>Cag_0536 Hydrogenase expression/synthesis, HypA
MLISGIRIAQCLLLVNNCKYSPANTMHEMSIALSIVEAVEEQARKEGAQK
IIALELVVGKLAAIQVESLTFCFAAAAKGTLAEHAALIIEEPEGIGKCEE
CGKEFPVNFYYAECPQCRSLRINIVSGEEFRIKAMEIC
>Cag_1874 conserved hypothetical protein
MLYKYLFLMKFISFLKSSALALFLTLPLALPAEAGWDPAAEDRARVSVEY
FKTQWTELDRYFSQAYGYAVFPDVYKGGLFFIGGAHGKGYVFEQTRLVGT
STITQLNAGPQLGGQSFSEIIFFKGQEDLERFKQGNFEFGAHMSAIVVNQ
SIATNTDYSNGVAVFVFPKAGVMVEASVGGQKFSYHPN
>Cag_1415 DNA helicase II
MSNFLHDLNEVQRSAVEATSGPVMVLAGAGSGKTRVITYRIAHLINNEGI
APRNILALTFTNKAAGEMRERVDTLLHHGASRGLWIGTFHSIFARLLRNS
IDRIGYDRNFSIFDADDSRSLIRQSMAELDISADAVPLNTLQSIISRAKN
SFVMPAEFQRNANDYNQQKAAQVYSLYCKKLKENNALDFDDLLIKPLELF
NAHPDVLHELQELFRYIMIDEYQDTNRVQYLVAKMLGARHRNIFVVGDDA
QSIYSWRGADISNILNFQDDYHDAQTFKLVENYRSTGNILKAANSVICRN
QRQIKKELVSHRHAGEPLTVMEAFNERNEAEKVADRIRTMRMSGTNDYRS
FAIFYRTNAQSRVLEDIMRQQRIPYRLFGSVSFYKRKEIKDAVAYLRFIV
NERDSESLLRIINFPPRKIGDVSIAKLRDFAEVRHISLYEAIHRAAEAGF
PARLLNALASFTSVIEALREMATRGTVYDVLNELFTLTSIPLLLQAENTP
ESLARHENLQELLSMARDFADHNPDGGSLGDFLENISLASDYDETQESDN
YVSLMTVHASKGLEFPVVFITGLEERLFPLHTYEPEELEEERRLFYVAMT
RAQEKIFLSYAKSRYQYGQLHQSIASTFISEIDASIVQSEGGRLLSDRRA
PREATTPQNHAAAPAMRRPTTSGMAPSSASSPTAESSSPSISNGTLVHHP
LFGQGVVLEVQGKGSKQKVRIAFRNAGEKTLMVQYANLKIQTS
>Cag_0860 Fe-S type hydro-lyases tartrate/fumarate alpha and beta region
MEQFKASMLALITETSANLPSDIRRALAAAIQREDAASQAGLAMSTISVN
IDMAVDNISPVCQDTGMPTFFIHIPKGVDMLPLKRDIEEAIAEASKTGKL
RPNAVDAISGKNSGNNLGEHVPVIHFEPWDKDEIEVKLILKGGGCENKNI
QYSLPQEIPGLGTASRDLEGVRKCLLHAVYQAQGQGCSPGVIGVGVGGDR
TSGFELAKKQLFRMLDDRNPNSELDALEQEIMEKANNMNIGPMGFGGKTT
LLGCKIGMSHRVPASFFVSVAYNCWAYRRLGVLIDPATGAITHWHYREAD
EIKRMAQGEGMPLTGREVVLQAPVTEEQVRSLKVGDVVLVNGTMHTGRDA
FHHYIMHHDLPAGLDTNGGILFHCGPVIMKNEDGSYRVTAAGPTTSIREE
PFQADVIKKLGLRVVIGKGGMGPKTLAGLQEHGAVYLNAIGGAAQYYARC
IEKVTGVDFLEEMGVPEAMWHFEANSFPAIVTMDAHGNSLHQQIEEESFN
ALASFK
>Cag_0257 ATPase
MRLKSMRLENFRAVEHAVIEFGNRLTLLIGANGSGKTTILDGIAIALGAA
LTYLPTLSGRSFKKGDLHQRHNSIAPYTRIALETTTGLKWDRIQRRDKSK
STSKLVPAADGIRALEQFLDATILEPMNQGSDYLLPLFIYYGVSRALLDV
PASRKGFTKKQHRFDALVHCLHADSRFRSAFMWFYNKELEENRLQKEKKS
FEVTLRELDVVRSAITAMFPDISEPHIALNPLRFVVRQQGELMDIAQLSD
GYKTLLGVVIDLSSRLAMANPHLDDPLAAEAIVMIDEVDLHLHPSWQQHV
VGDLLRTFKNTQFIITTHSPFIVEAINNHIKRQQIEGLPINNNEVNQLLP
LRSSDVKAYLMSDTIEMLMNNDVALLDDKLLEYFNSQNQLYDKMRDLEWE
HKG
>Cag_0555 conserved hypothetical protein
MSIWMLHHKHHRHSIRLPEYDYSTCGAYFITICTQNRACWFGEIINGEMI
LNNVGKMVKDEWLKTEQLRTNVQCGAFVVMPNHLHGIIVINETVGAIHEL
PLQMSQKQRRNMILPKIIGRFKIQSSK
>Cag_0265 conserved hypothetical protein
MALLESIRQSAPALHRAFASLADGEQHLQRLVELDGYWERLKQPPARVVL
PASALPCNAAISGEYDLIYAGGTLSMFHAAVMARRYGHSVMVFDRHTPAT
STRDWNISWEELLRLRDVGLFSEAELDSVVVRRYRDGWVEFYQPDGKQKR
LTIEHVLDCAVETSTLLGMAKVKLLEVPNAAIFGGYTFQRCYQLPDGVIV
EIIDSKGERLFYKCRLLLDVMGILSPIAMQLNEGRPQTHVCPTVGTIASG
FEGVDMEVGEILASTRPADVENGTGRQLIWEGFPAKGSEYITYLFFYDSV
ESANNKSLISLFDTYFRLLPEYKQMGKNFTIHRPVYGIIPAYFHDGVSCK
RTIAADNILLLGDAASLSSPLTFCGFGSLVRNLHRLTAGLEQALAANQLS
QEQLTTISAWEPNVAAMANLMKYMCFNPETDSPNFVNDLMNEVMIVLDSL
PHRYRQAMFRDEMKIEELVEVMLRVAWRYPKVLSATWTKLGVTGSIGFFK
NLAGWALGK
>Cag_0855 conserved hypothetical protein
MDAKEKVLEVLTQESEPLNAGKITEISGLDRKEVDKAMKALKDEGAIVSP
KRCYWTAAGK
>Cag_1755 outer surface protein, putative
MKKRVAGIVAGCLALACYATPAQAQMPYISGAVGLATLSDVNNVAQGAFE
DGHRLMGAFGIDSGSTRIEAEIGVQNNGVKTLADDIKITTFMGNLYYDFE
LPMAPIKPFAMAGAGMVDVEQKQLGQDTSFAWQVGAGVGFSIIPMVTVDL
QYRYFATASDAQLGATDYSIDASHVMLGLRVGL
>Cag_1096 Glutamate 5-kinase, ProB-related
MAQLLSSRYKKIVVKVGTNVITNKNGELDLEVLQSLTSQIATLQKAGIQV
ILVSSGAVGAGRSLIKLPPTFPTVAARQVLASTGQIKLISTYNALFEQHG
LLTAQILVTKSDFRDRLHYLNMQTCFHSLLQQQVIPVVNENDAVSVTELM
FTDNDELSGLIASMMDAEGYLILSHVDGLFDMKAGDGTIIKVVDPSTKHF
HQYISPGKSEFGRGGMLTKCHIAHKLSRLGISVHIANGRTPNILLSILNG
ENVGTTFLPQKAAHAPKRWVANSEGLEKGAVTINKGAEVALVADERANSL
LPIGIVSVEGQFLKGDIIKVCSIEGEVLGYGVAQYSSEKTLTLMGQKNQK
PLIHYDYLFITQ
>Cag_1391 putative type II restriction enzyme
MNQWIELSIEYANQRSYLDDLFSVYSTIPDSIRTINEKLWSNVERAFYEK
DNLSLIKELLLLDLFPIKDSYIAYLKKDITAIDRNPRTINRICGRLYEMG
LNKIYEKSSEPKETNRQIGPMFRNWMRKKSLGIEPVDLSTFMNNEDDAVL
DASDKVMMDFAREYLGYQHNKGLDLIARFNGTYVIGEAKFLTDFGGHQNA
QFNDAINTIEAKGVNAVKVAILDGVLYIKGKNKMYKAITSFYKDHNIMSA
LVLRNFLYSL
>Cag_0349 IspG protein
MYHYRRRFTREVPFGKIALGGYQPIRVESMTNTHTMDTAATVEQCRRLYE
AGCEIIRLTVPTEKDAENLRAIRDQLRRDGITTPLVADIHFSVRAAMKAV
EFVENIRINPGNYAAKPKFSAEDYTEAEYQAELEHVRQEFLPLVEKARSL
GASMRIGTNHGSLSDRVVSRYGNSPEGMVEAALEFARMCEEVGYYDLLFS
MKSSNVRVMIQAYRLLVAKADAELRYAYPLHLGVTEAGDGDEGRIKSAMG
IGALLEDGLGDTIRVSLTEDPVNEVPVGFAIVKKYNDFHLIKGDAGCIPL
KHVVESRNPSATSTTQSQELPPCNPYSYQRRVALPTTVGAYTLGGDEVPR
VETEVHTLVCYRDAVYKEVAARLALGKSRETLCSEYVSVEVFCGTDIDAL
DELLLRLGDDAERVVAFTNDVSLYSYLYDKVSKVRFDIAEDAHLHSEFYT
YFNNDRRAVLEFCFLHERTTDTVPAAVLSHLASKLQQQGVERVMFSVLSP
NPLFVYRRLVQQFNEKSLHYPIVVRYRDEYPNDDNTNRLESLITIAVQTG
TLFCDGIGDALALHTSLSVDDEINLAYNILQGARIRMSKTEFISCPGCGR
TYFDLEKATAAIKLRMGHLKGLKIGIMGCIVNGPGEMADADFGYVGAGKN
RISLYVGKECVEENLPEDEAVDKLTMLIKAHGKWVDPV
>Cag_0708 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFALPALTHSK
>Cag_1910 TolB protein, putative
MLPMPFFRKHVTCALLLCTPMLALPHPIIAADTEYIAIRKAGSTSIALVL
DTFEVTTGTSPALARQATTLVRDGLDFTGLFTLLQPPLNVKTSSLFSSST
INFKALDSIGGAFYAVGTLSSSGGEITLDGQVFEVATGKVLFGKRYRGTE
SQLRALSHAFSGDVVEFLTGKKSVFGSQIVFISNKSGSKEIYSCDFDGAN
VHQLTNFRSIALTPALSPDGAYLAFTDFTGGKPALAIRELATGKTTRVAK
KGNSIDPAWRNSRELATTFSFEGDQELYLLDSAGAVKQRLTSSSGIDLSP
TFSPDGRKMAFVSARSGNPQIFVYDFSSGKSQRLTFSGRYNTQPAWSPIG
DKIAFSTWESGGEINIFVINTDGSGLTQLTTQSGENESPSWSPDGRMIVF
ASNRQGVKKLYVMMADGKNQRPLLAIGGEQTQPSWSLFSR
>Cag_0607 Phosphoesterase, PA-phosphatase related
MMWFHQHQEATLRTIFKQITTAGESHWYIILGLLLFIIFRKHLPKVASQG
ALLASSVVVSGIAALLFKTTFGRARPKLFLSDGIYGFNFFEIEHAWISFP
SGHSATAFSVAMVLALCYPRWRWFWFAGGALIAFSRLILTQHYLSDVIAG
SILGAFSTLLLYHHIFKATLHAPTNAKEL
>Cag_0011 conserved hypothetical protein
MAVGNKRNDWFDATGHSKGDWMFQGNFQRTINYLSEHQKILLFNPFCHSV
DLLQGSENVYKWLFRVNDPQNNPFEVIFFVEQLEELLLDVPSEINCSDPS
TLTPEMIELYTTGKKITWRHYDAQEEVDDPSRYLFEGKVFADMLMEAKDN
DRTRVQIDLRVDVRFVLYPAFRIIPDPVLHAMVNGGMSILMQTATNRMFQ
AISKDFHSITPT
>Cag_0298 Glutamine amidotransferase of anthranilate synthase
MIFVIDNYDSFTYNLVQYIGESGAEVEVRRNNECSVAEVLALAPEKIVIS
PGPGTPNDAGISIELIHAVKGTIPLLGVCLGHQSIGAALGGNVIRAPHIM
HGKTSQVYHDGNGIFKNVANPFTATRYHSLIVERESLPEALTVRAWTEDG
IIMGMDSVALKLYGVQFHPESIMTAEGKKLIGNFLEL
>Cag_1650 Ribosomal protein L13
MSMSKTLSFKTYSAKPAEVERTWYVIDAEGLVLGRMASEIAKVLRGKHKP
QFTPHIDTGDFIVVTNAEKVLLTGRKSEQKTYFTHSHYPGGVRIDEVKDV
IRTKPERVIENAVWGMLPHNNLGRQLFRKLKVYAGTNHPHASQSPIEMKI
N
>Cag_1559 transposase for IS1663
MKSNSLSVKTSPVKYSFGLDVSKAKIDVSFCTLDDQQQVKVYGSHSFSNT
NKGFVELLLWCHKKCKETLPTVYILEATGVYHEHVAWFLHDHDCAVSIVL
PNKACHYKKSLGLRSKTDSIDAFGLAKMGAEQNLPIWETPDKTLRELRII
TRHREDLVTDKTIILNRLEAFEFCHNGSALMIKQLKKQLSLIEKQIEEID
QLVKETVEENAELKARFDKILAIKGVGLITLATIISETDGFSLITNQRQL
TSYAGYDIIENQSGNHTGKTRISKQGNSHIRRILHMPAFLVVKYEPQFAN
LFERVYERTKIKMKAYVAVQRKLLILIYALWKNGTVYQSTAQPIIASKLC
A
>Cag_0667 hypothetical protein
MKRNIAFFITSNVENSKSFKLAKSHFVKLSDRFGFNVYKTKYLNLNLFII
YEDELEIEVAENEISFPIGNLKKQSLNYDRFLKVRINSSSILIENDYAGT
IPVYYSTRDYISLSNIEPCVVLDSKTSSKDISYENIYGFMRYMHFIWDET
AYGHIFTMLPDSVYSFILPDFSIESKYLQTVKSSKENINLSDEEVASKLN
QLNDELVYRSLSHYDQIILPLSSGYDSRMILASLSKQKELKDKLYCFTYG
SDGSVEVEAGRRLTSALGVKWAFIDLPLEFLTKDYLYDIHDVFGSSLHMH
GMYQLEFFNEIKKRIKKEKNSCLTSGFMTGVPAGQHNSLLGITNDSSKLT
EHMNKFSQSKYWTDIQMEMMKAFKNKNFINKAEERFRMAFNRFDGEIYQK
AVMFDVWSRQRNFIGYYPRTLEWKIPTVSPHMTADYANFFMSLSKKHLDD
RYAVELMFSKNYEELSKIVSNSNGLRSINSKFENTMFFISRVFRKFKINN
FLPKKYANNDFEFSLPSVRHSGKDAIFPLLAKDEFVNKIIGQIISHDAIY
ELYAKAHSGDVPSYEKLVGLQSLAMSLFLIKDGV
>Cag_1921 Twin-arginine translocation pathway signal
MFFQCVMTNHQLSRRDFAKLLLSSTAGALLGVGVPSSRTYAATNRVVIIG
GGFGGATAAKYLRKLDPSVAITLVEPKCQFYTCPISNWVIAGLKPMHAIA
QNYNALRVRYGVNVVHATAVAIDALKNSVTLHSGKKLFYDRLIVSPGIDF
RWNAIPGYSQKVAESVMPHGFQAGEQTLLLRKQLLAMPNGGTVIMCPPNN
PHRCPAAPYERASLIAHYLKQHKPKSKVLILDCKEKFSKQELFLQGWERL
YSGMIEWRAATAGGKVEAVNSAAMTVTTEFGDEKGDLINIMPPQQAGRIA
FEAGLTDAAGWCPVHPITFESTLHPGIHIIGDACHAGDMPKSAFASSSQG
KVASSAIAALLQGRVPVAPSLVSTCYSLLKPDYAISVANVFRLTIDGIVD
VKGSGGVTPLDASVEHLQHEADFAWGWYENITRDTWG
>Cag_0980 conserved hypothetical protein
MNTFIYQQENWPHFTWQNEEIVNLLSEARHLQGKLIGKMESLGFDLRNEA
LLDTLTLDVVKSSEIEGEYLNSDQVRSSIARKLGIEIACSVESDRNIDGV
VEMMFDATQNCYNQLTTERLFDWHAALFPTGRNGMYKINVADWRKDTTGP
MQVVSGAMGKEKVHFQAPDSSLVEKEMNLFMDWFNSQVTTDLVLKAAIVH
LWFVTIHPFEDGNGRIARALTDMLLAQSDKSHLRFYSMSAQIRIERKEYY
EILEKTQKGFLDITEWIKWFLSCLINSLKASESVLLNVLFKASFWDKQSK
TLINERQRKLLNKLLEGFDGKLTSSKWAKIAKCSKDSAIRDINDLIDKNI
LQKESAGGRSTNYALKQ
>Cag_0668 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation)-like
MTLKLLACGDVVNFSAKQDFVDNKLTQIIQNSDIAICNFEAPIFHQDMRP
IKKAGPHVYQSKESVQYLKNVGFNFVSLANNHIYDYRQKGIFDTIQELKK
FDLEFIGGGTTFEEAYKTTIIEKNGITIGLLAGCENEFGCLYEEQNRGGY
AWLFHHKIEDNIRELKAKCDFIVFISHAGVEDIELPIKEWRDRYQRLCDV
GVDVIIGHHPHVPQGYEKHNKSLIFYSLGNFYFDTTSFKNKSDDSFSLLI
EFEKNKNIGFEIIYHKKINGQTVLVDKKDVSFDLDYLCGILEDGYLKRND
EISIELFNKYYYRYYQEALGVIACNLNFLSQIKWLIKKIIFSKKEIDNKN
LMLLHNIRIDSHRFIVQRALSLLSERI
>Cag_0036 Glycyl-tRNA synthetase, alpha2 dimer
MSNSEQQRVQRAALSPDKVMSKLVSLAKRRGFVFPSSEIYEGLSACFDYG
PLGSEMKRTIKELWWNAMTRRHQNIVGIDASIMMNPRVWEASGHVASFND
PMIDDKTTKRRYRADHLIENHIEKLRRDGKEEALQRVQATYEAVGSTEDS
NRALYELILAEGIKAPDTGSADWTEVRQFNLMFQCNMGALSDRSGIVYLR
PETAQGIFVNFHNVREASRMKVPFGIAQIGKAFRNEIVKGNFTFRMVEFE
QMEMQYFVKPGTQLEAFEAWREERFAWYVNSLGITPSKLHWYKHDKLAHY
ADLAYDIKFEFPFGIEEIEGIHSRTDFDLKQHQEYSGKNMEYIDQTTNER
YIPYVVETSAGCDRLFLALLADAYTEDVVDGEERVVLRLAPRVAPVKAAI
LPLMKKGGLGEKAMALCNELSSSYLVQYDDAGSIGKRYRRQDEIGTPFCI
TVDHQSLEDESVTVRYRDSAAQERIALSRVQEFLATNLI
>Cag_0461 electron transfer flavoprotein alpha-subunit
MKQTALFFLEQREGVIRKASLQLWNRLLALSSSEPSFRVVALLAGAANLD
ACNGRLAAHGELLHVSNELFNRYHPERYAQLIASVADREAATSIFFADTT
LSRELSPKLSLMLHASLLSGCTSLERIMQDGAASRALYAGTVQGSFMPLT
ERHLYTVLASSTLPSAAYSCSPIIHPFPYNEVEMVHDFGMLLKLLTMRQG
ALDVAEASVVVAGGRGMGSAEAFGMLEELAVLLGGAVGATRQAVDAGWRP
HSEQIGQTGKSIAPALYVACGISGSPQHFAGISGAGTIVALNSDPHAPIF
SLAHYGLVGDVHQLLPKLIFLLQECLQKK
>Cag_0685 conserved hypothetical protein
MAVTLKIHELAHQELLDAIAWYNEIQSGLGKRFQETIMLQIQKIKQHPTW
FPRETIEVFKAYVPRFPYKIIYSVNDEAITIWAIAHLHRKPSYWQSREKS
>Cag_0896 methyltransferase, putative
MPKTVPNSPHTAQSTQWFEAWFNHPLYMEVYRHRDHNEATQCIRTILQHT
ALEQATPATTTVLDIACGAGRHAIELARRGYNVTGNDLSTTLLNEAAKAA
KQEKLPLQLTNYDMRHVPTHQRYQLVVQLFTSFGYFDSKAEDGAVVQKVW
ELLHKNGWYVLDLLNPDYLAANFIAESQRQVGELTIKEKRTLEANRVRKE
LCILSPSGETLHFSEAVWLYSADEIVDILHNVGFHTTEIVGNYDGSTFNA
TSPRMMLFCHKA
>Cag_0485 hypothetical protein
MKLGSPATGNDFFGREQELRDLWRYLESDHIRFPGVRRLGKTSILKRLEA
DAAEHGLLAKWVDVSNIDSAKGFVALLEQAFPENTIKRFLSDKTKQVADW
FKIIRKVEVTLPDEAGGGGFGIELGEVLLEWQHAANHLHHRLSNQPLLIL
LDEFPVMLEKLIQRNRQEAEQLLTWLRIWRQSQGACRFVFTGSIGLQSLL
ERHRLGETMNDCFAYPLGPYKPSEARDLWKYFAQNADENTWQVTDLVIDH
ALSRVGWLSPYFLCLLLDESIRAARERQEEWPSKASGAASIEVEDVDDAY
EQLLAERSRFHHWEKRLKDALAPAELDFCLSLLTHLSRKPEGLTLNQLSS
RLAKREPDPDCRAQRIQELLVRLTDEGYTSSPDSNKRVQFLSFPLRDWWN
RNHVR
>Cag_0710 hypothetical protein
MKKSAIKKALCKLDKETIEKQMEALMELVAEPRYICRKCARVASTKRHLC
KPVAITNSNGSKKRAAKVLNNGVVPPNALT
>Cag_0399 Redox-active disulfide protein 2
MKQIKILGSGCAKCNQLADVVKNVVAQEGIDASVEKVEDIQQIMAYNVMT
TPALVVDEQVVCKGRIPSQSEVKEMLTAPAKGCCCGGGKSTGNPTGCC
>Cag_0763 Exodeoxyribonuclease V, RecC subunit
MSAFLSIKSIHDKNSSLTFFLISNQFQSGTMALHLYTSNRMEMLVDSLAE
VVRQPLASVFEHEVIVVQSRGMQRWLSMELAGRFGVWANGRYPFPNAMVQ
ELFKQLLPSVAQSDAFKKEVMSWRVMRLLPHLLEMAEFLPLRRYAADDSD
GLKLFQLSEKIADTFDQYTLFRPDMLALWEAGGGVAEGGEAWQPLLWRAL
VEGAGLHRGQLRELLFRQLSRSSSKISELPERITLFGISYLPQFHLELFA
AVARLTEVHLFLLSPTQEYWGDIVSRKAMARLSEAEQALRSEGNPLLASL
GRIGRDFSEMVLEMSDEALDSQEFYDDPPEDSLLHALQWDILHLQGAGEM
DETPRLLQPHDRSVQIHACHTPLREVEVLYDAILGLLEAHPHISLRDIIV
MTPDIESYSPYIATVFGTAREAGKEGKGVVALPFSIADRRMMHEGEIASA
LLKLLALHGSRLTASMLFDFLASPPVSRAFGFDAEALRLIRGWIEGSGIR
WGMDEEDRRERNLPAYRDHSWRAGLERLLLGYAMPEEEQLFQGVLPYGDI
AGSAAEMLGRFAEAVEALERFVSSSEPSRTLEAWRQQYAMWLTTFFAPDE
DSEREFATLATLGEELAEYGINAGFEENISPLVFFTWLRSRLEEQEQGLG
FMTGGITFCAMLPMRSIPFRVVMLIGMNDGAFPRQSRAPSFDLITRQPQK
GDRSLRNEDRYLFLESILSARELLYISYVGQSIRDNSEIPPSVLVSELLD
AVRRAFVLPNESSIEQHLVVRHRLQPFHHDYFSEHSPLASYSSENYYALI
ASEQSLQAVPPIRSFISTALSEPTAEWRTVQLEQLLHFYDNPSAFFLEQR
LGIKPEGLLLPLQDSEPFAVESLERYRLQQELLEAQLRGQPAEALLPLFK
SRGMLPPAQHGELLFATVMQEVDDFAATLRQHLAGEVALAPLEVDIEVGE
FRIVGRLDGIWANAMLRYRPARMKVRDRFRWWIEHLLLCALQPTGYPLTT
HMLMSDGEWSYPPIDNPHQHLTTLLQRYWQGLCEPLPFFPRSAYAFVLKG
MDANHHLDVGKGIDAAYREWRDDTFTNRKGEGSDSAIQRCFGAAANPFSD
TFIELALELFTPMMEAMGAMGDGKRSG
>Cag_0124 conserved hypothetical protein
MKFIAIMGHEETRPQVRALFQKYQVHLFSNLSIKGCSCEQKGGEQPTWWP
SNEMPTSYTSLCFAILEDEKAEALMTELEKNPIAIEKDFPARAFLMNVER
TA
>Cag_0511 metal dependent phosphohydrolase
MIAEHLLFHSDGGFIRIPVWGHIPLSKPLKSILSHPLFLRLKGIRQLSFS
QQVYPGATHTRFEHSVGVYHLMKLILQRMVTSSLAQKLQTEHFRFDDASC
RLLLASALLHDIGHFPHAHIIEEQIPRVGNEVVFSHHEELCRYFLEEEHP
NHPSLATLLMEEWRVDPNDVVALISGKHRLSKLISGTLDPDKMDYLMRDA
HHCNIPYGSIDIERLIESFVPDPERQRFAITEKGIAPLESLLFAKYMMMR
NVYWHHTSRALSAMLRRLLQDIAEAELLPAATLRELFYRNADDRVLYELK
LLLPEATHPLVALLEDVLMRRVYKRAITVQPYLQSSGKEDERWFLYSNNS
ALRRSMEVEICELLNKRYQLNLHGYEVLIDSPSRKDIFDYADLQELRVYP
TRSEHIHYAMHCASEYVRFDELNESVFQSNFILSFERYTKKFRLLCRPDL
VAHIVELRHDIMSLLAHDYPLFHSTVSSSATEHS
>Cag_0058 Cell division protein FtsA
MLKSNIAVGLDIGTTKVCVVVAEKDESGKLNVLGKGRSNSDGLQRATVVN
INKTVASIKAAVADAERESSIRIRGVNVGISGAHVHCIHSNSEISINQTG
IVNESDVRRFLEKAKTNIRYLDIDHEIIHVIPQEFIVDDQEGLLDPIGMA
GTTMRGSAYIVVGLKTKIRNIRQCVEKAGLEISAITFEPVASGMAVMKER
EKKSGVVVIDIGGGTTEVALYIDGAIRYSEVIKVAGNDVTHDIAHGIRAL
YEVAEDLKIKHGCAHSKLMGDDEELLIEGIEGRPQKSVPKSSLTMIIEAR
MLEILELVRDSIKRSGYYEYLNAGAIITGGGSLLPGTEELAHEVLGLDVR
KGTPEGVSGGIKEAVNNPMYATVMGLVAHSFENNLSAAHSYDPVIEPMPK
EELPSQQEEVVVPQEPQSPAGKKIVDRFKNFWDNL
>Cag_1323 conserved hypothetical protein
MLAHAQQIYDEALTLSPIEKVELIEHLYFSLDSKNSRQELDKLWAEEAED
RLTAYENGEIKTTPASEVFAEINSMRPQ
>Cag_0267 hypothetical protein
MNTLSLTLPESLHKSAREIAEKENITINQLITSALAEKISALGAEDYLEM
RARHASKAKFLNAMAKVAKIEPPDYDRL
>Cag_0402 2-vinyl bacteriochlorophyllide hydratase
MPRYTPEQLARRNASKWTTVQAILAPIQFLMFIAGLTVTYLYKEGIWIDD
FTWITIFVTLKTFMLVLIFVTGGFFELEVFGQFAFVHEFFWEDFGSAIAM
IVHIGYFVLFFMGLDESTLIWTALLAYLSYLINAAQFVIRLLLEKHNEKK
LKQQNAL
>Cag_0106 aminotransferase, class V
MKKRLFTPGPTPVPENVMLRMAAPIIHHRNPEFMEILARVHADLQYLFCT
TRPVVVLSCSGTGGMEAAISSLFNEGDKVLTINGGKFGERWGELVRTFTG
NNVEEMVEWGQAIQPEQLLALLKAHPDAKGICLTHSETSTGTATDIKALS
ALIHEHSNALVLVDGITAIGAHEFLFDAWGIDICITGSQKGLMMPPGLAL
VAISERAEAAIKARKSSKNYYLSLKKALKSHAGDDTPFTPAASLVIGLDE
ALQMIKAEGIENIWKRHESLASACRQGCQALGMKLFSSSPSFAVTPVWLP
EGVDWKAFNTALKVNNGITVAAGQDAYKDKIFRISHLGYYDELDMLTVIG
GIERAFKEINHPFTIGAGVQAVQQAFLGN
>Cag_1016 conserved hypothetical protein
MKPFSGTVLVAGATGRTGAWVVKRLQHHAFDYRLFVRSGEKALELFGAEV
IDKLTIGSIENTEDIRAAVRHADALICAIGGNAGDPTAPPPSAIDRDGVM
RLAQLAKAEGVRHFILISSLAVTRPDHPLNKYGQVLTMKLAGEDEVRRLF
SEAGYCYTIIRPGGLLDGAPMEHALISGTGDQITTGVIQRGDVAEIALLS
LINPQAINLTFEIIQGEEAPQQSLDAYFPQA
>Cag_1977 MazG
MPATIDELKAAILKEHERSVPEGFQRVLDLVRVLRQECPWDRKQTAESLA
HLLLEESYELVHAIDQQETDELKKEIGDLFMHLCFQVQLADEQGHFSFNE
VFDALCKKLIHRHPHVFGSTEATTEKEVLQNWEKLKLSEGRKSLLDGVPS
AMSELLRAYRVQKKVAGVGFDWQSDEGVIDKIVEEIQELKQAATQNEREE
EFGDLLFTLVNYSRFIGTNPEDALRKATNKFMQRFRTVELLVAESERPWQ
EFTPEELDTLWQQAKEK
>Cag_0043 conserved hypothetical protein
MPPAIKIIILANVAVFMLQRLPWGGELLSAFASLWPIGTGNFYIWQPVSY
MFMHGGLTHLFFNMFALWMFGAEIENYWGTRQFTIYYFICGIGAALINLI
ATMHSPYPTIGASGAVYGVLLAFGMMFPNRYIFLYFFFPIKAKYFIAGYA
LLEFVSGLGSREMGSGSNIAHFAHLGGMLIGFIYIILKRNDWALDDVVQK
MRSLRSSGGKKSSPYQANRNSSNSNKLTPTDDEINAILDKISAHGYASLT
DEERRKLLRAGGNG
>Cag_0591 proline iminopeptidase, putative
MSYFASTRCRLYYEDSAEGDPSALSKPTIFFVNGWAISSRYWKPLVSILS
DRYRCIIYDQSGTGQTLIKGYNPTFTIQGFTDEASELLEHLELHHSRNVH
IVGHSMGGMVATDLCMRYPDALVSSTIIACGIFEETPFTSVGLMMLGGLI
DVSMNLRSIFLMEPFRSMFINRAVAKAISKEYQDVIIDDFTKSDHAATNA
VGKFSIDRNVLRTYTRHVLAIQAPLLCSVGMADQTIPPEGTLTLYEKRKA
KSELQTSLARFEDLGHLPMLEATELFAQVLDKHFQQAQQLL
>Cag_2024 Phosphoribosylformylglycinamidine cyclo-ligase
MQMDYKKAGVDISAGEEFVRMIKPQVRQTFTPNVITDIGAFGGFFMPDFS
RYRKPVLVSSIDGVGTKLKIAIELDRYNTVGSCLVNHCVNDILVCGARPL
FFLDYYACGKLTPAIAASVVTGMVAACRENGCALIGGETAEMPGMYNAED
FDLAGSIVGMVDHERIINGSKMQAGDIMLGLASNGLHTNGYSLARKVLAG
RMHETISEANETIGEALLKVHRTYLPIIEPLLESPDIHGLSHITGGGLMG
NTMRIVPEGLKLEVDWQSWQEPLIFDIIRREGNVPEEDMRRTFNLGIGLV
MIVAAESVERILANLQSRGENGYIIGQVAKS
>Cag_0014 Holliday junction resolvase YqgF
MPLYQRIVAIDYGTKRIGVAKSDPLGMFAQPIGTVDRAGLSKLLSPMVEA
GEVQLVVVGYPLNRHGEQTAMTEVIDRFIESLRLEFPALPIETINEHCSS
KSAMQLLVASGTSRKERKTKGRLDTAAACLLLSDYLEQQK
>Cag_1144 conserved hypothetical protein
MKCPACHTELLLAERQGIEIDYCPSCRGVWLDRGELDKIIERSASYSSEK
RYEKEAYHQHDKESHYRYHNDDRDDYIDPKSGKRKRKGGPLGFLEDLFD
>Cag_0706 hypothetical protein
MNIDHIASEALKLDPKSRAILAETIWESLEDPYLEFSDVTDEEAISLAMK
RDEEIETGIVMPLSHKGLMNSLRRHED
>Cag_0214 putative cold-shock DNA-binding domain protein
MGKVEGTVKWFNEEKGYGFIEQQGGKDVFVHYSAINGSGRKTLVEGQKVT
MEVVQGEKGLQASDVTPLG
>Cag_0880 conserved hypothetical protein
MLHCMKIYLDVCCLNRPFDDQTQDKIHLESEAVLTIIRHLEKKDWEWISS
SVVLYEVQKIPNRDRKQRILRLCDKSSEVILLNKEIYRFAEILNKKGIAS
YDALHLACAHFANVDFFLSTDERLIKKAQKNIDIFNMVIDNPLYWLQTIW
>Cag_0062 conserved hypothetical protein
MFFGLQAMNFKKVTMSEPQNERFPEYFGRSVRAMSDYIGIGLQIAVSFAL
FVLGGYWVDARFGTSPLLLFVGVLLGMVGMVLVLMKVIRQANAKK
>Cag_0410 conserved hypothetical protein
MPTIEAICISHEKGVIKEQVPSALLRTNWGIEGDAHAGEWHRQVSILASE
SIERMKALMPEIAYGMFAENLVTTGVDVTALRVGDQLRIGSEVVLCITQI
GKECHNGACAIEQATGKCIMPTEGLFCRVLHGGTVQAGMLVEVMPQAL
>Cag_1934 Succinate dehydrogenase or fumarate reductase, flavoprotein subunit
MTMFDAKIPEGALAQKWSRYQASCKLVSPANKRKLDIIVVGTGLAGASAA
ASLGELGYNVKVFCYQDSPRRAHSIAAQGGINAAKNYQNDGDSVSRLFYD
TIKGGDYRAREANVYRLAEVSSRIIDVCVAQGVPFAREYGGLLANRSFGG
AQVSRTFYARGQTGQQLLLGAYSAMSRQIALGHVTLYNRRDVLDMVLVDG
KARGIIARNLVTGEIERHAAHAVVLASGGYGNVFFLSTNAMGSNVTPVWS
AYKKGAMFANPSFTQIHPTCLPQHAEFQSKLTLMSESLRNDGRIWVPKRL
QDAEKLRKREIMPSSIPDAERDYYLERRYPAFGNLVPRDVASRAAKERCD
AGFGVGRTGLAVFLDFADAIKRKGRDKIDSLYGNLFHMYEQIVAENPMET
PMMIYPAVHYTMGGLWVDYELMTTIPGLYAIGEANFSDHGANRLGASALM
QGLADGYFILPSTIGNYLSHEIHTPRFNPDSAEFTLAAEAVRDRLQRFLK
QGGNESVDSFHRRLGRIMWDYCGMARRESGLLQAQELLAALRHEYEHGVK
VLGGLNEYNPELEKACRVADFIELGELMVRDAQARNESCGGHFREEFQTS
EGEALRNDEAFSFVAAWEYKGKHAEATMHREALNFQEITPTQRCYK
>Cag_1367 hypothetical protein
MKLTLYIIYTIFFIAMGCGIYIGFQTADGLVDNNYYHNSTNYFQTKAREE
KLGIVINKPDTLTIGTNTFTVAVTSHGKPFEQGNISLLLGNVSTNNNDTT
LTMQETAPGIYQTTISIPYKGKWFTRLELYHQQQLITTKQWFFSVQ
>Cag_0861 conserved hypothetical protein
MINLAEALCHPEAYPHAPQSVEMVQTHCSWVFLAGAWAYKVKKPLDLGFL
DFSTLELRRHFCYEELRLNQRLCSTLYLSVVPIVAVRQQIKVIDKENNTD
EHWNEEENNEHGTIIDYAVKMVRFDRTQELDRLLAHHKLDVKQMEQLART
IAAFHNSLPAAPMDSALGHPDTIIKPMLHNFTLLEDIVVESEEQQELATL
HQATLSDHQRLYQRLLQRKADGFIRQCHGDLHTGNMVMWQGRITLFDCIE
FNPTLNTIDCISDLAFLFMDLRHSGETALAWRLLNGYLMETGDYHALALL
PFYERYRAMVRAKVTAIHASQSKDAPEVSSLMAEHRSYVAHATNCTKHNQ
PMLLIVCGLSGSGKSTLAASIASELPAIHLRSDVERKRLAGLRPLERSPK
SDLYSHSMTNNTYAHLLGLARFCLLEGYCVVVDATFLRQSNRALFTTLAN
ECNVPYRLLHCTAPKQVLMERVQLRNLEGNDASDADAEVVAMQLEQQEAL
TDDEKKITITIDTTHPINATALTGMYQLKREH
>Cag_1105 Bacteriochlorophyll/chlorophyll synthetase
MPIEPLSTIALIIRFLKPVTWIPVIWSFLCGAVASGAFGWHDILGAKFLL
GMLLTGPLASGTCQMLNDYFDRDLDEINEPNRPIPGGSISLKNATILISI
WAVLSVIVGYLIHPLIGLYVVIGIINAHLYSANPIKLKKRLWAGNTIVAV
SYLIIPWIAGEIAYNPQVTLASLQPSLIVAGFFTLSSIGTMTINDFKSIE
GDRQVGIRTLPVVFGEQRAATIAAVLINLGQLFAAIYMVLIGQPTYGIVV
AALILPQFFLQFPLIKEPAQWDVRYNAIAQNFLVAGMLVCAFAIQAARL
>Cag_1863 glycosyl transferase, group 1 family protein
MKPYKLLWFSEIQWDFLSTRKQRLLARFPDEWHILFIEPFTLGRKHHWLP
VKRGRVWVVTVPFLKTIPFRFGALLKRPLVRTLAGLPGIAIMHLWTLLLG
FSSSQRIIALSNPYWGKVASHLPCRFRCYDANDDHLAFPSTPSWLPDWLQ
RYLSTTSLVFSVSKELTARLPLSSSTKVVELGNGVEFNHFATPRQNKPSQ
LAALSGKILGYAGAMDWLDVDLLEKVAQTYHQYHLVLLGPAYEHGWMERQ
LGLQALPNVHYFGKIEYSELPAWVQAFSVALMPLVANPLKQVSHPNKLYE
YLATGVPVVAMNYCSAVEAAADVVHVAQSYEEFVQLVPIALADNRREARQ
AFAKQHSWDALAATMVHELQHAWQESAP
>Cag_0155 Pyruvate, phosphate dikinase
MPNPEKTNEQAAKKQYLYSFAGGASDGDASMKNLLGGKGANLAEMANIGL
PVPPGFTISTEVCTYYYDNQKNYPPNLFEQDIPDALSKIEYMLGKKFGDP
VNPLLVSVRSGARASMPGMMDTILNLGLNSQTVDGLAKRSGNPRFAWDCY
RRFVQMYGDVVLDLKPTNKKEIDPFEKILEEKKHERGITLDTEFTVEDLQ
DLVSRYKAAILEKTGRTFPEDPHEQLRGAIGAVFGSWNNERAIIYRKLNH
IPGYWGTACNVQAMVFGNMGEDCGTGVAFTRDAATGENIFYGEFLMNAQG
EDVVAGTRTPLKIEQLKANKPDIYDQLEEIRSSLERHYRDMMDIEFTIES
NTLFMLQCRVGKRTGMAAVKIAVDMFNEGLIDEKEALMRIEPEQLNQLLR
PVFNMKEKQAAISGGRLLATGLNAGPGAATGKIYFNAVDAYEASKKGEAV
VLVRIETSPEDIKGMAVAEGILTERGGMTSHAALVARQMGKVCVAGCGAL
QIDYQKGEMRVAGKDIVLSEGDYISIDGSTGEVIAGKVATKNSEILEVLI
DKTLKPEDASTWPIYKQLMKWADKYRKLKIRTNADQPDQADIAVTFGAEG
IGLCRTEHMFFGGERIDAMREMILADDITGRQKALEKLLPYQREDFYGLF
KCMGERPVTIRLLDPPLHEFLPHTDAEITALAAKIGKSYEEVNMRIEHLH
EFNPMLGLRGCRLGILHPEIAVMQVRAIIEAACQLKKEGLEVIPEIMVPL
VSTVKELEITSEIIHKTARSVFAEQGIGVQYLVGTMIEVPRAAITSDQIA
TVADFFSYGTNDLTQMGLGMSRDDSGQFLPTYQQQEIFARNPFESLDIDG
VGRLMSISAKDGRAVKADLKLGICGEHGGDPATVEFCHKLGLNYVSCSPF
RVPIARLAAARAALQD
>Cag_1204 Secretion protein HlyD
MNNIVKPVLFIALCSSLTLSGCGDKPEMGRDGAKNQEQPALVVQQLQPSD
AVVVSRYPALLEGKVTVEIRPQVDGVLRSILIDEGAFVKAGQPLFAIDDA
VYRAQYQGALASQHAAEARVVAAKLEVERLQPLVQNQVIAEVQLETARAN
YKAAQAAFDVAVAATRSAKVNLDYTVIKAPINGFVGKITLRQGSLVTKNQ
AAWLTMLSDVSEVYASFSISENELLRFRQQYGQSDGRLGSGSNNVPATLV
LADGTRYPQQGRLATLSGQFDALTGAMRVRAIFANPQALLRSGGTGSVEL
SSSYNNVILVPQSATVEMQDKVFVMRLMQGNKVQKQAITIAAKSGNNYVV
TGGIQAGDTIVLVGADRLQDGMVITPRYETSPAQSATINSAVKRP
>Cag_0199 glucokinase, putative
MSRWALGIDFGGTAIKAAVISEGQGLVEDCRVPTNSSAGPEAIFSQLAEL
IGAMYHKGCATCDAANFAGVGLGAPGVVDVERGVLKYPPNLHGWGLVPLR
EELQQRLQQEHGLQVQIHLDNDANVAAFGESRYGAGQPFPNFLMVTLGTG
VGGGIVLNRSIYRGSYGTAGEVGFMIVDVDSPHTHAGIHGTLEGMLGKKS
IVAMACSMMHNAATTSTMGNYCNNDFSRLSPRHIEYAAREGDAVALAVWE
RVGHLLGSALASVTALMDIRKFVIGGGISGAGSLIFEPARQQLLHSTHPS
MHEGLELVPAFLGNKAGMYGAASLCF
>Cag_0829 transcriptional modulator of MazE/toxin, MazF
MSLKRGDLVTVALQGAYGKPRPALVIQANLFAMHPSVTILPVTSELRQTP
FFRIDVEPSGMNNLKKRSQIMIDKVQSVPREKVGAAFGTLDDATMVTVNR
ALALWLGFA
>Cag_1925 sulfur oxidation protein SoxA
MKKRALYPLLASLLLFSPIVQPMAAKATDYQAKVNADVKEFQNYFAKQFP
DVKFADYANGVYAIDEDARQQWLEIEEFPPYIPAIEEGKALFNKPFANGK
SLGSCFPDGGAVRHNYPFYDEQRGKVVTLELAINECRIANGEKALDYKGG
DMAKISAYMAFISRGKPIDVKVPSEGAYKAYLKGKEFFYAKHGKLNMSCA
GCHMQYAGQRLRSEILSPALGHTTHFPVFRSKWGEIGTLHKRYGGCTNDT
GSKASAMQSEEFRNLEFFETIMSNGLQFNGPASRK
>Cag_1501 hypothetical protein
MTFFNLLFFDTRHHNSTMLKKSFSTMWSSLSLLFAGLWLVVRIILEYFGI
ISDGNDRTTGIKDLREEYKKANYR
>Cag_1032 hydrogenase/sulfur reductase, beta subunit
MTKILASDKLDNCFAEWRKAGRTVIAPVKQETSSRYALVEHAAECDFGML
LPDRALKELFFPQSEPMVRYTVKKQAVETVDFAPPTTQRIVFGARACDTS
GLAIDDPLFGWDYKDDYWFQRRNNTVIVTIACSHADDFCMCTSLKLSPDS
SKGADVLLRPLQNGKGWQVEEITDRGREAFQAIAPLLQESNEAAAPLANV
PVKFDLDKCVAWLQNPENFESKFWKEISMRCIGCGSCTFLCPTCHCFDIQ
DEGDLYNGIRRKNWDSCSFTLFTMHTSGHNPRSTQSTRWRQRIMHKFNYF
PGKFSINSCSGCGRCTRQCPVDMGITETLQAISTLQP
>Cag_0644 Proton-translocating NADH-quinone oxidoreductase, chain N
MFELPSGAEIQSIIATLKASVGFFIPEIYLSLLFMILIVVDLVVKGRKST
LLSVFSLVGLAGSLYFIYQQHAMQAGEFFFGMYVLDGFAIFFKYFFVLSG
MLAVVITMADEQFEREISSMGEYYALVVAMVVGMMMMASSSDLLMIFLSM
ELVSFTAFILAGYFKSNMRSSEAALKYLIYGAVSSGLMIYGFSLIYGVTA
QTNLIAISRELALHGADSFVMLFAALLVLAGFGYKIGAVPFHFWAPDVYE
GSPTPVTAYLSVASKAAGFALLMRFVYVALPHTNNVQAATLGIDWFTLLV
ILAVASMIYGNVVALWQKNVKRLLAYSSIAHAGYALLGVIVMDKLGTQAT
LFYLLSYLLMNFGAFFVVVLIANRTGSESLDDYRGLGKSMPLAGAALTVF
LISLVGLPPTIGFIGKLMVFSALIAKGSIFMWLALIGILTSVISLYYYML
IPLNMYLREPVQHASSNTGTQPRLVAQLFMGALMLLTIYFGLFFAPLSDF
ARYSASIFGLQLQ
>Cag_0064 H+-transporting two-sector ATPase, A subunit
MERSSIFQPKRFVKVFAVLLPLLFNGYSVSGANPTEAPVAQHTEVTANAA
ESAHVAEAGAAHGAEGHGEEKAGDVIMHHILDSNVIAFDPFGEVDLPKIV
VAGFDISITKHVVMLWVVSAIVIALFAAANGQYKTGSAMRAPKGIANALE
ALVDFIRLDVAKANIGKGYEKYLPYLLTVFFFILVCNLLGLVPYGATATG
NINVTLTLATFTFFLTQAAALKAHGIKGYLAHLTGGTHPALWIIMIPIEF
IGLFTKPVALTIRLFANMTAGHIVILSLIFISFILKSYIVAVVMSVPFSI
FIYLLELFVAFLQAFIFTMLSALFIGLASAHEGGHDEHEASAAHH
>Cag_1274 transcriptional regulator, XRE family
MFTMDDLGQALRERRKLLGLSQLEVAKRNNISRITINRLENGRLPELGIR
KIMMLCLSLGLELTLKEASPRPTLNDLIREQESHYA
>Cag_1875 TPR repeat
MSIQFVKHNPAFLDAGHFLEQVVARRADVAHLLGHLGSCEPIGTVRHLFI
TGQRGSGKTFVVRRVALAVEQHNALRSRYYPLFFSEESYSVSSSAEFWLE
ALFHLARQTANEQFAQTYQALREEVDEERIRQIVLPLLLDFADNQGKTLL
LIIENFSMLLADMASSREGEVLAQTLLQEPRFQLLATGTFTFDSLEVPFG
KYFSSITHHALEPLSNADCNALWQLYSGTPLADGQIRAMNILAGGNARLL
VTLARVANGRTFSQLPEILALALDEHTEYCKSYLDVMAPVERKVYLSIAE
LWAMVSAREVSLAARIDINKTSAYLNRLINRGAISVERQAKRNKLYGVTE
RIYSIYYLMRRHGWQSGRVRALLDCMLAFYDPASFPDRLSDSERERCTSV
AEALTMLPEREHVEQCHQNFLDAKRHSRFVPSLAYFVRFVQKSDVSHADE
SSSPMLGESFRQAFELLETESYTEALPIFDAIIMVSRHSESEQAIGQRYG
AMIGRGVALGNLERYEEGFRLLDEVAATCQERSVRRRLKWGLLALLGKAS
VLERAGRIDEAVTLYDELVSRYRRQQELECSTLVAAALLHKSLLVSKSKG
GEEEIAMCDTLLELYSERVELPLVELVCAAWRNKAIAFEALNRNDDALLA
YGKLLALCRQRSEPHMMQHTAHALQNMGVVYGKMHRYSDAEHCFMEVQSL
APQQARAHLMLLKLLVKMEEQQHAVLSELRNYLAATSLALRALPQTIELF
ITCAVAGYAAEALELLVASPLAVSLEPVQAALQHATGDEVRSAPTIVEVA
NDIVAAIEARRNA
>Cag_0571 cytidylate kinase
MYRSVTLKALQQEVLDDIRRTPELVSTLLQELQIAFDGDRVVLNGTDVSS
AIRDNRVSREVSFISSLKPVRDCLRDQQQALGRQGSVVMDGRDIGTVVFP
NAELKIFLIANARERAKRRYAELLAKLPAGSTMPTLEELEAEIRQRDLDD
ETRQHAPLRKHPDAHEIDTSGLTISEQVERVCQLAQALNNLP
>Cag_0938 Phosphoribosylformylglycinamidine synthetase PurS
MPYIAKIKVTLRQSILDVQGKAVDHALKNLGYRTVESARIGKYIEVTIHE
SERSEAERVAHEISEKLLSNPVMENYSLELEPLA
>Cag_0320 3-deoxy-D-manno-octulosonate cytidylyltransferase
MNAIIIIPARLGSTRLPEKMLADIEGEPLIVRTWRQAMQCCRASRVVVAT
DSVKIAEVLTTYGAEVVMTSPEARCGSERIAEAARQFACDVVVNLQGDEP
LISHETIDLALEPFFSPNPPDCSTLVFPLQPDDWAQLHDPNQVKVVLNRE
GYALYFSRSPIPFQRNQLTSTQCYRHVGLYAFKAEVLQCFAALPPTMLEE
AESLEQLRLLEHGYRIRCMVTHDDQPGVNTAEDLELVRTLFKQRHQEA
>Cag_1453 TPR repeat
MSDATEYQENLREAERFFHYNKHGFLFAVSNDELVQRNLNSSLQQLLRGK
GKTLLTYTWDTNPEALHPVKQLRQFQQNHLELNGLILNGLEPALEHNPNF
LVQLNFGREGLAELRIPLLFWVSNRTLQRVNREALDLYNQRVSANLYFEH
DPTLQQSDNSALRYIAQETVRANKSLAGVEERMKLLQQQLDEAEKQHVEP
KTIANEIVLELLELYSQILGAEPLIHTLLNKYEAFIDRENPENCFKLARV
LYEIGMRTEAMDLYQKALQTLRELAVRNPDIYLPHVATTLNNLGALQYTT
NDYTAALASFTEALTIRRELAAKNPDVYLPDVADTLNNLGALQSDMNDYA
AALASFTEALTLYRELAVKNPDVYLLYVAGTLNNLGILQYNTNDYAAALA
SYNEALTLYRELAAKNPDVYLPDVATTLYNLGNLQYNTNDYAAALASYNE
ALTLYRELATKNPDVYLPDVAMTLSNLGNLQYTTNDYAAALASYTEALTI
RRELAVKNPDVYLPDVATTLNNLGALQSETNDYAAALTSFTEALTLYREF
ATKNPDVYLPYVAGSLINMAVWYYKAPQANQQQSLAFVTEALHIALSLVE
KIPNVQLNIDSAYNLLRAWGIEPEEFVKQVGTSGASGGE
>Cag_1786 conserved hypothetical protein
MSETFRQLLDESIQLELNLAKLYTAYNDLFEEDEDFWWDLAMEERGHASL
LQHEKNSPQQEPFFPENLLANDLDVLKAANKRILDLVAACKSTPPSRLEA
LRTAYELETSAGESHFQRFMESPASSFAANIFQQLNQGDRDHAERIQQYI
DELE
>Cag_1851 50S ribosomal protein L3
MGAILGKKIGMTRLFNEKREAVSCTVIEAGSCFVTQIKRTDKEGYDAYQV
SYGERKEQKVSKPLQGHYKKAGVTPGYTLVEFSFAELNQELAVGSQVSVE
GFNVGEKVNVLGVSKGKGFAGVVKRYNFGGGSRTHGQSDRLRAPGSVGGS
SDPSRVFKGTRMAGRKGSANITVKNLEIFKVMPESNLIVVKGSIPGPNNS
YVKIISSKK
>Cag_0276 conserved hypothetical protein
MFISKRFMKRKVYIETSVISYVTARPSKTILGAAHQQLTLAWWETRSQYD
LVVSELVLRECGAGNPDAAKKRLTVLHDVPLILITEQALKIANSLIEKGI
VLAKAAEDALHIAIATVHGVDYLLT
>Cag_1085 conserved hypothetical protein
MKRQPHTLPLSTTPITPCNTLHTLYTVAYWLTQQSNEAEELVRAAYQHRQ
GDTTLHSLLDALRHHAHRVGATTNPLEAEALDIRFTVLLADVARLKHSEI
AAIVQQPLAVVRYWLLIGRQQLASEQTLRASA
>Cag_1561 zinc protease, putative
MNQIATTSFPPSYTLQVNARAKYPRLKMVPHKGLVVVIPVGFSKKHIPDL
LKQHEEWIRKIEHHFEAHRQTAEEAFEVLPTTITFSTFNEAWQLNYHQAA
RNSVRLTMQGEGQLLLSGNIGETALCRQVLTKWLNRRADVLLSPRLTQLA
ASVGMCFSTTTMRCQQSRWGSCSSKGAITLNSKLLFLPEELVRHVMLHEL
CHTLHMNHSAAFWAEVARFDPQWKHHKREMKDAWKFVPRWLTAL
>Cag_1912 TPR repeat
MKKHILPFLALPFVLLNACASKQELNVVQYDVTRLKSEASNLKNEAQAIK
SQTAVSYADMQQVRNDIARLNGSLEEVSHRITEQTNKNNNVFKRLGTEDS
LLVHQLSGLETKAAVLEKKLATLDSRLLALEGIVGTGTEALRKDSVATTS
IKPAVASTLAVEPTPATVNDASMFQEGVTLFGKKNYGAARQTFMALIKRF
PTSLLVGDAQFYIADSFFSEKRYEQAIVEYQEVIAKYPKNSKRPAALYRQ
ARSFELIGDVANAKTRYKDVVNVYPTSPEAALAKKKL
>Cag_1402 DNA modification methylase-like
MKFPDDYINTIICADSLTVMEQMPDKCIDIAVTSPPYNLKNSTGNGMKAN
TKSGKWAGNALQNGYSHYNDNIPNDEYAEWQYNCLKAMYRLLKDDGAIFY
NHKWRVQNGLIQDRTDIIRDLPVRQIIIWKRKGGINFNPGYFLPTYEVIY
LIAKPSFKLLPKANAYGDVWEFTQEMKNNHPAPFPVALIDRIISSTSAQI
ILDPFMGSGTTAVAALQLQRNYIGIDISPDYCEMAKERILNLNPAKRFIK
KNGLETISLFEKIV
>Cag_1216 conserved hypothetical protein
MQQKLLLISTIGPEHPEKATLPFVIATAAQALDVKVVMFLQSNGVILAKK
GEAETIAAPGLVPMKELLDTFLEMGGTLMLCSPCLKERYITPNDLIEGAE
IGAAGTLVSEIMSATSVVTY
>Cag_0532 conserved hypothetical protein
MHPETAQIVHAIESLKQAPNFIKDYIFPIANAFFTSLLGAGIAYFTLRYQ
EVIQIEKDKMDTVNKWTLLVDEARSSLLAIKSNYHGNLTDSPIQRALAIR
TVLFTATPINEEYLHLFFLIPKATEKKCEYQKWSQISIIRTMVLNYNNLL
KLWIKRNEVERPIKEKLLQKYSQNAYADVNTDQIIECIGAANFASLVDLT
EHVIKLTDDLLVEFDNFLHEFPNYAKTLINTKRLKRYGSILTYSNNSNKK
LLSMLEKSPTSNYESVIKLFGMTVEQLREKYKTGYEL
>Cag_0046 chorismate mutase, putative
MSTDSSTKNAAHWQELEAWRKKIDAIDRQLAGLLCQRLECARNIISLKAQ
IGEEVLQPEREKEVLNNVLSQADSPITEKALSAIYQAIIEESRKFQHEWK
NSSQQLER
>Cag_1554 conserved hypothetical protein
MAENNLQAWEKVLEYASVPLHGTMSRKIRKGVKLQINGGDVYEDAVLFIS
DLFLRVTQESDGASINTYYDMKAIASIRTYSTKE
>Cag_1079 ATPase
MTQHALRFTNVLAGYRHHAVLQEVSFSIFEGEFVSLIGPNGCGKSTLLKT
ASALLKPQKGKVELFGSDVGTLKPARRAALLGVVPQKLETPMAFSVQDIV
MNGRISALGRWGRASATDYDLVERAMIYTDVISFRDRFFTELSGGEQQRV
VLAMVLAQEPKIIMLDESISHLDINHRQEVLQILMKLNREQRITILLVSH
DLTLSAAISDRFILMEKGRVVKVGTPAEVLEPALLSHVYNCDLQVQHDLY
TGNLQVFGAMEMVRRRTALGKTLHVIAGGGSGIELYRRLLIEGYEVTTGV
LNRLDSDAEAARALAIPMVLDQPFSPISDAAIAEARSMVECADALIVTAV
PFGSGNLINLTLAEAALRAGKAVYLADNIAERDYTDGKAAHQQAMQLVAQ
GARFWRTIPELLAHLQQHHNR
>Cag_1585 Adenylylsulfate reductase, alpha subunit
MGEKNEFKYCKKPEVVHVDTDILLIGGGMACCGAAYEAAKWATPKGLKIT
MVDKASVDRSGAVAQGLSAINTYCGENDPADYVKYVRTDLMGIIREDLVY
DLGRHVDDSVHLFEEWGLPVWKRDEDGSSMDGSKPAPKLTEGGKPVRSGR
WQIMINGESYKVIVAEAAKKALEYNRKATGVDQNLFERVFISELIFDKND
PSKVAGAIGFSVREHKAYVFTAKTMLLACGGAVNIYRPRSTAEGQGRAWY
PIWNAGTTYALAAQAGCELVLMENRFVPARFKDGYGPVGAWFLFFKCKAT
NSLGEDYCATNLATANKDFGKYAEDPHKLTTAMRNHMMMIDMKAGKGPIL
MRTHEAMAALGESMTPKQMKHLEAEAWEDFLDMCIGQAVVWAGNNIEPDK
SPSELMPTEPYLLGSHAGCAGIWVSGPGDISGIPSEWSWGYNRMTTVEGL
FTAGDGVGASGHKFSSGSHAEGRIAGKAMTAYCLDHADYKPTLGRDVDEV
LGEIYAPMETFGKYKDYSTDPNVNPNYIRPKMFQARLQKIMDEYVAGVAT
WYTTSKTMLDKGLEHLSLLKEDAEKMAAEDVHELMRCWENYHRVIAGEAH
ARHILFREDSRYPGYYFRADHFYVDDENWKCFTISKYDRESKNWTLSKRD
YVQIIPD
>Cag_1556 conserved hypothetical protein
MESTVRPLGTVMQVLEELGHKVTYAYDDLVFTEHNDFLLQFTNHAPELSL
FFNTSCPRQQAEKVEQQLIPAADRVGLSVITKGRYSVTGNEDEENLKIEF
FNN
>Cag_0339 Endoglucanase Y-like
MRRFWTFLLISCAPFFLSSCSSVPQNPEKISTEAWHYYRDTFIKNGRVIR
PQNNNDTVSEGQAYTMVRAVLMKDRKTFDECLAWSEKVLSRKNSDGDYLL
AWHYRDGKVTDTTAASDADIDYAFSLIVASKIWQAPRYLELAKEVLASIL
EAETTRHQGRLYLLPWPANKNKPGDLLAQNLSYYAPSHFKLFYETTSDPR
WLELVDTTYYLLGRLLHPGELPEGPIVPDWIAINDAGAFVHLPGKDVRYG
WDAVRVPMRIAADYHLYGDKRAFEVLSWLAVSFEEEFRQQSKFLLQRDST
LQVRNNALFYSAMYASLEATESPSAPKLLQRIRKFIRQEKQGLFYNHPDD
YYINSLCWITEYYEQNKKHLQARSKKVALPLQTHESTANTASLSLQAP
>Cag_1784 Alpha-glucan phosphorylase
MQKKIIALRELSKNLWWSWDTETKQIFEELSPLLWECTNHNPVELLRQIS
DDELMARSDCSFGEKIDKAHARFLAYMNDQNTWATNHVPDLVKKPVAYFS
AEFGIHESIQIYSGGLGVLAGDHLKSASDLGINFIGVTLFYREGYFHQHL
NHDGWQIEDYPLQSPEDLPLTKVTLDDGKDLIISINIAQSEIFAQVWQLK
VGRATLLLLDTNLAVNEAPYRDLCCRVYGGDQTTRINQEILLGIGGVKVL
KHMGIDPAVYHMNEGHSAFLTVELLANEIADGYSEAEAIERVKAKCVFTT
HTPVPAGHDRFSRDMIHYALNKYLFKIGIPLDNLMLLGSEDKTNVHGLFT
MTVLALNLSRSANGVSKLHGAVSREMWQSLYPGKSVDEVPIGHITNGVHT
RTWSFGATEMFWLKFVDSLDTMTLSREQAEAVLATVTDKDLWSLRYNLKR
ILIDYIDRYLLNRLARQSHGQGSFSADYYACHSRKNTFSPDVLTIGFARR
FATYKRAPLIFRDLERLDKIINHPIRPLQIVFAGKAHPHDDAGKDYIRQI
IHHARNPQFSGKVVFLENYNLGVAKRMVAGVDVWLNTPVRPMEASGTSGE
KTVLHGGLNFSVLDGWWPEAYNGENGFAIGNGESFENHEEQDSFDAEQLY
SVLENDIVPTFYDRNEHNIPVAWIQKIRNAIATIAPQYNTHRMVRDYALQ
YYVSK
>Cag_0612 glucose-6-phosphate isomerase
MASRVESAAWRALQAHQQEIAPLQMRQLFAEDSARFERFSLRWQGMLCDY
SKHRITARTMELLLDLARSAEIEAARNAMMQGQPINLTEKRAVLHTALRK
PQSVPFAVDGQNVAADVAEVLAQMEGCCNRIISGEWVGYTGKAMTDIVNI
GIGGSDLGPYMVTEALRPFAHGKLNVHFVSNVDGTHLSETLKKLNPETTL
FIVASKTFTTQETLSNAMSARAWFLQCAVDAAHIAKHFVAVSTNRAKVVE
FGIDEANMFRFWDWVGGRYSLWSSIGLSIAMYLGFEKFRQLLAGAYAMDE
HFCSAPLESNMPVIMAMLGIWYSNFFGCTSHAVIPYDQYLHRFPAYLQQL
DMESNGKRVTRDGTVVDYATGAVIWGEPGTNSQHAFFQLLHQGTQIIPAD
FILPLKTQNPIGAHHDMLASNCFAQTEALMKGKSEEEVERELVAAGADAA
TIAALLPHKVFPGNRPTTTLLLDELNPFTLGSLIALYEHKVFVQGIVWNI
NSFDQWGVELGKQLANVILPELQASSNAAAHDSSTNALINAYRAQRYC
>Cag_1739 Uncharacterized protein involved in tolerance to divalent cations-like
MDSATYHCMVITTLPNRPQAEQLAELLLTEHVAACIQMVDIRSIYLWQTE
LCNEPEVLLLIKTTESAYPNLEGIITQNHPYEIPEIIKLPIHGGSTNYLN
WLTAMTTGCSTNKENARPNNSTPTS
>Cag_0585 conserved hypothetical protein
MNVVYSAEAVDDLVRLREFIAVHNPQAAHRISNELVSRIEQLCAFPEMGK
QIPQFPTPSIRDFIFGNYIVRYAIHSDAITILKIWHHYENRIK
>Cag_1318 hypothetical protein
MLQAKVSLSPPLYEFLKNYKDFGFKDKSSMVQNALERLKNELEVLKMQQS
AQWYAELYDQDLETQELTESAITGWPE
>Cag_0050 UDP-N-acetylmuramyl-tripeptide synthetase
MTAMNCIPCCNNLPSLTVGAFITALQAVEVTEQAGACHMEHCITAVSSDS
RDMIEGALFVAVRGFCTDGHRFIESAIERGACIIVCEELPQSITHGCLYL
RVIDARKALAHVAALFYGNPAKQLRFIGVTGTNGKTTTSRIITEMLTAFA
VPSGYIGTNLCRIGKRDIPLERTTPEAHELHALFALMVEAGCKAVVMEVS
SHALLLQRVYGLRFDAALFTNLTQDHLDFHHTMQAYAEAKQLLFDQLQPD
GVAIVNTDDAYAPLMLQRVAPSQRVCCTLQANVPSVLQCPQYGQDFQAEV
RHASLAATEMELRFPKGETELLTVGLAGSYNVMNVLQAAATGYCLGLQPA
AICRALAAVHSVDGRMERVGRPNLPYQIFVDYAHTPDALQKALETLKALK
GEEARLMVLFGCGGNRDRLKRPIMGSIAASLADVVILTSDNPRDESPDAI
LDEIEQGMQGAAHQRIADRAEAIRTAISQLQAGDVLLIAGKGHERYQEVA
GKRTYFSDQEEVGRCL
>Cag_0818 hypothetical protein
MSKRVVTDTMAIVLRLEQRKLPQQVRNIFLKAEQEECTIIIPTMVFAEIG
YLSERGRIDVTLDDVRTYCTQHPNIVESALTQAIVAHSFTINDIPELHDR
LIAGTASYQQLPLLTNDPIITQSQHLTVIW
>Cag_0728 Fructose-1,6-bisphosphate aldolase, class II
MQKKKISYKELGLVNSREMFSKAVAGGYAIPAYNFNNLEQMQAIIQACVE
TQSPVILQVSKGARSYANETLLRYLAQGAVAYAEELGSSIPIVLHLDHGD
SLELCKDCIDSGFSSVMIDGSHLSYEENVALTKQVVEYAHQYDVTVEGEL
GVLAGIEDEVSATHHTYTQPEEVEDFVAKTGVDSLAISIGTSHGAFKFKP
GEDPKIRLDILAEIEKRIPGFPIVLHGSSSVPQDLVATINQNGGKLKDAI
GISEEQLRLAAKSAVCKINIDSDGRLAMTAAIRKVLAEKPEEFDPRKYLG
PARDSLKKIYIHKNINVLGSNGKA
>Cag_0337 ATPase
MSALLELVDVQRSYTIGESEVAALRGVSLTIERGEFVAIMGASGSGKSSL
LHILGLLDTPDKGSYRIAGSDVATLSDDARASLRNHVAGFVFQQFHLLRR
MSIEDNVRLPMLYSGVERSSNLQAEAIRRLEMVGLAHRLHHTPSQLSGGE
QQRVAIARALIRQPAIIFADEPTGNLDTRNSLEIMKILQGLHEEGKTIVM
VTHEPDIAAYADRIITMRDGVIISDERKATATVLPSSDSPFVLPTTAIAA
WWHYKRLSGFVAQAMQAILANKMRSLLSMLGILVGVASVIAMMALGEGAR
VAMQEELKAMGSNMLSVRGGSAKIRGTSQGAGAVTRFTVKDVEAIAALRP
LVRNASGVVNGSARVVYGNRNWSSSLLGAGYDYGTMRAALPTVGRWFTAE
ELQRRDKVAIIGVTVARELFGESNPLGKTIKINRINFTVIGLAPAKGFVG
PQDEDDIVMIPLSTAMYRVLGRDYLSGIFVEVAEAHRVDEARQRIAAFIR
QRHRLPVDDDSFYIRDMTEIQKMLSSTTRTMSLLLGAIAAISLVVGGIGI
MNIMLVSVTERTREIGLRKALGARNSDIMLQFLVESAGMTLLGGVLGLLV
GIGVALGLTLVAGWAVKISLFSVLLATLFSAVTGLFFGLWPAQKAAALKP
VEALRYE
>Cag_0945 conserved hypothetical protein
MKFPHCDYAPSEDGFYSQCASCPIDASLAGCALEECGDGVYQSEGETPHG
GSKALFYFTDNDGNRVAKRQAHHVEVHEMNAENQIIAVLYGMVDPEGIIY
LKKSSASSSNN
>Cag_1214 conserved hypothetical protein
MKITEAPNCPHCGATMQKCAPPPFNFGDGLGWCTPYMYVCFNDECKLYAN
GWNNLKNNFNKTASYRCICYPDNGVFEAMCVFSPDGMKGQIIEE
>Cag_0325 conserved hypothetical protein
MTVMNFSTANQTYRQLMGNGLIYSVPRFQRDYSWTDQEWEDLWADILELL
TSDGEQAHYMGYLVLQSKDSKNFDIVDGQQRLTTLSILALSVLAHLARLV
EKNFDADNNRLRQEQLRNSFIGYLDPVSLVPRSKLNLNRNNNAFFQNYLV
PLQKLPQRGLKATEHLMRKAFEWFNERIKQEYGNRKDGAAIAGILDSLAD
KLFFTVITVTDELNAYKVFETLNARGVRLSSTDLLKNYLFSVVHHENNNE
YELNSLDERWEILVSKLGSESFPDFLRTHWNSHHRFVRHADLFKTIRANV
LNRKHVFELIRSMETDADFYVALSSHEDQLWNPTEQKFIEELRMFNVRQL
YPLLMAAWRSCERQDFTDILRACAIISFRYNVIGNLPTHEQERVYTNVAE
QISKEKLTSFYDILIAMRSIYPNDEAFSNAFSDKQLKTTQSRNKRIVRYI
LFKLETLLSGTEYDFDSDRYNIEHILPEHPENNWEQFSDRDYEQSVYRLG
NMTIMNTAANRDIGNTSYEEKRPRYAESEMMLTKKIAEENQSWTIERIGT
RQRSMAKQAKTIWRISQLD
>Cag_0917 cytochrome c-555
MSRFVSAAFCALFIVSMGSTAMAATYNAAAGKKVYDASCATCHKTGMMGA
PKLGDKANWAPRIAQGMDKLVSKSVAGYKGAKGMMPAKGGNAKLTKPEVG
NAVAYMVQQSK
>Cag_0655 hypothetical protein
MTMLREIIKPTTDFYSVHIPKKYINQEVEILVLPFSYKNRQEIEDNVSCD
VFSKTSGILKPKNIDPLQWQEEIRNDREI
>Cag_0031 hypothetical protein
MNETIHHQILEQKPSEKINLVTIILESLDKPEPEIQKIWVDESQKRFDAF
KAGKIKLYTY
>Cag_0997 conserved hypothetical protein
MFQSTRPRRARQRGRRKIRLKKVSIHAPAKGATFHAYLYVSVPMFQSTRP
RRARLSSMRYAYELGLVSIHAPAKGATAQKLRLFYVERVSIHAPAKGATC
FVIISNHLILPFQSTRPRRARLIDIEEEGKQ
>Cag_0524 transcriptional regulators, TraR/DksA family
MVKKHEEELLSNEVEGTNEPMLTKTYLSDEELEHFRQLLMKRRDEVMRDL
DILRSSLSDANVEESVNSNYSLHMADHGTDTMDREQRFMFIARDEKYITY
IDQALDRIRNKTYGICLKSGKPIPKKRLEAVPHTSVRIEFKQGKK
>Cag_1516 Glutamyl-tRNA reductase
MNIISVGVNHKTAPIEIRERISLSEVQNKEFITGLVSSGLAHEAMVLSTC
NRTELYVVPAMHEVTGEFLKEYLISYRDARKEVRPEHFFSRFYCSTARHL
FEVASAIDSLVLGEGQILGQVKNAYRIAAEAQCAGIMITRLCHTAFSVAK
KVKTKTRIMEGAVSVSYAAVELAQKIFSNLSLKKVLLVGAGETGELAAKH
MFAKNARNIVITNRTISKAEALAEELGTNQVLPYESYKEHLHEFDIIITA
VSTKDYALTEADMQPAMARRKLKPVIILDLGLPRNVDPNISKLQNMFLKD
IDDLKHIIDKNLEKRRRELPKVHAIIEEELIAFGQWINTLKVRPTIVDLQ
SKFIEIKEKELERYRYKVSEEELQRMEHLTDRILKKILHHPIKMLKAPID
TTDSIPSRVNLVRNVFDLEEPNQSHF
>Cag_0188 conserved hypothetical protein
MNADNFRLQLGSKEYVPIIIGGMGVNISTAELALEAARLGGIGHISDAVV
TYICDQLFKTSYVSRKRKQYAAYSNSPDKSAVLFNLEELAEAQKKYIENT
IARKKGDGAIFLNCMEKLTMNNSAETLKVRLSAAMDAGIDGLTLAAGLNI
RTLDLISDHPRFRDVKIGIIISSVRALSIFLKRAVRLNRLPDYIIVEGPL
AGGHLGFGADNWHMFDLKTIFTEVVDFLAKEDLHIPVIPAGGIFTGTDAV
EYLQMGAGAVQVATRFTISKEAGLPCDVKQHYLNAREEDIVVNMASTSGY
PMRMLVQSPTLRYAIRPNCEGLGYLLENGGKCSYIDAYYEALENRKEGEP
LSVKIKTCLCTGMANYDCWTCGQTTYRLKETTNRLPDGKWQLPSAEDIFL
DYQFSSDHAIQRPKPEA
>Cag_0877 hypothetical protein
MIKTAVFVEGQAELIFTRELILKFFEYKNIWVECYTLFNDQELNPTEYSY
KSDTVNFYFQILNIGNDNKVLSSILKREKYLFGGDKAFHKIVGLRDMYSK
EYRDIVKKSTIDSMLNQKFIENHNNTILEKSRYHKKISFHFAIMELEAWL
LGIQGLFEKMDHRLTNEKIAEACRIDLFKADPETAVFHPAHLINDILHII
GASYTKKKDEINKFMSYIERDDFARLLNSEKCQSFTSYCNALVIK
>Cag_1081 D-alanyl-D-alanine carboxypeptidase
MLRIALIPMLRHHFTSAFRYRHRSTIQHSSLFVATLLLLLLQALPLFARS
NDELLVEQSHDRISAYMLKEYGTPSMLMGKNISTPLPPASLTKVLTSIMA
IESGRLLQDVVITRESTLVEPSKAGFTVGERIKLIDLVKAAMVSSSNDAA
FAIGIYLSGSVDAFVDAMNYKARQIGMRNSHFTNPAGYDRGQYAGNVSTA
EDLMRLTEYAVRNSTFNQIARMDRAVFVEQSTRKVYNLRTHNKLLAHYPH
SVGIKTGWTTRAGGCLIARAVKGDKDLLVVMLNAKISRWDTAASMFDLAF
NDRLPTSQFVASNGGQNLEQSERVIKGEQAALLAAAATPALLHAGGKALQ
AKQSGVVSKLSKEKKLSRKDRLALKKQKGKLSKKERLALKKKQKLSKKEK
LALSKKEKKLSRKERLALKKQKGKLSKKERLALKKKQKFSKKEKVAQTKQ
MRKAKRNELLANKVDKKKSKKEF
>Cag_1972 glycosyl hydrolase, family 3
MRRRLRMVALVLLWLCMAAPLAMAAASPDSLSIKMGQMVMVGFRGTSLAE
SPQIVAAIQKRHIGGVVLFDYDVPFASPTRNITSPSQLARLTQELQEHSA
IPLFIAIDQEGGKVNRLKASRGFPVTISAAKLGALNQPDSTRSAARQIAK
TLRAMHVNMNFAPVADLNRNPNNPVIGKVERSFSADAARATTHIRLTADT
YRAEGIIPTLKHFPGHGSSTTDTHLDFTDVSNSWSKEELEPYRSLIADGY
EDAIMTAHVFNATLDPTYPATLSKPTLDGVLRQQLGFKGVIISDDMQMGA
IAAHYGLESAIRLALNAGVDILLFGNNTAYDEAIAEKALAIIHALIERGE
IQPSRIEESYRRIMALKQRYGVVR
>Cag_0550 conserved hypothetical protein
MARHKSQRKGIRLDMTPMVDVAFLLLTFFMLAARFRPPETLSVTPPASHS
TQSLPDADLLTITVSRNHALYLSLSSKRDREALFNRTIRPRLQARSVSHS
AIADSLRHFRISEQMPLQANELGQLIAHAKAANPELQAVIRADGEAALAP
VNEIMQAFRRAGITTFHLVTMPSKEAR
>Cag_0195 sodium:solute symporter family protein
MQPLDTALVLLFLVANIAFGLWQSKSNKSTGDYFLGGHSVPWIVAMLSIV
ATETSVLTFVSVPGLAYRGDWSFLQLPLGYIVGRVLVSMFLLPLYFREGV
SSIYEIIGRRFGTGMQKLASVAFLITRILGDGVRFLATGVVVQAVTGWSL
PLSIVLIGVVTLIYTISGGLKSVVWLDSFQFGLYFLGGVISISYLLQQLD
APFPTLFATLHEAGKLQVFQFSNDLLVNPMAFGAAFLGGVFLSFASHGVD
FMMVQRVLGCRSLSNARKAMIASGFFVFFQFAIFLLAGSLMFLFMEGREV
EKDREFAFFIVHHLPTGLKGILLAGILSAAMSTIASSINSLAASTVTDMA
GGKVSLTGSRWISFGWSLVLIAIALLFDESNKAIIMVGLEIASFTYGGLL
SLFLLSRSSRAFHPVSLAVGFLASMAVVVLLKYVGLAWSWYILLSVLLNV
LLVYGIDIVTNTISPKRL
>Cag_1825 Ribosomal protein S4
MARFRGSITKVSRRLGIALAPKAEKYLERRPFPPGQHGQGRKGKISEYAL
QLREKQKMKYLYGILEKQFRNYYKKAVAQRGVTGDNLVKLLERRLDNIVF
RAGFSPSRAGARQLVSHGHVTVNGKKVDIPSYLVSPADVMEFRQKSKNMS
AVTDSLNKTPESRIPAWIQVDKAHQKAVFLSVPERADIQEEFNEQLVVEL
YSK
>Cag_1205 Na+/H+ anti-porter
MQMHIALPLQNPVLQFSLLLFIILFAPLLFRRFKLPGIISLIIAGALIGP
HGFNLMLRDSSIVLFGTVGLLYIMFLAGLEIDVADFRKNSSKSILFGLYT
FFISIILGIGVGYYILNFPFLSSILIGSIFASHTLIIYPTVSKLGITKNK
AVTLAIGGTIVTDTLALLLLAGVVGLASGELNTNFWLRLLLGVLLFATVI
LWGLPIVARWFFKRFDDSVLQYLFVLALVFFSAFLAEAAGVEPIIGAFFG
GIALNRLIPRTSSLMNRIKFVGNALFIPFFLIGVGMLIDIHAFFKGYETI
LVAFVITVSATVAKYSAAWIAQKMYGFSIEQRRLIFGLISAHAAVALATV
MIGYGVIIGKDVTGQPIRLLDESVLNGTILFVLLTCTLATFVGEKGAYAL
AGQQELAEPEHLLPSDAPTNRFLLHINHMGTVKELVNVSGMMVTGKMPYS
LYGAYIATTREMEVNHEYRAEKMLERAVISAGAADRHVTPLLRYDTSIAN
GLAGVICEQKISDLFVCVEPSSDFPDTLSINKPDNILGRCGVTTFIYRPT
QPFATIKRTIAVVPDGAEADPGFRHWLERLHGLIQHSRSLLMVYASTQTT
DAIKATEIMPIIVEYKRFRGWENFTLLSQIIRKDDLMVLVMSRRNNPAWH
HRMATIPAMLHRYFQQNSYLLIYPLQNNMSTALEEENAVTTYTQHLGNVA
KVSGWVKQLVRGKQSS
>Cag_1709 conserved hypothetical protein
MNGYRKRVVDDELNELIAALPAIALEGAKGVGKTATAEGRCRTIFRFDDP
AQRAIAEADMGVVLNQDTPLLIDEWQRVPSVWDAVRRAVDRDQTSGRFLL
TGSASPTTPPTHSGAGRIVTLRMRPMSLAERGVGVPTVSLRELLFGHRPE
IVGKTNIALADYVREIVHSGFPGIRPLSGRALRLQLDGYLRRIIDTDFPE
QGYMVRRPEVLRRWLAAYAAATATTASLETIRDAAFGGDKEKPSKTTTQP
YREILERLWIVDPLPAWLPSRNHLKRLAQPPKHHLADPALAVRMLGLDES
ALLAGEESMLSIPRDGSLLGHLFESLVTLGIRVFAQAAEAHVSHLRLHGG
RQEVDLIVERGDQRVIAIEVKLSSTIKESDVRHLLWLREQLGDELLDMLV
IHTGTQAYRRPDGIAVIPAALLGA
>Cag_0184 conserved hypothetical protein
MEQQKLRELLQALHQELEQLQSVDESTTAVLTTLRNDTQRLLSNKEEPME
EEEGSLSERMQQALEHFEEKHPSLSISIQHVLDSLARMGL
>Cag_0959 Heavy metal translocating P-type ATPase
MSTETYSVKGMHCASCAAIIEKKLSKVDGISKASVNLATETAQLSFDKES
LSTEALNNILNNYGFSLERETAPEESRTNPTVKLQAKQAAKVEELRQQRQ
NVQRTLPLALLLFFWMVWESAAMSLHFIPHPPLLMHQFTLPLLVAATWVL
LTVGKPFLRAIPSFIRNGAANMDTLIGIGTLCAYLYSATITLLPQAKSFL
HASEATYFDVTIVVIGFVLLGKYLEARSKMKTGEALEALITLQAKSALVE
RDGITVEIPIGQVRRGDTIMVRPGEKIPVDGIITDGTSAVDEAMVTGESL
PVDKKAGDGVIGGTMNKQGSLTFIAAKVGSDTMLARIIRMVEDAQSSKAP
IQQLADKAAAIFVPTVLILAALTCILWLTLGTLWLGFSTALSYAIMGTVG
ILVVACPCALGLATPTAIIVGIGKGAAEGILIRNAESLEKLSRVNTVVFD
KTGTITSGKPSLHAVVPFDGKSTSNTLLTLAAMLESRSEHPLAQAIVEAA
RLQQIAFGEVTNFSAQEGVGVTGMVDGRAITIRKPQQNEEHGREAIEALQ
AEGKSIVVMEENGVALGVLALSDTIKPEAASAVATLKHKGIRVLMLSGDH
QAAANAVAAQAGIDEVIADVLPHDKANKVMELQAKGAIVAMVGDGINDAP
ALAQADVGIAMATGTDVAMETAGITLLKGDINKIPQALNLAHATMRVIRQ
NLFWAFFYNIICIPLAAGAFYPIWGILLNPAFAGLAMAGSSVSVVSNSLR
LKRIQVKEKQN
>Cag_1707 Pantoate-beta-alanine ligase
MQIISDPHAMQAISENVRLQGKRLAVVMTMGALHEGHISLVTLARKSADT
VIMTLFVNPTQFSAGEDLERYPRPFEQDVAHAEAAGVDYLFAPTSAEMYP
AGHQTSVQCGALAERFEGAHRSGHFNGVATVVTKLLHITKPHTAIFGEKD
AQQLAIIRQLVADLLIDVEIIGAPIVREADGLAKSSRNIYLSSNERKRAT
VLYGGLCHAKARLAEGEQNLKLIATEVEALITATEGCTIDYVAFVDEATF
LPIEQADASKRYRLLLAVRLGSVRLIDNMVMGTKDSYKPC
>Cag_1109 Diaminopimelate decarboxylase
MLQSTYFSYCNGALLCDGVPLQTLAEQYGTPLYVTSERSLIESYEAFERA
FQPLHHFTCYSVKANYNLSVIKTFAALGCGCDVNSGGELYRSLQAGVAPD
QIIFAGVGKKYEEIAYALESGVLMLKAESVSELHVINRIAAEQGKIASIA
LRINPNVTAETHPYITTGDSKEKFGIDEADLADLFALIRQLPHVRLIGLD
MHIGSQIFDPEYYVAATTKLLALFNLSKSMGFALEYLDIGGGFPVTYTDT
KHATPIERFAEKLVPLLQPLGVTVIFEPGRYLVANASVLLTRILYRKRNH
VGKEFFVVDAGLTELIRPALYQSHHEVQAVQRQEASVIADVVGPVCESSD
FFARQRELDAVPEGGLLAVMSCGAYASVMGSNYNGRLRPAEVMVRRNGEV
VLTRRRESFEQLIQNEVL
>Cag_1844 Ribosomal protein L16
MLMPKRVKYRKTQRGRMKGNAGRGTSVAFGSFGLKAMEPAWITSRQIEAA
RIAMTRYMKRDGKVWIRIFPDKPVTQKPAETRMGSGKGSPEFWVAVVKPG
RVMFEADGVSKEIATEAFRLAAKKLPIKTKFIIRPDYEG
>Cag_1115 3-deoxy-D-manno-octulosonic-acid transferase
MPISCHHMQSFPLYQTLFPLLAGVAQGVKALHPQIAAFFEVRQHLFTTLQ
QQLATMPNNGFRLWVHAASVGEFEQARPIIAALQARHPNLRLFISFLSPS
GYNARKNFPNAAAVFYLPLDTAANARKLVALLKPDALLLMRYDFWPNHLL
AAKKYGTTLVLAAAVLQPQSAYFNPLLRRFYKKLFHLFNAIYTVAERDTQ
AFKEHFGYRNAITAGDPRFDQVVARSRNRAAVANLRAHYEGRKVLVAGSV
WEADEQLLIAAWQELNPRPSLIVVPHQTEPEKIAHLCSLLDERNLSYARI
STFPESFQPEQQILIIDQIGYLAELYSIASIAYVGGGFGVNVHNTLEPAV
YAIPVLFGPNHHNSPEAAALLEAGGATVVQQQSELHAALQCLCSNESERQ
RQGSAAGTFVQARTGATAMVVEYLEGVANVVKWQGS
>Cag_0497 Triosephosphate isomerase
MRKKIVVGNWKMNNGVAASEQLAADLLAALGSSFTGCDVGIAPTFLSLTT
AGKVIAGSTIQLVAQNCHYENDGAYTGEVSAAMLKGAGCSSVIIGHSERR
QFFGETNATVNLRVKKALAEGLDVILCVGETLAERESGVTGDIVSAQVRE
GLVGVSDISNVVIAYEPVWAIGTGKTATSAQAEEVHLMIRTLVTELYGEP
AAQALRIQYGGSVKPSNAVELFAMPNIDGGLIGGASLNAADFVAIVQAAA
GK
>Cag_0813 Peptidase S41A, C-terminal protease
MSRIVTVALMLVVLLFGIFLGTRISGRRVESSGAGQQKIVEAYNLMRQFY
VDEVGGDSLAGAGIEGMAGLLDPHTVYLEPEKVTYEQAEFEGNFDGIGIE
FDIVNDTLLVVTPLSGGPSAAVGLASGDRIIGIDGVSAIGITQRDVLKKL
RGKQGSMVQLDVFRPLDGKRMDFSVTRGKISTSSIEAAFMVNQQVGYIRL
SRFIATTADEFRSSLQLLKQQGMKRLLLDMRGNPGGFLEQAVAVADEFLS
EGKLIVYTKSRKGSLPDERYEARSGDTFERGDVVVLIDRGSASAAEIVAG
ALQDNKRAVVVGEPSFGKGLVQQQLPFADGSALRLTVARYYTPSGRQIQR
VYRKGVAGREHYFEESMSNISPNKLFDDPDTLLYYENNNVSVYNTSTLPS
LLLSLKGKKGENNRLTDLRDAGGIIPNYWVNARSYSSFYQELYRTGLYDE
VARKLLDDPHSLVQKYRDSLERFMTSYTEEPNFEALLAKACQSKGVRFNR
VALLQDRHAIVLALKGRMAHQLFGSSGQIKFYVKTADPLVRVATSVPLST
R
>Cag_1280 conserved hypothetical protein
MDIKAIAQALGMAIFRYPALWRKLHHEPASNDGSMLRNYAVPIIALVQLL
KFPLIGEPRPAMFLGIVSMLVDSAVLYVLAGGVLALLPIPRTEEAKGQVM
TVFCYALTPCWLAELAYGHGVWSILIALFALLHALASSREGLVRLLSLEV
QSASGALTRSAFFMVLISSISFFILSAATLLVSF
>Cag_1794 hypothetical protein
MAENNTNPIAEVVTSVMQTVSTVAQQPVNLLNTGLTTAQQVAEPLFGAVT
GLLSNVRDICCTVSQALIDALTPKK
>Cag_0717 ATPase
MTIILRLLGLVKPHLWWMVLAALISFATTGSGIGLLMAAAYIIGKAAQQS
GMAALELGVIGVRLFSLSRALFRYAERLISHNIAFKVLSRLRLWFYDALE
PLAPARLMRFKSGDLLQRIMEDIQSLQNIYARGFAAPLTALLVALLMAAL
LGAFSPLAGGLMLLFHALAGIGVPLVTHQLSKGVSSGIMRQKAEEQLLAL
DLFEGVGELQLYGKLPQHLAALQNANQRKLTLERKNAFIERLHESFIGLL
MNGALLVMIVTLMQTTTNGALSGIALAVATLAIMASFEPFFPLPASIQHL
EADQHAGERLFEILDAIPEITPPASPLPFPTNHTIRAEKLSFSYPESTTP
ALHNLSFTVQQGEHIALVGPSGAGKSTITSLFMRFWNPSSGSLTIGGTNI
ANIDPEEVRRNIGLVSQKTYLFAETIRHNLLLAAPHATDGELKAALTAAG
LGNFTSQLDEWCGQHGMKLSGGERQRMAIARLLLQNAPIMILDEATANLD
GITEQEVTETLKRLAQGKTLITITHRLKGMEQYAQILVLEKGRLVEQGTH
ESLMAAHGLYRKMWELQHDGGSIPPTPLLHTAS
>Cag_0814 conserved hypothetical protein
MFAPLLTTQSFTTMSIIEERILAAGYQLPQPPAAAGLYVPAMKVGNIVYT
SGQLPLKDGALVELGGRGAVSDATLNEAAEASRVAVLNGLAAIKSVAGSL
EAIQQIVKLTVYVASSDGFSMQHMVANGASSLLNTIFQENGVHVRSAIGV
KLLPLNASIEVELLVLLSEQ
>Cag_2022 NLP/P60 family protein
MNFQPFPKNTPWRFPRISGLWLVSTALILLISISGCQSYWVRTKYHIESK
YSSKKRKSHSARLAPNGGYQFAQPLPMRLSQSAMASFLSEVEQLRGTRYR
SGGCSPDGFDCSGLVCYLMKRHFGLLLPRSSADMATHGVMISSTSALQKG
DLLFFSIDGRTIDHVGIVTGWNRFVHAATSGVREDYISSSYYRERFAFGA
RLLIAE
>Cag_0235 Holo-acyl carrier protein synthase/Phosphopantethiene-protein transferase
MEVGIDIVSLERIEVLYSRYGVRFLEKILLPEEVALCLQKPQIVASVAGR
FAAKEAIVKALGSGFAQGVHWKSFAILNDAAGRPYVKVIDDECLPPSSVV
KLSISHDRHSAVAVAIIE
>Cag_1222 Methylmalonyl-CoA mutase N-terminal domain/subunit-like
MIEQHSHALFNDFPAISEAEWLAQVAVDLKSDTAFESLCWHTLDDFTLKP
WYARESAPPPLTIPTVKATNAWLNCKRIVVNNAATANGEALYWLERSNAA
LEFHCSNATLTNAESLQQLLAGIDANAVALYFSGNLPSTAELLKKLASTA
HLAENKGGLLNAIAQHENVAALFQAGSQLPNFRWLTVDTMPYHQQGATPA
QEIAVALAHVSNLLHRATDAAIAAEDAAQRMEIIMAVGSSHFVELAKPRA
FRYLLAQLLHAYGVPQANLPIALPHLFARTSPRNNSLLDPYTNVLRLTTE
AISAILGGYDALHISDFDNAALSVKSDVAERITSNIHLILREEGNLHHVV
DPAKGSRYIEAVTHQLIHTAWELFLAIEENGGINSEAGANFLADKISQAE
AKRQKAVAQRKNSLIGVNRYTTPLTAEQHANFEALMQANAQAAASGNLAA
PFEQLRLAMQAYTAQHNTTPTVATLMAGDMAISRRQAAFCDDAMQCGGFA
LGGKVVVGDDVEKSCIDVLMQQPAMVILCIAEKNPLPTAEQLCHTLRAHN
KEVLIVMAGKPPAEHERLLTAGVDSFLYTGCNVLNMLESYQRKTGVQ
>Cag_0478 VpsC protein
MKNPNALKWSDLKTGIFFILGMGFAAYLGLVISKNSSLFSGVTTVKIRSS
DIQSLAENNFVSVSGKKIGTVSKMEFVKSNDSLFVVASLRLRNEYAPLVT
KDARATIKSLGVLGDKFVDIKTGKGAPVNEGDFLALDTDEGMASLTTNAS
NAFVKINQLLDQLNSGKGMAGRIISDEQMGKELAETVTGLKTATNELAQL
TAKASRGNGLLPKLLNDPAMAKNTEETLEHLNQTSRKTELLLTKLNDDKG
TLAQLSSNPALYNNLNQAVSSLDSLLTDLKAHPSRYVKFSVF
>Cag_0251 ATPase
MNSILSSHKLKKSYNKHPVVTNSSIEVRQGDIVGLLGPNGAGKTTTFYMI
VGLVRPDSGEVRLDDKPITHLPMYKRARLGVGYLPQEASVFRNLSVEENI
AAVLEFTALSKKERQERMEKMLEELNITHIRKSMGYALSGGERRRTEIAR
ALALNPKFILLDEPFAGVDPIAVEDIQHIVAGLAKRNIGVLITDHNVHET
LSITNRAYLLFDGSIFMKGTPEEIADNPEVRKMYLGEKFTLERY
>Cag_0873 TPR repeat
MSNELLDEKLRLLVKALRADTFHFVLIINNHPSVYNDVVEWLKQHITDRE
IRELRLTGKHYREVSDVLQAAKQDIVTIPDFDELFTKENDDVRVALNQRR
DFLAFQRMNLVCFLSPDTFRLLPKKIPDLWSLRSLELDIAYDIKEPLFTI
PTTPFISSLGGTTIAEKEAEIRRLTYQLSQIDPANIALRKELEAQLVTLQ
MEVPQRFEEATSLHDTSQKNIIAAEVTATQDETVTNISTSILRSIFAVLP
SEPITLIALQELLPNIDNLETALQNLVAENVLNYNSTTKSYKCSPVVQEV
TRKQQSDHLFADIEQLISRLIDRLAYEPNTGHVTEVSYETAALYVRYGET
ILRNCTDVEYQLAALADRIGNYHTATGNLDKALSFHAECLRLSKELYEAY
PNNVFFKNGLAISYEKLGNTHTSLGNLDKALTYYEQYYKLSKELYEAYPN
NVSFKFGLAVSYSKFGNTHTSLGNLDKALSYYDNETRLFEELYEAYPNNV
SFKNGLAISYSALGQFYRDHRNDSDIVKNYFQQAEKVWAELVSSSPQHAE
FKQNLSWVKNQLQSLHS
>Cag_1623 conserved hypothetical protein
MKVIGINGSPRKDGNTAKLINMVFEPLQAEGIECELIQVGGTLIRGCLSC
YQCVKLKDKRCSTKNDSFNEIFEKIIAADALILGSPVYFADITPELKALI
DRTGFVARVNGHMLRHKVGAGVVSLRRGGAIHAFDSINHLFQISSMFTVG
STYWNVAFGGRTGNEVEGDVEGVENMHDLGSSMAFLLKKLHCCE
>Cag_1322 conserved hypothetical protein
MIIEFLDPAKVDLLEAVKYYNNEQENLGFEFSEEVERSLSRVVNFPHAWC
KLSVRTRRCQTNRFPYGVIYQKRGNLILIISIMHLHQEPNSWRKRIPKKE
Q
>Cag_0759 conserved hypothetical protein
MNIAQIENNLQQLVKTFNEESFIYDLLLAYGQPKATIKRIKDGGLNLSKV
EGEIAWKKKLFFKTVKGVDLHELIADLTENGKAIKHDPRFIIVTDYKNIL
AIDTKTQDTLDIEILEIAKHFDFFLPWAGMEKFQHQNENLADVRAAEKMA
KLYDDIKKDNPISTSSIAEQKAEIHNLNVFLSRLLFCFFAEDTGIFEKAQ
FTNAISSHTQQDGSDLNSYLDNLFDIMNMPNKQRENLPAYLLAFPYVNGG
LFRSKHYAPKFTHKSRKAIIDSGELDWSAINPDIFGSMIQAVITPEHRGG
LGMHYTSVPNIMKVIEPLFLNELYEEFDVANSTSSVSVKNKKLNALLLRI
WNIKIFDPACGSGNFLIIAYKELRKLEILIFKEIAKNNHNYSAQLSGISL
GNFYGIELDDFAHEVSILSLWLAEHQMNQIFFKEFGRTKPALPLTETGNI
VHGNATRLDWENVCPKEEGDEIYILGNPPYLGSRNQDDLQKADMKTAFKD
DYKSLDYIACWLLKGADFIENFHAQFAFVSTNSICQGEQVALIWPRVLKE
NLEIGFAHNAFKWSNNAKDKAAVIVIIIGVRNKSNNSKKIYEGNHVRVVS
NINPYLTTGNSSYITRRSKPISPQLQKIVYGNLINDNGNLVLNESEKCEL
MRKYPLTENYIKLYIGSKEYLRGETRYCLYIEDDELQNAIQIPEIKKRLD
AIRVHRANSSEKSTKALAPFPNRFYFQSYNNTESIIIPRTSSENREYIPI
GFLSSNIIISDAAQAIYEAKPWVFGIISSRMHMTWVRAVAGRLKSDYRYS
SAICYNTFPFPPITETQKKELEKHTYRVLEERENHSEKTLAQLYNPDKMP
DGLREAHHQLDLAVERCYRAKPFETDEERLEYLFKLYEQMIEEEKSKGSL
FESEAKPKKKKK
>Cag_0457 DNA recombination protein, RuvA
MYAYFRGTLISFTADEAIIELQGVAYHFLISATTSRQLPNSGSEVQLFAH
LYVREDAMLLYGFYSEEERQLFRLLLQASGVGPKLALSVLSGLPVHEVHD
AIVSNIPERLYGISGVGKKTAARIILELRDKILKLSPVLPTATARRPHNA
AQQLRDDAITALVTLGFPRAAAQKTVTSLLDENSNCTVEEVVKSALLLIH
NAQL
>Cag_0590 Penicillin-binding protein 1A
MKSLLYKSLFLLLAYVLSVSATAPSAYAMRSLLGLPSVEELENPNPELAS
LVYSEDGVLIHKYFNKNRTFVPLRSIPRSTRYALIATEDAEFYNHWGVNV
RRVFVAMGENLFRAPKRWHGASTITQQLAKNLYLTQERTFSRKFKELITA
IELERTYTKDEILALYFNTVYFGAGAYGIESAAQTYFGKSASQLTLPESA
TLIATLKNPTAYNPAKNPAGSISRRNLILGLMEKNKFITPQQAAKAKRTP
LTLKYTPLNQQGLAPYFAEYIRQTIKPATILGDLNLYRDGLTVRTTLDSR
MQKYAQQAAVEHLASLQAAFDRSWRWPENLKNQIIRESERYKELVGSGMS
DGQAMARLKADNVWLHNILREKTRIQVALVAIDPNNGHVKAWVGGNSLSP
DEYKYQFDHVWQARRQPGSTFKPFVYTAAIDKGLPANFQVLDQPLQLSSG
DGIWSPRNSDGSSGGMTTLRSALTRSLNQVTVRLAYEHLSPAEIISYAKR
MGINSPMPNDLSIALGTAAVSPLELAGAFTPFANNGIWSEPISILKVEDK
HKRFITSQKPNSRFAIDSTTNYVMVSMLRDVINRGTGASVRSYGFTAEAA
GKTGTTQNMKDAWFAGFTPQLVAVVWTGFDDERIKFTSMEYGQGARAALP
IWAKFMQRCYSDPTLKLGSRYFHIPETVIAVPTSSAQNNMAADLLGGNVS
FEYFTPKGFEYYQSHPELAISSPPPMAPIDSSSNGVSAVTMPALKPVAPV
PSAAKPKVEKPH
>Cag_0482 Bacitracin resistance protein BacA
MSLFEALILGIVQGLTEFLPISSTAHLRIIPALAGWNDPGAAFTAIVQMG
TLAAVLIYFRSDIVCIVKAVVDGLLKGKPLNTPDATMGWMIAAGTIPIVF
FGLLFKDAIETTLRSLYWISAALIGLALILWLTEVRLKQRVAKHLPLKSM
EEIGWKEALLIGVAQSIALIPGSSRSGTTITGGLLLNLSREAAARFSFLL
SLPSVLAAALLQLYETRHSLLASPSELTNLLVATIAAGVVGYASIAFLIT
YLKEHSTSVFIIYRIVIGVAILGLIATGAIQP
>Cag_1211 conserved hypothetical protein
MPNNSPLTIFGASCKSGTYLLLIELVEPCRIVFGNFRKGEMFSLSAGMYL
YVGSALGNRSGYPLARRLIRHASRSGNNPPHAMRQTLLHYFSTTCNTTFT
PSPKKLRWHVDYLLEQPEATLSEVVIIQSPLRLEYALATLLEAMQETSHI
APHLGAQDCRDSTHLLRLRNREALFTELQQAIPAMLENT
>Cag_0863 Ribosomal protein L27
MAHKKGGGSTKNGRDSNPKYLGVKAAGGSTVAAGTIILRQRGTVIKPGVN
AGIGRDHTIFALVDGIVSFRNGRNNKKQVSVEPCC
>Cag_1068 Lipoate synthase
MQTNQLSMAQPIGRKPEWLKIKMASGASFAATRQLLNRHSLHTVCRSALC
PNLQECWSRGTATFLLLGNTCTRSCTFCAVSKASAPPAPDSDEPQKIAEA
IASMKLKHAVLTSVTRDDLPDGGANHWIATMQAIRQRTPNVSLECLIPDF
QHKKAALDSVMQATPDVLNHNIESVPSLYSTVRPQANYRASLELLRYAKE
QHGLATKSGLMVGMGEERHEVEATLHDLAAHGCDMVTIGQYLQPSAAHLP
VARYVPPQEFEEYSTIAKNAGIRYVHAAPFVRSSYHAETFPNNLTITQNL
>Cag_0851 branched-chain amino acid ABC transporter, permease protein
MKQMLFIAAFGVVAAIVPLLVGDNNPYILGILISLLMMAALSSAWNILGG
YTGQYSFGHAAYFGAGAYTTLLLVLNFNFNPLLSIALGIITSVIIALITG
SIVFRLRGHYFGLASIAVAEIVRLAVLNFEFTGGAQGLLLADLAMWDIDL
NSKLPFYYGMLIVLLATLAITAYTVKAKTGYYLQAIREDQDAASSLGINI
SYYKNKSLVISAALTSLLGSLYGLYIRFIDTHAVLDLRLSIEIILTAIIG
GVGTLWGPVLGALVLVPLAEVLRSNLLGDMLVKAGIVSETSAIGIFLKEN
LAHAHVLVYGIITVLCILYLPKGILGLLRRAKK
>Cag_0904 conserved hypothetical protein
MAETDTPYLRFADISLDISRGNYSAARQKLEVLEPLMPESYHLNLLYARA
LAGMERYAQACKYLQACCTLAPANEVAWYELATMQALAENDTSNAMESSS
AYDPVVDELEQLSAALMKAGPILASDSSEPTSIAEQKQPFADDTEIAVPT
ESLATLFIAQGAYKKAIRMYSHLIQLKPNNARFYQDEIDRLLDRL
>Cag_0289 Acetyl-CoA carboxylase carboxyl transferase, beta subunit
MSWFNRVKPSISSTAKRDVPEGLWWKCEECGAALHKKQMEASDHTCPQCG
YHFRISPYKYFSLLFDNQKYVEFDDHLRAADPLHFVDTKKYPDRVSDTIE
KSGKTEACRNAHGLCGGETLVISAMDFSFIGGSMGSVVGEKISRAVDKAI
ELQSPLLVISQSGGARMMEGAFSLMQMAKTAAKLSLLSEHRLPYISLMTD
PTMGGITASFAMLGDINISEPKALIGFAGPRVIRDTIKRDLPEGFQRAEF
LLEHGFIDRIIPRRELKSDLTTLLSLMKL
>Cag_0381 transcriptional regulator, XRE family
MDDLQRAITKRKSIDTEFAQTFDKGYEEFKIGVLLKEARKNAGLTQEALA
IKLNTKKSAISRIENHADDIRLSTLYHYAEVLGKQIKVSIQ
>Cag_0600 hypothetical protein
MVITLTPEFEQALHKIAEHNGTTVELLVLKTLQENLLFCKPQKTLFRKSE
KTLADFLVGYVGVFDSDELVKSGAQMSTNVRKQFGDILLEKRQQQKL
>Cag_0141 H+-transporting two-sector ATPase, gamma subunit
MATLKDIRVRIKGVKSTQQVTKAMKMVAAAKLRRAQDRAIMARPYASKLK
EMLASLSAKVDTSVNPLFAVRSEVNKVLVILVTSDRGLCGAFNGNIIKLA
YKTIHEDYAAQYGKGGVSMICAGTRGYEFFKKRHYTLTKGYPAVFQNLDF
AVAKEIADMASNMYLRGEVDRVVVVYNEFKSVLAPQLKSEVLLPITSGDA
SAKENSGGEYMYEPNPAAIIDVLLPKHLRTQVWRIMLESNAAEQAARMAA
MDSATENAKELLRTLNISYNRARQAAITTELSEIVAGAEALNG
>Cag_0579 penicillin tolerance protein LytB
MKIHLDRTSSGFCIGVQGTINVAEDKLQELGQLYSYGDVVHNEVEVKRLE
ALGLVTVDDAGFKQLSNTSVLIRAHGEPPATYTIAAENNLAITDTTCPVV
SRLQRTTRLLFQLGYQIIIYGKRVHPEVIGLNGQCDNCALIIKHADLSDP
KELEGFEPSAKTALISQTTMDIPGFYELKANLEAYVARVNGSAVEPWMAI
RDIDITADMSKVRTMPRYVFKDTICRQVSSRNQKLHDFSLANDVVVFVAG
KKSSNGQVLFNICKAANPRTFFIEDIEEINPEWFAAHEGKAVESVGICGA
TSTPMWHLEKVANYIEATYANSESIIAQ
>Cag_1179 hypothetical protein
MQIPLNQFENYIDETILKRGLSYFKNGYVHEPEEIKQGEFEALVEGTEDY
TVRLTIEKGIITDYSCTCPYDYGPICKHITALIFYLQQEELGLEVQPPKS
KTTTTKAKKTTKRKTIAEQVDELLDKVNHDELKQFVKKTVLADKKLRQNL
ILHFFHLTTNEDSKDFYAQQIKTILQIAKDSDGYISYSAVHNVCNITDQY
LAAAQNEIANNKYNKAISICTAVIEEMTKGLQFTDDSDGYIGDAIFEALE
ILQNIAKSNPPETVRIQLLEYAISAYKKSCFDGWNDHFDMIHFATLLVKK
DVEINNLIDILKNSIHTAHYKEKVQEIIYELLVKFKKLEEAETYLEQNIS
NSSFRLKVLETAYQNHNYSKAKLLANDAIKQENGISNSTWYEWLLKIAQA
ENNIKSIIEVARYLLLQGYDSECDTYYALLQQHIAPEEWNGFVEKMIAEI
KKKPYHLHYWLLPKFYIKQEQWERLLNFVQSTERIDILMTYDKYLVNNYF
TEIMDMYKKYILHSLNRAAQRNEYQKACEYIQRVVTLGGVRTAEIIISYL
KNNYPRRPALMEELSHIKLR
>Cag_0454 phosphoglucomutase/phosphomannomutase family protein
MSLMISVSGIRGVVGESLTPSHLVNFAMAFASWAGKQSRNGKARLVIGRD
TRPTGATISALVTNTLLLCGCDVVDLGVATTPTVEMAVVAEQADGGLIIT
ASHNPVAWNALKMLNRQGEFLSAEDVAALLRIAEAGDFVTARWNELGNVT
EAHEYDAYHIEQVAQLPCIDVESITRQRFKMLVDCVEGAGSSIIPALCKR
LGIAEVIPMACGGSGIFPRNPEPIEEHLTSTIAAVREHQCDGALIVDPDV
DRLALLCEDGSLFGEEYTLVACADFYLRYKKGAVVNNLSSSRALRDIAEQ
HGVACYSASVGEANVSAMMKAEQAVIGGEGNGGVILPELHYGRDALVGIA
LIMQAFTRWREVHPNGTLSQFRRSFPDYAMSKQKIVLDQSQRSSLPELFA
AVAKRYAHAESNTLDGLKLDFADRWVHLRPSNTEPIVRIYTEAPTRAEAD
ALASEVMATIAEVGGCQ
>Cag_0760 DEAD/DEAH box helicase-like
MPDIVKIEYHQTGESVKNTANGMRPMAARAYEERNSQYLLIKAPPASGKS
RALMFIALDKIRNQGLRKVVVAVPERSIAGSFAKANLKKENNFFANWEPN
DEYNLCTPGMEGSKSKVSAFKNFIDNDEEILICTHATLRFAFEELDESKF
NNMLLAIDEFHHVSADGDSRLGELLRSIMQKSNAHIVAMTGSYFRGDSVP
VLLPEDEVKFTKVTYNYYEQLNGYNFLKSLGIGYHFYQGKYTSAILEILD
TNKKTILHIPNVNSGESTKDKHNEVDTILDAIGDVQKVDSETGIIFLERH
HDKKIIKVADLVNDNPKDREKVITYLRNIKSVDDMDLIIALGMAKEGFDW
PYCEHALTVGYRGSLTEVIQIIGRCTRDSANKSHAQFTNLIAQPDAADDL
VKLSVNNMLKAITASLLMEQVLAPNFKFRTKLSDDDKADAGEIKIRGFKS
PSSKRVKDIIESDLNDLKATILQDDTVMKAIPGNLDPEVINKILIPKIIE
IKYPDLSADEREEVRQHVLVDSLVKGGEIKEVGDKRFIRMADQFINIDDL
HIDLIDRINPFQKAFEILSKEVTTKVLKVIQDVIESTRIQMTMDEALLLY
PEKVKAFMDKNGREPSVTSLDPLEKRMAEAIIYLKDLKRKKQSGQQ
>Cag_1113 hypothetical protein
MKNNTGLWIDHKTAILVNIKGDYTHVQHVESNAESNLKPSGGWKANGSVV
AQAVANEHTADERRKHQYHTYYQKVIALLANSTEIALFGPGEAKIELAKE
IEKNSDMHKKVSIVETCERMTENQLIAKIKSSFSAKS
>Cag_1183 conserved hypothetical protein
MSSFFNCYEADTARFFDEVIAHDGTPRQHYHKMLQRFDQFSKEDIKARRE
VINIFFRNQGITFTVYGENEGVERLFPFDLIPRVVPAHEWQRIEKGLVQR
ITALNEFLHDIYHSQKILKDKIIPPELILGSQHFRREFVGVNPPLGIYIH
VTGSDIIRDHNGNYLVLEDNLRTPSGVSYMLQNRQAMKRAFPVLFDKYKI
RPIENYPQELLRTLQEIGQSARRNPNVVLLTPGIYNSAYFEHSFLARQMG
IELTEGKDLVVNNNRVYTRTTRGLEQVDVIYRRIDDAYLDPLVFRPDSKL
GVAGLVNAYRKGNVTLANAIGTGVADDKVIYSFVPKIIKYYLGEDPILEN
VPTWLASNPDDLKYILANLGSLVVKAANESGGYGMLIGPESTVAEQERFA
EMLVANPRNYIAQPTISLSRHPSFFHDTELWGCHIDLRPYVLYGKNITIV
PGGLTRVALKRGSLVVNSSQGGGSKDTWVVDE
>Cag_1596 conserved hypothetical protein
MPTNQPPLYALAFGAHPDDVELSCGATLLKIMREGKSVAVCDLTRGEMGT
LGTPESRKAEAEAATALMGYSARTTLDLGDGKLHYCDENLDRIISVIRHF
RPSVVFANPPDERHPDHIKASRLVTDACYYAGLRQRPTTFEGTLQEPHRP
RHLLYYIQFRHLEPHVVVNVSETFEASRHAILAFASQFYREGMSDAPQTM
INRPEFLTSLEARARYFGEQIGVLYGEGFRLSAMQGVAHFSSLFPEL
>Cag_0236 putative plasmid maintenance system antidote protein, XRE family
MNNTYTSQEDIAIARELLSCPGDTLAEHLDYIGITKMELAKQLQCSEQTI
NEIIKGTAPITTAIALQLEQCIGIPANFWIERDRQYWLQLAEINEAENRL
ALSNKALQLNIPLIMKKRTSCSCN
>Cag_1685 Acetate--CoA ligase
MATETPSSSQQEAAANPAEDSISSVLTEKRKFPPPASFSEQAHLSTMEQY
EKLYADAAADPEGYWAGIAEQFHWFKKWDSVLEWNSPYAKWFNGGKTNIC
YNALDVHVKSWRKNKAAVIWEGEQGDQRILTYGELHRQVCKFANVLKIAG
IKPGDRIAIYMGMVPELMIAVLACARVGAVHNVIFAGFSAHAITERVNDS
RAKMVICADGTRRRGSTINLKNIVDEAIVNTPSVRNVIVLKTTGETIKMH
DGMDHWWHDLMGLAVDESEAVELDAEHPLFVLYTSGSTGKPKGILHTTAG
YMVHAASSFRYVFDIKDEDIYFCTADIGWITGHSYMVYGPLLNGATLLMY
EGAPNYPQWDRFWDIINRHKVTILYTAPTAIRAFIRAGNEWVTKHNLNSL
RLLGTVGEPINPEAWMWYHKVVGQEKCPIVDTWWQTETGGIMVSPMPGAT
PTKPGTATRPLPGIMVDVVHKDGTPCGANEGGYLVIKKPWPSMLRTIYGD
NERYEKTYWSEFKDMYFTGDGARKDDDGYIWIMGRVDDVVNVSGHRLGTS
EVESALVSHEAVAEAAVVSRPDDIKGNSLVAFVTLKDEYEGDMKLRESLR
NHVAREIGPIAKPDEIRWAKALPKTRSGKIMRRLLRELATSNEIKGDVTT
LEDFGVLENLRDQENE
>Cag_0039 Drug resistance transporter EmrB/QacA subfamily
MANAPSLSASPKLLGTTEEHYETGWRKLIITLTVIVSAMLELIDTTIVNV
AITQISGNLGASIEDTAWVVTSYAIANVIVIPLSGFLGNLLGRRNYYIGS
ILLFTVASLLCGVATDIWTLVFFRFVQGIGGGALLPTSQAILYETFRPEE
RGKATGIFSMGLVLGPTIGPLLGGYLVDYFNWEWCFFVNIPIGLLAAWSS
FIFLKEPKVTHTVSKIDWAGIGLLAVGIGSLQFILERGESKDWFETPYIT
WFTIIAVLSLIAFVWHELHTKEPAVDLRVLARSHNLPIAAVLTFIVGFGL
YGSLFVFPVFVQGLLGFTAVLTGLVLFPSAMVTGMISMPLGMALQKGASP
KHLMLFGMLTFSLFCWLLGQQTLQSGAENFFWILLLRGIALGFIFIPVTM
LAISGLHGKDIGQATGLNNMVRQLGGSFGIAIANTYIAKRVAAHRTELLS
HLSPYDPEAMNRIHAIAAKATAEHGLPPASAELAALKALEGTVTVQSTHL
AFMDAFMLIALLFLCAVPLLFFIRLHKGEQASAMGGH
>Cag_0780 conserved hypothetical protein
MPLLSVGNILVEPDVLQARFACNLQECRGACCIEGELGAPLQPEEAAQLD
HLPEELFRLLPEKGLRYLRRHGAVELYQGVHYTRTVKSRECVFTVVRDGI
TLCAIEIAFREGLLPFDKPISCRLFPIRVRKKFGLDYLVYEQHAMCRSAR
EAGREQGVRLIDYLTPSLTARFGEALVQELQRFHDSSSNNSHG
>Cag_1329 bacteriochlorophyll A protein
MALFGTKDTTTAHSDYEIVLESGSSSWGKVKCRAKVNVPPALPLLPADCN
VKINVKPLDPAKGFVRFSAVIESIVDSTKSKLVVEADIANETKERRICVG
EGSVSVGDFSHTFSFEGSVVNIFYYRSDAVRRNVPNPIYMQGRQFHDIIM
KVPLDNPDVIDTWENTLRAIQSTGAFNDWIRELWFIGPAFTALNEGGQRI
SKIEVNSIGTQSGDKGPVGVTRWRFSHGGSGIVDSISRWMELFPVDKLNK
PASVEAGFRSDSQGIEVKVDGEFPGVSVDAGGGLRRILNHPLIPLVHHGM
VGKFNDFTVDSQLRIVLPKGYKVRYAAPQFRSQNLEEYRWSGGAYARWVE
HVCKGGTGQFEVLYAQ
>Cag_0229 Magnesium-protoporphyrin IX monomethyl ester anaerobic oxidative cyclase
MKILMVQPNYHSGGAEIAGNWTPSWVAYIGGALKQAGFTQVRFVDAMADD
LPDDAIEEIIRKNQPDVVMATNITPSIFKAQDIMKIAKKVSPKIRTIMGG
IHSTFMYPQVLTEAPETDYVIRGEGEEVAVNLIKAIAAGNDKEARADITG
IAYIDDDGKVFATAAHPVIEDLDTLTPDWSLYDWDKYIYTPLNCRLAVPN
FARGCPFTCTFCSQWQFWRRYRARSPKHFVDEIEILVKKYNVGFFILADE
EPTINKQKFVALCQELIDRKLNVTWGINTRVTDIMRDADLLPFFRKAGLV
HVSLGTEAASQMNLNRFRKETTIEENKFAIKLLQKNGIVAEAQFVMGLEH
ETPETIEETYQLCKDWDPDMANWTIYTPWPFSDLFKELGDRVEVRDYSRY
NFVSPIIKPDNMEREDVLRGVLKSYARFYARKTFFSYPWIKDPYVRKYML
GCLKAFAQTTITKRFYDIDRVKGKNLSKAEIDLGFDKSRILTQEEVKNLK
ELRPEMVADMSFGLKEAGYQREHDEHDWDAFDETTIKDRTSSTVRNC
>Cag_1444 ATPase
MISWTDVDLELGNKLLFSKLTCTIQAGEFVCIVGKSGSGKSTLLKSLYMD
IKPKSGEVSIAGFSSRTIKKRQIPLLRRKLGIVFQDFRLLEDRSVYDNLA
FVMQVTGTKEKLIKEKVMHALACVGLEHAAHSMPLRLSGGEQQRIAIARA
LVREPLVILADEPTGNLDPDTSLEILNYLKSINEKGITVLMGTHDYELVR
HATPCRIVHIHNQKLVESRIESSSAGLRPAMAAPALA
>Cag_1019 Phosphoribosylformylglycinamidine synthase II
MKHTDIEVTLTHAEEHGLSAEEFSQICTILGRTPTITELGIFSVMWSEHC
SYKNSIAVLKTLPREGGALLTSAGEENAGLVDIGDNLAVAFKIESHNHPS
AVEPYQGAATGVGGIHRDIFTMGARPVASLNSLRFGSPKDPRVRYLVDGV
VRGIGDYGNSFGVPTVAGDIYFEEGYTGNPLVNAMSVGIVEHHKTVSATA
YGTGNPVLIVGSSTGRDGIHGATFASEDLSEASEEKRPSVQVGDPFAEKL
LLEATLEAIETGYVVGLQDMGAAGITSSTSEMSARGIEKYGVGGIEIDLD
LVPIREAGMSAYEIMLSESQERMLIVAAKGFEDKIIEVYQKWDVQAVVIG
EVTDDNHVRVKHQGQVVANIPAISLVLGGGAPVYKREAKEKKPETPLANM
VADSTLNFNELGLALLSRPNIASKQWVYRQYDSMVQTNTLTPTGQTDAAV
IRIKGTNKGVAMKTDCNARYVYLNPLAGGKIAVAECARNIACTGARPLAI
TNCLNFGNPLKPEVYFQFKESVRGMGEACRTFNTPVTGGNVSFYNETFIA
GQRTAIYPTPMIGMIGLLDNIENLVGSTFTASGDRILLLGNPQLTLDGSE
YLVMQYGTPGQDAPAVDLEHEAQLQRLLVALAEQKLLHSAHDVSDGGLLV
ALAEKAMMNQEMPLSFRVHLSNNDKSETAIQQQLFSEAQGRVVLSAAPEA
VAAIMALANDYNLPIQDIGEVVNQQTISLSINEQEVVNLPLSNVAHAYYH
ALEHALHLDEL
>Cag_1416 Alpha amylase, catalytic subdomain
MFPLVYEINTRIWLRQLSEYYNQPITLGSVPDSEFSFFAQCNFDIIWLMG
VWQPSKYSNAIATSHPSLRKGFLAHVPHDLKADDITASPYSIPTYTINDA
LGGNDELLAFRARLHRIGIKLMLDFVPNHLALDNEWLPEHPEFFMPLRED
EHSQDPNAGFEYVANSYLAYGKDPYFAPWTDTLQLNYANLATHDMMTENL
MKIGALADAVRCDVAMLILKSVFNTTWSSLGGQMHKEFWFDAISSVKKRY
HDMIFLAEAYWNKEWELQMQGFDFTYDKPYYDYVTNAPVVVDKLSGHLSG
GWDYQQKLCRFLENHDEPRSAAKLGLNNRAAAVVLLTTPGMHLIHQQQMV
GYKKQMPVQLLRQAVEPEDGELAALYEQLFALQTHEVFQHGGIEWLDLNV
CHYCHCFGFRRYHDEKNAFVIVNFSPFGMDLTFSHAALENMEGKALHTLS
STGKLAENELSVEGRAVKVTLSPHEALVMYN
>Cag_1176 Lysyl-tRNA synthetase, class-2
MQKNTSQPTNTNEQSNQPSLNDQIVRRIEERQHLIDSGINPYPCSFNVTH
HSATILTEFQDEAKTPVAVAGRIMTIRKMGKASFFNVQDSAGRIQIYLKK
DDVGEAAYNMFKLLDIGDIVGVTGYTFRTKTGEISIHAESLELLTKSLRP
IPIAKEKEVDGEKVVFDAFSDKELRYRQRYVDLIVNPEVRETFIKRTAIV
AAMRNFFARHGWLEVETPILQPIYGGAAARPFTTHHNALDMQLYLRIANE
LYLKRLIVGGFDGVFEFAKDFRNEGIDRFHNPEFTQVELYVAYKDYNWMM
NLVEELIYTTAMEVNKSDTTTFLGHEISVKPPFRRLTISDAIAEYTDKDI
RNKSEEELRNIAKELGLELDPKIGNGKIIDEIFGEFVEPKLIQPTFITDY
PTEMSPLAKEHAAQPGIVERFELIVGGKEVCNSFSELNDPVIQRERLESQ
AKLRLRGDEEAMVVDEDFLRAIEYGMPPCAGLGVGIDRLVMLLTGQESIR
DVIFFPHLKPE
>Cag_1583 heterodisulfide reductase, subunit A/hydrogenase, delta subunit, putative
MADEKKIGVYLCTGCGIGEALNIDELEAVAKKEFKVPICKRHEFLCSRDG
LQVIKDDMNNEGVNKVVVAACSPRVHTDLFAFDPMKYVVERVNLREQVVW
TQPKTDAGKEGSQLMAADYLRMGITRIGKSNAHEPKIIDVNRTLLIVGGG
VTGLTAALEASQAGYKSVIVEKSGELGGNAKKMFKVFPTAPPFAELEKPT
IDNKIQAAKNDSNITIHTNATVACIEGGPGMYKVTIDKNGASDTFDAGAV
VLASGWRPYDPSNLGGLGYGKFRNVITNLQMEEMVTKNNGKVLRPSDGKI
AKSVVFVQCAGQRDEKHVPYCSTVCCNVSLKQAKYVRESNPDAGAFVIYK
DMRTLGLYENFYKTAQDDEGIFLAKGEVLGIQEGPEGNLYVEINNQLLGM
KMQINADLVVLATGMVSGMVPEDRGVNNLTPEYIGNLVKRETSDGEIVDL
EPLSLALNLKYRQGPEMPHLKWGFPDSHFICFPYETRRTGIYSAGAVRHP
MDAVQSVADATGAALKAIQCIELTAQGRAVHPRTWDRTYPEIRFESCTQC
RRCTVECPFGAYDEESNGTPLQYPSRCRRCGVCMGACPQRVISFKDYSVD
MISSMLKTIEVPDSGTFVIGFVCENDAYPAFDMVGLNRMALNTNFRFISL
RCLGGLNLVWIADALSRGVDGILLLGCKYGDDYQCHYVKGSQMANERLGK
VQETLDRLMLESERVEQVQLAINDWQKLPSILEEFAKKIEDIGENPYKGF
>Cag_1438 conserved hypothetical protein
MQILAGQFRGQKIGRSASAAVRPCSSRVKKSLFDTLAVRMDLEDAHVLDI
FAGFGSLGFEALSRGAASVTFVDRFHESLKALKSTAAKLGVTNKVSIVNA
DALAFLGRTTNQFDLLFCDPPYAWADYHALLELIFRRSLLAEDGLMLMEH
STQHNFSHTPEYLFHKDYGMTRVSFFQPPPLNQP
>Cag_1961 conserved hypothetical protein
MADTKKLAIIASQGTLDWAYPPFILASTAAAMDMEAVIFFTFYGLPLLKK
EIDLKVSPHTNPAMPMKMPFGSKEFQGVNWSIPNLISGNVPGFDNMATML
MKETFKKKGVATIEQLRSMCQEFGVRFIACQMTMEVFGFEKDQFIDGVEY
GGAATFLEYAADANISLFI
>Cag_0360 Acetyl-CoA biotin carboxyl carrier
MNLSDIKELIDVVNNSDLQEAIIEEGNFKLILRRPAPVVVQQAAPAAIAP
TAMPAPTYAAPAPQATAPAAKPAPTIDTSLIECRSPIVGTFYHASSPDTP
PFVSINDTIKKGDVLCIIEAMKLMNEIEAEVSGTIVEILVENGQAVEYDQ
PLFRIKP
>Cag_0883 conserved hypothetical protein
MNIDRIVFAVAGCFVLVSALLSVYHHPNWLYFTGFVGLNLFQAAFTGFCP
MAMILKALGVPPGEAFK
>Cag_0139 Homoserine kinase
MRKAIGFASATVGNVACGFDVLGFAITAPGDEVILTLSDTAQSDLPVSIT
RIEGDGGALPLDPRKNTSSFVVLKFLDYIRHHKNIAFDGHISLELIKHLP
LSSGMGSSAASGAAALVAANALFGNPCTKMELVHFAIEGERVACGSAHAD
NAAPALLGNFVLIRSYTPLDIITIPSPANLYCSLVHPHTELRTSYARSVL
PTNLPLKTAIQQWGNVGALVAGLLTSDYELIGRSLVDVVAEPKRAPLIPG
FVEVKKAALQAGALGCSIAGSGPSVFAFSTSEEIATRVGMAMKEAFLHSN
AHLESDVWITPVCTQGARVL
>Cag_0254 tRNA(5-methylaminomethyl-2-thiouridylate)-methyl transferase
MSTPHHILVGISGGVDSAVATCLLVQQGYRVTAVNLRLLDTLDAPYAEPT
LQPSSLVVSDHPDYHIPLFSLNLSRTFRDEVMRYFNAEYLAGRTPNPCMV
CNRQVKWAGLRHAAELIGANAIATGHYANRAFSDGRYRLYKGADSQKDQS
YFLWMLSQKDLAHTLLPLGGLTKPEVRELAKNFGVRAAEKKESQEICFVP
NDDYSTYLMSAMPELAERVADGDIVDAAGTVIGKHRGYPFYTIGQRRGLG
VSAKEPLYVTALDAEQNRIHVGHKAALMSHRLLASRCNWIGMEPPSSSVE
LLGRIRYRDRQTACRVTPLENGQIEVVFQEPKSAITPGQAVVFYCGDEVL
GGGFIEEGT
>Cag_1422 Heat shock protein DnaJ
MKKDYYETLGVTRSSNKDDIKKAYRKLAVQYHPDKNPGNKEAEEHFKEVN
EAYEVLSNDDKRRRYDQFGHAGVGSSAASGGGGAYAGGADFSDIFSAFND
MFTGGGARRGGSPFGGFEDAFGGGGGGGARRRASSGIHGTDLKIRLKLTL
EEIAKGVEKTLKVKRQVPCEVCNGTGSKTGATETCQTCHGSGEVRQVSKT
MFGQFVNIAACPTCGGEGRTIKERCTACYGEGIKLGETTVKINVPAGVEN
GNYMTMRGQGNAGPRGGAAGDLIVVFEEIHHETFTRNGHDVIYNLAVSYP
DMVLGTKVEVPTLDGAVKLTIPAGTQPETMLRIQGHGIGHLKGGGRGDQY
VKVNVFVPKEVSHQDKELLKDLRKSSNLCPNAHHDSESKSFFEKARDIFS
>Cag_1503 hypothetical protein
MPEQMKRKKIRCYNCGEIFTLLMDIAGEPTRSITCPFCGASLTVTLAKYP
KKVITVYRAAVGESSASEITVYDLPDVLESTESSSQS
>Cag_0879 Phosphofructokinase
MLLEDTRSNTLKLTFGGFVNKKALEKIAVLTSGGDAPGMNAAIRAVTRAA
ISNKLKVVGIRRGYQGMIEGDFINLKASDVSGILQLGGTMLKTARSAAFR
TTEGRAQAHEQLSKAAIDAVVVIGGDGSFHGALMMSHEYNIPFVGIPATI
DNDMYGTDYTIGYETALNTVVEAVDKIRETARSHGRVFFVEVMGHEAGMI
ALNSSIACGAEVVVIPELHNRQYDELHKFLCKGYKKKESSGIVIVAEGEE
TGGALKIAEQVRKEHPEIDVRVSILGHIQRGGSPAAKDRINATRMGAAAI
EALLDDQKSVMIGLANDQIVRVPFNKAVKTNRTISHDLLEMQRLMNMWCC
>Cag_1315 Tryptophan synthase, alpha chain
MAQNRITRLMQQQKKLLIAYYMPEFPVAGATLPVLEALQEHGADIIELGI
PYSDPIGDGPVIQNAAHTAIRNGVTLRKVLELVRKARNGEGCKKITVPIV
LMGYSNPLFAYGGDCFLADAIDSGVDGVLIPDFPPEEAIDYLDRAKNVGL
SVIFLIAPVTSPERIEFIDSLSTDFSYCLAVNATTGTAKLSGHAGDAAVE
EYLHRVRQHTKKKFVVGFGIRDKERVESMWKLADGAVVGTALLEQLAAST
TPQECAERAGTFWQSLR
>Cag_0382 Thiamine biosynthesis protein ThiC
MSKAPLFEKISVKGTLFPIEVSMKRLHLSHPYTCNGQEFSSLPMYDTSGL
HGDCRTQVDPHVGLPPLRSSWNFPRTVQLGVAQTQMHYARKGVITPEMEY
VAIRENQQLEEWIRSFNRHGKKVEPITPEFVRQEIAAGRAIIPANLKHPE
LEPMIIGRHFRVKINSNIGNSAMGSSIEEEVEKAVWSCRWGADTVMDLST
GANIHQTREWILRNSPVPIGTVPVYQALEKAGGKAENLTWELYRDTLIEQ
AEQGVDYFTIHAGILQEHLPYAERRLTGIVSRGGSIMATWCRHHQQENFL
YTHFNDICEILKSYDIAVSLGDALRPGSILDANDEAQFGELKVLGELTKQ
AWQHEVQVMIEGPGHVPLNLIEENMQKELELCYEAPFYTLGPLITDIAAG
YDHINSAIGGALLASLGCSMLCYVTPKEHLGLPDRNDVREGVIAHKVAAH
GADIARGNPTAWLRDTLMSQARYSFAWEDQFNLSLDPEKTRAVHSQSIAA
SGYSAPNPDFCTMCGPDFCSMKRTQKMGKGN
>Cag_1795 hypothetical protein
MAELPEASSSNKTGERMAELPESSTPSKPSRRARFFEEDNGSLSSMRLMS
FVALIAAILFGALTLTFDTSENNGTGLYITLMFLVSAFAPKAVQKFAEQK
LNER
>Cag_0849 ATPase
MLDIKNLNAGYGEMQVLRAIDVTINQGEIVSVVGSNGAGKSTLLKAISGL
IKSKGSMTWNHEDLHALKAHTIVQRGIVHVPEGRKIFPEMTVVENLLMGG
FLCKKDHQKNLEKVFAIFPKLAERRNQLGGTMSGGEQQMLAIGRGLMGNP
KLLLLDEPSLGLSPLFTEIVFKAIKTINSSGVTVMLVEQNVYQSLSISHH
GYVIETGKIVLSGTGEELLNNQDVKKAFLGM
>Cag_0783 Heat shock protein HslU
MSKNNEGDVVALPDVNHGRTTAIAPGNLTPSQIVSQLDKYIIGQKDAKKS
VAIALRNRLRRQSVSDELRDEIMPNNIIMIGPTGVGKTEIARRLAKLARA
PFVKVEASKFTEVGYVGRDVESMIRDLVDQSVALVRSEKSEEVREAAAIF
VEERILDILIPPIVANQHTNDEEQEEQQEEGQEEQSPSDQKKLAEMNRRT
RKKMVERLRSGKLEDRQIELDVSGNDSPGGMMQIFGPLGHMEEIGSLMQD
LMSGMPRKRKKRRVTIAEARKLLEQEEVQKLIDMESVIKEAINKVEQSGI
VFIDEIDKIASPTTGAGGKGPDVSREGVQRDLLPIVEGSNVATKHGMVKT
DHVLFIASGAFHLAKPSDLIPELQGRFPIRVELKSLTEDDFYLILTQPDN
ALIKQYSAMLATEGVTLHFTEGAIREIARIAAQVNENVENIGARRLHTIM
TNLLEELMFNIPENFSDDNVEIDEDMVRNKLNQVVADRDLSQYIL
>Cag_1098 hypothetical protein
MTRNVVHIYTREIDFNLELDIEFLGIRMLTGTAKELLWENNHTEKNDSPH
LAIVIQNQLELFESVTDGMAYSRPRLLIAFGVLSYFTQQIFTPFETYASS
SYVGKFDKECKNRFIFKETELIEDYIQFETIIKYHKDKEFIYSLLDRWRK
GLYMETESEDNMIYDDETLISYFHILELLTTKYEDKQKKELKDKIKDFSK
SIFEESFLFEGNQLKSEINAKSKIIEGLLLPDLSVSSKIFYIFKEQGILT
YRLKSFITNFVKDRNSVAHGRQVYQDRVIFPVPPFFPLIKNRDYPEEFYR
ILTAKAIANFIGVNLYSQEWDEMSQSIIPSFDELREFQREKKYLELTNQD
FCDGKENDITPFVVSHYLISKKLKAEEALEILTPFINNYNKTEDETMMSI
WAIIIILDLLEEGELKEKCIEIIKIAEKNNWHPNYFKMRDEMYKLEYFGF
EINGLKDLIRKKEIR
>Cag_1716 Chlorophyllide reductase subunit Z
MGTIIRDESTASAYWAAVNTFCILRDVHVIADAPVGCYNLAGVAVIDYTD
AIPYLENLTPTTLTEQEISSSGSTAVVRKAIEGLQGSGKHLILASSAESE
MIGSNHARMLSTHYPNVRFFASNSLGENEWQGRDRALAWLYEMFDNGKPA
AIQQSTVSIIGPTYGCFNSPADLAEIKRLIAGIGGNLRHVYPAESAAADV
ADLKNSDVVVVLYREFGYTLAEQLGRPVLQAPFGLAETEHFIKELGRLLH
REAEAADFLKREKESTLRPLWDLWRGPQAEWFPTIRFGVTATATYAQGLA
TFLGKEMGMQCLFSHNSATSDNSAIRQEIQQQQPQFLFGRIVDKIYLAEI
DAKTRFIPAAFPGPIVRRALGTPFMGHSGAIYLLQEIVNALYDTLFTFLP
LNRRTVVEEPAQKVAWSNEANALLHEMVKRAPFISQISFGRELKRKAEVM
ALKQGKECVTPDIFRLLQ
>Cag_1763 conserved hypothetical protein
MTKVVAPEEIQVNVTLRHSHNHSNVEEYARASVLSLTKVFAGIINAHVIL
DHQSNDFEKNKLAEITVHVPQHIFVGKEAAATYEQAIDACIYALNRQLLR
HKEKAQ
>Cag_1946 hypothetical protein
MSKRQSFIIGGVLIALVASLWSFWPTLQAEKVPDKPSVTASVPIDSTVCI
APTEYMRSHHMQILQDWKRTGARDPRPHTTPDGRKFQKSLNTCLGCHSTN
SYFCIMCHDYTHAKPNCWNCHVAPFK
>Cag_1045 hypothetical protein
MRKKALSIALASLLLLGTSLPIAAWLALPRYVEPLLQRALIGKPVQIAIK
DVRPSLHGVAFSSLQATITTPPDECNNYERTIYHVTIKNGTIGWLITDLS
ASHRSPFIPSLLDVKLHLQADTLHLQPTPNTFAFSDSQPEITVNLKLFRN
EKQVLSVVPLDAAYAIHDGTVTREQMRFEGIAYNVAVSSSNKWQQLPDSL
FVARMVNEGKVQPVGNFRAIVGSKGDPLHPCRITLSNCSAEIVDWNASSP
FVHFDRKTKAGDLTLCINDFPLQSLSSIALQAAQQQPKAPSRLAAKAPLP
PMVAGKINATIPLSFRDSTIVIRNASVIAKAGAKVVLYNKQQQPMLFVVA
NKSGMDERIVDKLYVTATLNHAGKTTQSVALQNLSATIFDGSIRSTPLTV
KTDGSSPLDVTVTFDNLKLFDHLILPDNEQSSFQGALSGKLPIRYAKNQL
TIRNASLLASEGTQVKLVTKEQKPLVTIIAGKKGGKETVLDKLNVRARFN
QTPNQTASITLQEFSTTLFGGSVNVTPLTFKTDASSPLVATVTLDKVKLF
EHLILPANLHGSLYGDLSGKVPLTYQNDQLSISNATLRSSGGGSFTLNNA
QQSSNNNLSRSDQQTTYAFSEPALTFSHLANGATTVDFTLNEFRQKSGSN
DFKFGNPKGTIHFAENPREPDVMRLSNFSTNFFGGKIALNEFVYDIKKQE
GETIVQLSNMPLQKLLDLQGTKKVYATGALKGNIPIKLKKGTVEIPDGAL
LAQESGQIIYATSPEERAAAHQSLRTTYEVLSNFLYQQLSTSLTMTPDGQ
STFAIRLKGTNPDMYGARPVELNLNVQQNLLDLMRTLSISSEIEQAISDK
TTQQQKK
>Cag_1645 regulatory protein RecX
MKNTPPENTLAAATGFALKLLGIRNHSHEELERKLRKKGFQAELCATVLE
QLVARGLLNDRTFGEEMLQSRSQRKPSGKLKMRAELLNRGIASDIADELL
SDYKSHELCHQAAAKKIASLRIADEAIKKRKVETFLRNRGFGWQEISTTL
HHFFPTTSTMDDDIE
>Cag_1000 conserved hypothetical protein
MGFNPRAREGRDFEPICLARKPLSVSIHAPAKGATLTGSHCVLSGDGFNP
RAREGRDLAFPCSISITCKFQSTRPRRARRSCSKRIWSALSFNPRAREGR
DPIVPRPFRGFSRFNPRAREGRDRLEGL
>Cag_0368 hypothetical protein
MHAKDVVSKDILKRIALDIARILLHLKVDHAELLETEHQRVEERRADVVV
LVQGESGRFILHLEIQNDNQANIAWRLLRYRSDIGLAHKGYDIKQYLIYI
GKAPLSMPTGIHQTGLDYRYHVIDMHSVDCQALLTQDTPDALVLAILCDF
KGRSEREVVRYIIQRLQELTAENESRYHDYMRMLEILSANRSLEKIIEEE
EAMLSVVDQTRLPSFRIGMRHGIEQGVQQGTLSLVKRQLTRRFGTLSYHH
VARLDKLNIEQLEELSDALLDFNTVTDFDVWLENRKN
>Cag_0853 branched-chain amino acid ABC transporter, periplasmic amino acid-binding protein, putative
MKISFISLVAGAMLAVPMVAQAAPIKIGFINSITGKEAQIGENLTNGATM
AVEDMKAKGYQFEVIREDDGGDPKNALAAYEKLVTRDGVIGVVGPYTSGS
ANVVASRADQYQVPLLVPAAAKEEITMKGYKNVFRLNAPANVYSKVLMDA
VKAYKPKTIAYIYAATDFGVSTVKTAQENAAKMGLTEVANEKYQQGQPDY
RPSLSKVKAANPDLVFLVSYDADAILLMRQAKELGLKPKAFLGAGAGYTT
AQFAQQTEISNLVVSATQWTDDVNWAGAADFSRRYKAKFAKEPSYHAACA
YEAMRIMIDTATRQKSAPKVRTALKAGSWNGIMGPVKFANYAGFTNQNNH
PMLVQQIQAGKYETVYPAQFSKKKLVYPFTR
>Cag_1069 conserved hypothetical protein
MSVTLMIFIYWAIAIAIGFIFFKKDILSFEPKFDGRRIGLLIASLLIIAL
NAWVYSHSTSDGGRSLDPLTLLVFSVGNGIAETFMFYAVFRFGTSLVGRF
TQNAVATFLVGFLCFVVYSGLIHGLFWINILPEHVVQTSPYKPFFMPVQM
LIAGSWALNFFWYRDIRTVIFLHGLVDLTMAWNVKFDMF
>Cag_1193 16S rRNA dimethylase
MINVEYKHTQVAAKKKLGQNFLTDRNITRKTVLLSGAKPDDQVVEIGPGF
GALTRELVEECHNLTVIEKDPTLATFIRNEYPQIKVIEGDVLTINFSAMA
QAGKPLQILGNIPYSITSPILFHLLEHRRAFRSATLMMQHEVALRLAAKP
ATKEYGILAVQMQAFCKVEYLFKVSRKVFKPQPKVESAVIKLTPHATDPA
LDADGFRRFVRIAFHQRRKTLLNNLKESYNLELVDSNKLQLRAEALSIEE
LLELFSLIKTKSE
>Cag_0208 tRNA-i(6)A37 modification enzyme MiaB
MTNANPDAFYIHTFGCQMNQADSGIMTAILQNEGYVAASNEADAGIVLLN
TCAVREHATERVGHLLQHLHGRKKRSKGRLLVGVTGCIPQYEREVLFKNY
PVVDFLAGPDTYRSLPLLIKQVQQAGKGATEAALAFNSAETYDGIEPVRS
SSMSAFVPVMRGCNNHCAYCVVPLTRGRERSHPKAAVLNEVRQLAEAGYR
EITLLGQNVNSYYDPLAQCNFAELLAAVSCAAPATRIRFTTSHPKDISEA
LVRTIAEHSNICNHIHLPVQSGSSRILRLMQRGHTIEEYLEKIALIRSLI
PNVTLSTDMIAGFCGETEADHQATLRLLEEVQFDSAFMFYYSPRPRTPAA
EKLTDDVPEALKKARLQEIIECQNRISASLFSQAVGSVVEVLAEAESRRS
SEQLMGRTAGNRTVVFARNGYQAGDVLHVRITGSTSATLLGEPLISTTL
>Cag_0129 succinate/fumarate oxidoreductase, putative
MKRYAYYLSCINESMTKEVDRSIDLWQKELGIELVKLHECTCCGGSNLDY
VSPKHFALVNARNIALAEKLGLDLVVSCNTCLMTIRSAKKKLDESPALKK
EVNDLLKKEGLEYRGTSEVRHLLWVLIDDVGLDTIRQKVKVPLKNYRIAP
FYGCHILRPSSVLGHDNPLEPTSLDLIIEALGGKTIPYKHKNRCCGFHTL
LVAEQESLNVAAEALQEAMEQKADFIVTPCPLCHTSLDGYQAKALKSAGI
NSSIPVFHLSEMIGLALGFNEKQLGIKRHIVS
>Cag_0092 conserved hypothetical protein
MKPLKQVGESYFLLSQGEKQIEGAAFEEAEQSYRLAMTMARTIPTEEAFD
YDGFDAIAHAGLSSALIGLGRYNEALVSVAEALRYFNRRGDLHSAEGSLW
IAVICNKARALESLGRKDEAIKYYRMAGEMIAEKKGEIKQRDLLTELIEQ
GLQRLEGAKPATAKQGYKAWWEFWS
>Cag_0202 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_0694 conserved hypothetical protein
MSFKEKIDSDLKVAMKSGDKNRLNAIRSIRAALLEKEVSIRVGGTAVLSE
DQELEVLVSLAKKRRDAIEQFIAGNRPDLAETEQLELQVLEEYLPAPVSD
DEVQAVIQEIITKSGAISMKDMGKVMGEAMKALKGKADGGKVQNMVKTLL
SA
>Cag_0152 hypothetical protein
MKQEQQHFLQEKLRECDRHVEKITIAQEHMRSVLPLTPQVYAQLDDVALS
FLDQIVFRYSKLQDTLGDKVFPLLLLATGEEVKRKTFLDILNRLEELELV
DRMTWLQLREARNEVTHDYSSEVGETVDAINAIIVASDTLQKLYSTIRHF
CNHRLQVL
>Cag_1002 conserved hypothetical protein
MCSTKQPTSFNPRAREGRDSTPTFLRLATKRFNPRAREGRDCVPCCVPFK
VIVSIHAPAKGATGKADEEFDAKIVSIHAPAKGATQRR
>Cag_0964 Sec-independent periplasmic protein translocase
MNFIEHLDEIRSRIIQSLIALVVVMALCATYVDFLVNEVLIGPLKRSSPT
LVLQNLVPYGQVSVYFQVVFFSGFILAFPFLVWQIWQFVAPGLHENERKA
GRFSILFVSLCFFAGIAFGYFVFLPVSLQFFSGFGSTLIQNNISIQDYIS
FFIGALLTAGLVFELPFISYILSKIGLLTPAFMRFYRKHAIVVLLMVAAL
VTPSTDLVTQLIIGVPMILLYEASILISAHVNRKNKALQAKA
>Cag_2029 Phosphoesterase, PA-phosphatase related
MTLIEQGDVWLFGVLNGTPKSLLLDELMVYLTTVKMSWHIILLVALFMVV
RRGKSALLLLLCVGLAVGLSDFIASGVLKPLVQRTRPCFALDNVRLLISQ
PRSYSFASSHAANSMAIAVVAWLFFSRGVVVEKLFTLVMFCYAVMVGISR
IYVGVHYPTDVLAGMVIGTCSALLCYLIFSWLAKNMPLRTTTSPPSQNDG
>Cag_1195 killer suppression protein HigA, putative
MKLRYSTRKLEKSVESFSVIKKNYGEWAKKVVLRLEQLSQAPNLAAMRTV
PSAHCHELKADKISELAVDISPNHRILFQLAHNPIPLKDDGGLDWREVTC
IIITAIGEDYH
>Cag_0843 conserved hypothetical protein
MSEPLEMGFVADDRLAGFRLQRLEIFNWGTFDGRVWTLKLGGKNGLLTGD
IGSGKSTVVDAVTTLLVPSQRIAYNKAAGADNKERTLRSYVLGYYKSERQ
ENLGGGAKPVALRDLNSYSVILGVFHNEGYDKTVTLAQLFWMKDAAQPAR
LYAAYEGDLSIATDFSNFGAEIATLRKRLRGAGVELFESFPPYGAWFRRR
FGIENEQALELFHQTVSLKSVGNLTDFVRLHMLEPFTVEPRIAALIHHFE
DLNRAHEAVLKAKRQIEMLAPLVADCDNHHAMAQRTEELRASRDCLRPWF
ASLKLELLEKRHTSLNEELSRHQIAIERLDGERRTLQGRDRELRRTIAEN
GGDRIESIAAEIRQHQQELDRRTQKSTRYKELLRQLGGQLGEHPATSAEE
FYQQRAEHAAMHESAAEAEVQVQNNLNEAGVLFTQGRQEYEQLSTEIKSL
KARMSNIDEKQIAMRHALCKALNLPEVEMPFAGELLQVREEETAWEGAIE
RVLRNFGLSLLVPDHHYPKVAEWVERTNLKGKLVYFRVRPQSRNEQLADH
PASLARKLAIKADSTYYDWIEREVSHRFDLICCTTQEEFRREKKAITPAG
QIKSPGERHEKDDRHRLDDRSRYVLGWSNAAKIAALEAKAKVQQRELTKL
AERISTLQQEQKALKERLTILSKLDEYSDFNDLDWQPLAVAIARLEKEKE
ELEKTSNILQTLTEQLALLEQELQKTEQQLDDRKDKRSKIEQKISSITEL
QQQTAALLAEAGDEVSNRFALLQAMRQEAFGDQSLTVESCDNREREMREW
LQNKIDSEDRKLSRLGEKIIRAMTEYKEEWKLETRDVDVNIAAGKEYRAM
FEQLQADDLPRFEGRFKELLNENTIREVANFQSQLARERETIKERIVRLN
ESLTQIDFNPGRYITLEAENSLDADIRDFQTELRACTEGTLTGSDDAQYS
EAKFLQVRRIIDRFQGREAYADLDRRWTAKVTDVRNWFVFAASERWREDD
IEHEHYADSGGKSGGQKEKLAYTVLAASLAYQFGLEWGAVRSRSFRFVVI
DEAFGRGSDESAQYGLQLFAQLNLQLLIVTPLQKIHIIEPFVASVGFVHN
QEGRCSVLRNLTIEEYRSEKERAIA
>Cag_0275 NAD-dependent DNA ligase
MTIIDASERIAQLRQEIERHNYLYFNEAKPELSDYEFDKLLEELMALERE
FPDLLTPDSPSQRVGGTITKEFPVVTHREPMLSLANTYSAGEVAEFYNRV
AKLLAAENVHKQEMVAELKFDGVAISLLYRDGVLVRGATRGDGVQGDDIT
PNIRTIASIPLRLHQPLAGEVEVRGEIFMRKEDFEQLNDNRPEEERFANP
RNATAGTLKLQDSAEVARRRMNFVAYYLKGLKDETLDHVSRLHKLEALGF
TTGGHYRRCKTIEEINTFINEWEEKRWKLPYETDGVVLKLNNVQLWEQLG
ATAKSPRWAIAYKYPAQQARTQLCNVVFQVGRIGTITPVAELTPVLLAGS
TVSRSTLHNFDEIERLSVMILDYVMIEKSGEVIPKVVRTLPDERPADAHA
IAIPTHCPECDTPLIKPENEVSWYCPNEEHCPAQIRGRLLHFASRNAMDI
KGLGDALVEQLVAWGLVHDVGDLYLLQEPQLERMERMGKKSAQNLIRALD
ESRTRSYDRLLYALGIRHVGRATARELAHAFPTLDALMQANEERLAEVPD
IGTTVAQSIVDFFAKPSSRQLVDKLREARLQLAASASKIEQVNRNFEGMS
LLFTGTLERYTRQQAAELVVERGGRVVESISKKTSLLVAGRDGGSKLDKA
HKLGVRVISEDEFMGMM
>Cag_0226 8-amino-7-oxononanoate synthase
METTKDLFKKCFEYTIADEVKARGVYPFFHPIDENEGPVVTFDGRKLVMA
GSNNYLGLTADPRVKASSINAINKYGTSCSGSRYMTGTVNLHIELEEKLA
DFFEKERCLLFSTGYQTGQGIIPTLVQRGEYIVSDRDNHASLVAASIMAQ
GAGANLVRYNHNNMADLERVLQSIPASAGRLIVSDGVFSVSGEIVDLPEL
VKLAKKYNARTVIDDAHAVGVIGRGGRGTPSEFGLVSEVDLIMGTFSKTF
GSLGGYVVGDRSVINYIKHNASSLIFSASPTPASVAAVLATLEIIRTQPE
LMERLIANTDYVRQALLKAGFTLMPSRTAIVTVLIKDQDKTLYFWKRLFD
AGVYVNAFIRPGVMPGHEALRTSFMATHEKEHLDKVIMEFCTIGRELGII
S
>Cag_0503 Acetylornithine and succinylornithine aminotransferase
MNPLPVTQETEQQLFFHNYARLPLAITHGEGVWLVSADGTRYLDMISGIG
VNALGYGDRRLVEAISAQASKLIHASNLFLLQPQFELAEKLLALSGMSKV
FFANSGTEAVEAAIKIARKWAGQLGKPNKKEVLSLSNCFHGRTYGAMSLT
AKAAYVDGFEPLVPNVGMVTFNDIADLEAKVSGNTAAVFIEIVQGEGGIH
KISQEFVDRLKELREQFNFLIVVDDIQAGCGRTGSFFSHTMVNAEPDLIC
LAKPLGGGLPLAAVLGSEKMAGVLGYGNHGTTFGGNPIACAAGVAMINAI
VEDKLMENATTVGRYIREGFEQLAKKHPQILDIRQYGLMIGVTVDKEAKP
YVMKALEKGVLVNATSVNVIRLLPPLTITCEEAQRCLTVLDEIFSAEK
>Cag_1186 hypothetical protein
MQSQKPYNKQLWWRLILIIGLLLAVLFFIMTPWNIEHLPSRPNPAKSYNE
ALARTEALRLAQAQPMNPRCELQLMTHKHKTEHAIVLVHGYTSCPRQFQA
LGKQFYNAGYNVLIAPLPHHGYANRLSREHGKLTAEELATYADRTIDIAH
GLGNKVVMMGLSAGGVTTAWAAQNRPDIERAIIISPAFGFKQIPLPFTAA
AMNLFSALPDEFEWWNPTLREHETPTYAYPQYSRHALTQILRLGFIAKFD
ALHHAPATKKIALVVNHNDTSVSNEAALQFIAVWQKKQNQTIAIIEFADS
LKLPHDLIDPEKVNQRTNVVYPRLLQITGGER
>Cag_0914 transcriptional regulator, XRE family
MAKTEIKSYSLAEMKDKYIGKLGTEERDQYEYELRMDVLGKMIKTARQER
NLTQEQLGKLVGVQKSQISKLESNTHSATIDTILKVFKALKADIHFNVTL
EKRYVQLV
>Cag_1504 TPR repeat
MVEPLVLTPSVDIVTQHPALIRQTAELSRLYKDKEVLVTDEVLQAIGSIL
WRLLDADEALANAKQRAGQHIVPLLLSSNDAAIQQLPWETIYHPDYGFLA
RHEGFTFSRTIPSVQKALPDAAKGPLRILLFSSLPDDLTEKEQLQIEVEQ
AAVLEALGEWRQSGHVVLEMPDDGRFSEFTQVLKSFKPHVVYLSGHGMFQ
HDALNHTTTGYFFFEDEVSGKSKAFSEAEIAAALTATAVQAVIVSACESG
KAASDSLVNGLTYRLLQQGIPHVIGMRESILDRAGIQFAQAFFSALMERH
GIAEALQQARNAIVLPLQEDEEFKDTVEASISLGQWCLPMLFSHQYNRAL
LDWEFTPQPMRAENRRNKSFKQIKSLPNRFIGRRRELRKWQRKFRSGKQN
ALLLTGAGGIGKTALSYKLIMGLKHDGYEVFCLSFRPEDNWRKYLTSEIP
FSLDEHRKNEFKDRIADNSDIVFQAECLFTLLLEQFNGKVAILFDNLESV
QDSVTRHLIDAELQQLIDMALALESDGLRMLLTSRYALPHWDNSLVYPLG
NPVYRDFLAVAQQQKLPKEFFKDDKGKRTYKRLRQAYEVLGGNFRALEFF
AAALQTMNAAEEQDFLNGLKSATEQIQTNMLLEKVWSYRNQEEQELLCAL
TAYQNAVALDGIKALNLPTMQQSEEFVRALVAVSLVEQYENKVWDVKEEF
LVAPLVRDWLQKQGVATLPIELLQRAARYQQWLLENERRTFEQATITHAA
LMAAGLNDEAHRVTLDWIVTPMNMAGLYQALLDSWLLPACYAVDKQILSE
TLGQTGKQYHHLAQYSTALDYLKRSLAIVEEIGDKSGEGTTLNNISQIYD
ARGDYDTALDYLKRSLAIVEEIGDKARVGAALNNISQIFKARGDYDTALD
YLKRSLNIRQEIGDKSGEGVTLNNISQIFKAWSDYDTALDYLKRSFAIRQ
EIGDKKGEGTTLDNIGKIYLAKGDYDTALDYLKRSFTITHEIGDKKGEGT
TLNNISQIFQARGDYDIALDYLKCSLVIQQEIGDKSGEGTTLNNISQIYD
ARGDYDTALDYLKRSLAIQQEIGDKSGEGTTLNNISQIYDARGDYDTALD
YLKRSLAIQQEIGDKSGEGTTLNNISQIYDARGDYDTALDYLKRSLAIRQ
EIGDKSGEGTTLNNISALYHARGDYDTALDYLKRSLAIAQEIGDKSGEGT
TLNNISALYHARGDYDTALDYLKRSLAIRQEIGDVAGLCATLINMGHIYL
QNNEIQDAVSAWVTAYTLARKIGYAQALDALENLAQQLGLPNGLAGWEML
ARQMGEVNSFNRE
>Cag_1951 DsrF protein
MENENVKKIMHVMRRAPHGTIYTYEGLEMILIMAAYEQDLSVVFIGDGIF
ALKKDQDTAGIGIKGFAKTFMALDGYDVEKLFVDRQSLEERGLSEDDLVV
DVEVLEAAEIGKLMSEQDVIIHH
>Cag_1988 conserved hypothetical protein
MMQWHCKAFLILRHSFNEQCNLFMPEIKPFSGVLYHPELLKQADKLICPP
YDIISSAQQQSLYHRSPLNAIRLELPLEENPYGTAAARLTQWLQSGELQR
DSEPAIYPYFQTFEDLEGKSHVRHGFFTAMRLHEFSENKVLRHEKTLSAP
KADRLNLFRATRTNISPIYGLYADEHRTLDQLMVAYSETHEPLLDANVQG
IRNRLWRITEPTLLEQFRQTLLNRQVYIADGHHRYDTGVTYRNERMAANP
THNGNEPYNFIFSCLTNIYDEGLIVFPLHRVLHSVADFNAERLKEQLAEF
FTITDLNSQDELKAYLAASTSSFSYGVVTSGALYGMTLKGEAAPLLDAQC
AHCPEAVAQLGVVVLHQVIFHKLLGISHEAMEAQRNLLYVTDVNEVFHAV
ACRTAQAGFVVKPTTVQQVLDVSESGEVMPQKSTFFYPKLMTGLLFNPLD
>Cag_1290 Cell shape determining protein MreB/Mrl
MNFFSNFFRDIAIDLGTANTLIFIRSKGVVLNEPSIVARERSSGKIVAIG
HEALLMHEKTHPGIETIRPLANGVIADYEATEELIKGLINKTKTQFSFGI
RRMVIGIPSGITEVEKRAVRDSAEHVGAREVYLVTEPMAAAIGIGLDVQE
PMGNMIVDIGGGTTEIAVISLGGIASGESLRVAGTDITNAIVRHFRKAYN
LAIGERTAEDVKIKIASAYKLDKELTMMVRGRNLVTALPEEREVNSATIR
EAIATPISQIITSVKKSLEVTKPELSADILDRGLFLAGGGALIKGLDKRI
NEETKLAVHISDDPLTAVARGTGVVLENLEKYRTVLLPNKQY
>Cag_1219 3-dehydroquinate synthase
MSTIIVKTPVTVGLQELFREHKLSKKSVVLFDSNTRKLFGDVVLEALRAE
GFQLVELVIPAREASKSFSVAYRLYGQMIEADVDRSWNLLAVGGGVVGDL
GGFIAASYFRGIPVIQMPTTLLAMTDSAIGGKVAINHPLGKNLIGFFHLP
ELVLIDPATLASLPRREIYAGLAEVVKYGFIADANFFDMLEAHFSEVATL
QEPYLTEAVKTSALIKQDVVTRDFRELDGLRATLNFGHTFAHGLEKLADY
RHLRHGEAVTMGLVCALYLSHRLGFLDAPSLERGLRLLQQFVFPKNVVER
YFLASNSALLLESMFSDKKKLDKKLRFVLLKELGQAFLFEESVEDSEVLA
AIESAKECFRQWQQK
>Cag_1033 hydrogenase/sulfur reductase, gamma subunit
MQSAEPNIYKPAIMKIVATHNEAPGVKTLRLEFQDPIEHERFKQLYRTGM
FGLYGIYGQGESTFCVASPETRNEYIECTFRQSGRVTSSLANCEVGDLIT
FRGPYGNRFPIEQFYGKNLLFIAGGIALPPTRSVIWSCLDQREKFGKVTI
VYGARTVADLVYKHELEEWQKRSDVDLVLTVDPGGESPDWKYKVGFVPTV
LEQAAPSPDNCIAVLCGPPIMIKFTLASLTKLGFKEENVYTTLENRMKCG
VGKCGRCNVGSVYICKEGPVFTAAEVSKMPQADL
>Cag_1334 Aspartate carbamoyltransferase
MKHLTGLCNCSAATIATLLELASDYKKELLKSNPQFQPTLCNKRIALVFF
ENSTRTRFSFELAARHLGASTLNFAAAASSVSKGETLSDTIKNLEAMQVD
GFVLRHPSSGAANFITTITSRPVVNAGDGANEHPTQALLDLFTLQEHFGR
LKGIRIIIIGDVLHSRVARSNIYGLLSVGAEVGLCSPVTLMPPDADQLGI
TMFTNLEDALAWADAAMVLRLQLERAAGGYLPSLEEYSVYYGLTDERLDR
IQKNLLVLHPGPINREIEISNNVADRIQPPGYSKSMLLEQVTNGVAVRMA
VLHTLLAENGK
>Cag_1625 Helix-hairpin-helix DNA-binding, class 1
MKWLNSLATKLSLTKAEITLITALLGFLLLGGVVKNFQDVEERTTLIKRA
EAARLDGAEVDSLLRLASLKEGDLSAEPVAEQAEEGEVAPSTKKKSKSAR
SEKKEFHGTVAFNKASAAQLQKIPGVGTVMAERMIAFRLLKGGKVSDMKE
LLEVKGIGAKKLEQLQPYLTLD
>Cag_0040 multidrug resistance protein A
MADNTTPPEGTNNGSKKSSSPLRMGIIGVLLVSALGVGGSYLYHSMLYEK
TDNAQIEGDIYPVISRIAGKVAEVRVRDNQMVARGDTLLTLDRSDELVRL
NMMAAELQSAQAAVSVAQSQAEAAHAAVSAAEATNRKEQSDLQRYSKLLK
QEVISPAEFDGVKARAEAAAAQLLALRSRYQAAQAELPLKQADALKRQAT
LQEAELKLSYTALTAPANGRVAKKNVHIGQYVAPGQQLIALVGNNDLWVV
ANFKETQLEHLVPNQAVIIEVDAYPGKEFSGTIESLSSGTGAKFALLPPD
NASGNFVKVTQRVPVKILFTEKLDKRYPLAAGMNVTVAVKVK
>Cag_0206 IscS protein
MKVYFDHNATTPLHPEVKKEMVAALDLFGNPSSMHAFGREARANVEDARR
RVGKFIGANEHEIVFVGSGSEANNTVLSLFVCTCSQCVPGARRNTIITTR
IEHPCILETSACLAHRGVKVKYLDVDRYGKIDLDQLASMLGDDVGLVSVM
MANNEIGTMQDVATIARMAHEAGAYMHTDAVQAIGKVPVDVTALGVDFLT
ISAHKIYGPKGVGALYVKKGIPYCPFIRGGHQEQGRRAGTENTLGILGMA
RAVEMRAVEMVEEKERLLRFKEMLRSGIESKIDAISFNGHPTDSLVNTLN
VSFEGVEGEAILLYLDLEGIAVSTGSACASGSLDPSHVLLATGVGAEMAH
GSIRISMGRSTTLQEVEYFLEVLPRTIQRIRNMSTAYVKGGLHAASR
>Cag_1401 conserved hypothetical protein
MKIYTKEELIQSLKIIAAQGWIENARHGNHGGIGNTLEDLLGIAENNLPI
PNAAEWELKAQRLNTSSLITLFHIEPSPRAIKFVSQVLLPNYGWKHQQAG
KKYPENEMSFRQTIHGLSTSDRGFQVNIDRKNQKVVISFDWNCVAEKHHK
WLQSVKNRIGLEQLNPQPYWGFDDLSNKAGTKLLNCFYVQAEVKKEAGKE
FYKFSKVMMLQKFNFDGFLSQIEQGNILVDFDARTGHNHGTKFRMRQNCL
PTLYEKMTIIV
>Cag_0906 lipoate-protein ligase A, putative
MTSLFSSVYCLNTGFRSGEANMQLDRELMAAFVDGRFQERFGSKSCLWRF
YGWQPHAITVGYNQKSSALDEAKCQAAGIDVVRRPTGGRAVLHAEEFTYS
FFADSPESNETLYQLVHEVIRQALAELQIEATFSRSTLSEQPSLNAGVGA
VSCFTASARYELQVHGRKLVGSAQRRSGNVLLQHGSLPISSTHKALSNFL
LLPENDAAAMQQALDEEMARKTTSLEELLGYQPSFNHMVELMQCAAGAIH
NVEAVALSEEEIVI
>Cag_0869 ATPase
MESKGIIIEGLVKHYGKGDTLVQALKGVDMHVAGGEVVGLVGPSGSGKST
LLKCLGAVLEPTAGRMTVGNEVIYDGSWRTHDLRALRRDSIGFVFQAPYL
IPFLDVVDNVALLPMLAGVANSEARQRAMELLQKLDVAHRAKAMPSQLSG
GEQQRVAIARALVNHPPVILADEPTAPLDSERALGVIALLNDMAQHYKTA
IIIVTHDEKIIPTFKRIYHIRDGRTHEEVGEGKLPHDLLQAQR
>Cag_1057 Cobalamin-5-phosphate synthase CobS
MLGGLVTALRTLTILPIPGKDAVTFSHSLYWFPFVGLLLGALLAALGYVG
SLSGWHEFAALLVVLGGIVLTRGMHADGLADVADGFWGGRSKEAALRIMK
DPTVGSFGALALSGVMLLKWVAVVRLLSFGLFDVVMAGILLARLVQVLLA
SALPYARREAGTASAFVAGAGAPHAFSALLFTLALLFPFYTENFPTMLWL
LGAALVAGSMVGMVSYRKIGGVTGDVLGAGSELTEVAVWITGALLLSDYL
LF
>Cag_0974 PhoU
MSLRPVHEKINDLSEGLVQLSDKVVGNLVKALQTVRNQDEQEAHQIRKSD
HEIDCAEVDLEERCLAFLALQQPVAGDLRTIVTIIKINDDLERIGDLAVH
VVDRMPEINPALLDMFAFERMGFQAADMVKKSIDAFVLKDKVLADNVCAL
DEEIDVMHRAVFKKVAALMKSPDSDVDQLIAALSISRYLERMADHASRIA
HEVIFLATGEIVRHKSKAYEKLVESLKN
>Cag_1447 SMF protein
MDILNFLMLSQVPGIGAARIKALLTHWGNLSFLQHATIADLTHINGIGET
LATELYNTFHNAAKNDTVRRAAEAQLLALERCNGQVLTLLDEGYPPLLRE
IYDPPPCLFIRGTLPPNTEKSLAVVGTRHASAYGKQVTTHFCHAIAKQEM
PIISGLAYGIDMAAHQAALDAGGTTVAVLASGIDTIYTDPKGLLWPKILE
HGAIVSEEWIGSHITPAKFPKRNRIISGIAKGTLVVESDLKGGALITATT
ALEQNREVFAVPGSIFSHTSRGTNKLIQQGQAKAIMEVDDILMELQPSQP
HQAKPIHPTKATANATTTTATTQLPLLNPLESQIYQALSSSDPTHIDTLA
ATLQLDLSTLFLHLFELELQGVIEQQPGQLFLRKA
>Cag_0066 ATP synthase F0, subunit B
MLTSGTILLAGSLLSPNPGLIFWTAITFVIVLLILKKIAWGPIIGALEER
EKGIQSSIDRAEKAKDEAEAILRKNRELLSKADAESEKIVREGKEYAEKL
RTDITEKAHTEAAKMIASAKEEIEQEKRRALDVLRNEVADLAVLGAERII
RESLNADMQKKVVASMIQDMSSKRN
>Cag_0761 conserved hypothetical protein
MDNNKKLQQLFENDPLGLLDVKPSNSSARNENERLVASFQEINEFFEQNK
REPKADNGIQEHQLYSRLKSIRENPTKSEILISHDIHELLNTKPKAPVSI
DDIIENDPLGLLDDDTAGLFELKNIKPNEKSRAETDFVARRKPCKDFDKY
EQLFKTVQKDLKEGKRKLINFKLGNLRQGSYYIHNGILFFVEKIEITKKD
HYKPDGTRVREDGRTRSIFENGTESNMLKRSIEKILYENGQVVTEHSDQS
NLNYVESLFAITDEDKEAGFIYILTSKSEKKEIKEIDNLYKIGFCKTTVE
ERVKHASQEPTYLMADVRIIKAYQCYNMNPQKLEQLLHNFFGNSCLNIDV
SDKEGNRHTPREWFIAPLGIIEQAINFIISGDIIYYRYDAINQEIVEK
>Cag_0069 Probable nicotinate-nucleotide adenylyltransferase
MGGSFDPPHNGHLALALAARELLNVECLFLSPSRNPFKGESLLDDVHRIQ
LVELLAKEVNRTGSGCEVCRWEIEQAAPSYTVELISYLTQSYPTWRFTLI
LGEDNFHSFHLWKEYQEILRLCHVAVFRRSSEAVVPSLDEAMLVQEGVSF
YNFDAPLSSTDIRKQLRAGLPVNGLLPASILRYIEQEGLYR
>Cag_0108 YajC
MQSLILSLLLFAPPAAGQGNPNPFIQLVPLVLIFVVFYFFMIRPQQKKQK
ERETTLDSLKRGDHVVTIGGVHGTVAGIDTEKKTVLVQVSDNTKIKFDRT
AIATIDKQETGDKLPGKE
>Cag_0220 chlorosome envelope protein C
MNVPISRENNMSESYQKLRKDFKELEFTDRLTFLAESVLLTGQGAVVGGL
NFAGSLVDTVTGTVGSVIEATGISKLLGNTSGVVGETIDRVAITVKDASR
AAGELYVNAVETVENATSNAATAVGDAGVSASEVVKNFAGSFQKVTIKK
>Cag_1063 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_0803 16S rRNA processing protein RimM
MELFLTGKILKPRGLKGEVKVELITDFPHHFVKRKQFFVGVSPSSVVCMQ
VTKAALHGGFAWLFFEGIDTFEQATTLVGFSLFITEEELAPQPENHVYRH
EIVGMRVVDGNGKLFGTVTDIFNMPAHDVYEVQVEGKTVLLPAIEAFVES
FELASKTIVMSRCWEFL
>Cag_1005 Protein of unknown function DUF48
MKKFLNTLYVTSQGAYLSKEGECAVISIEKEVKTRIPLHMLDGIICFGAV
TCSPFLLGHCAEQGVTVTFLTQYGKYLCQVQGATRGNILLRRAQYRIADN
EAQSAALSRSFVIGKIGNARITLARTLRDHPDKVDALRLKQAQHHLAECI
QHLQHETNQERIRGIEGEAAKAYFEVFNECITSPDSHFQFKGRSRRPPLD
RVNCLLSFFYTLLTHDVRSALEACGLDPAAGFLHKDRPGRPSLALDMVEE
FRSYIADRLTLTLINRGQIHANDFTVSETGAVLLKDDARKKLLTAWQERK
QEVIEHPFVKEKMEVGLLWHMQAMLLARHIRGDLDVYPPFVWK
>Cag_1414 Protein of unknown function 661
MKILFGVQGTGNGHISRSRELVRALKEAGHELEVIISGRKEEELKEIEIF
KPYRVLKGMTLVTQKGRLNYVDTMVQLDFVRLVADIVTLDTEGVDLIVTD
FEPITSLTAKLKNIPSVGFGHQYAFRYDIPVAPGSFFEKYALLNFAPAHY
NAGLHWHHFSQPIFPPVIPETLYAKHHVAVISNKVLVYLPFEEVEDITTF
LTPFTDFEFFIYGKVQEGSDHEHLHYRTYSREGFLADLMECTGVVCNAGF
ELPGEALHLGKKMLLRPLDGQIEQQSNALGMVELGYGMAMESLDPTILAD
WLQQPCREPLRYARTVNYIAEWISYRHWDELGKYTAKAWVDHA
>Cag_1182 capsular polysaccharide biosynthesis protein I
MNVLVTGAAGFIGSTLCKRLLERGDRVTGIDNLNDYYDVSLKEARLAQLQ
PYENFTFVKGDLADRAGMEALFAKGEFEGVVNLAAQAGVRYSIENPHSYV
ESNIVGFLHILEGCRHHGVKHLVYASSSSVYGANETMPFSVHDNVDHPLS
LYAASKKANELMAHTYSHLYNIPTTGLRFFTVYGPWGRPDMALFLFTDAI
LKNKPIKVFNYGKHRRDFTYIDDIVEGVIRTLDHTATPNPAWSGATPDPG
SSKAPWRVYNIGNSQPVELMDYIQALENELGRTAIKEFLPLQPGDVPDTY
ADVDQLIEDVHYKPQTSVPEGVKRFVAWYKEYYGVKG
>Cag_1560 VCBS
MRCHLQSIKKFPTLTLYNLHAFFHINFLTIMAINTPSSLVRRELFIFDAS
VSNVATLASALPANSDYFVLDSTRDGLGQMADVLAGQTDIDALHIFSHGS
AGLLRLGNSSLSLANLNNYELPLSQIGSSLSPSSDILLYGCNVGAGDDGQ
QFVATLAELTGADVAASADVTGSAALGGDWELEVESGVVENEPMAVAGFE
GVLANSIPTISAPLATTVAAGDSPVSLNLLANAHDDDLTDTLSVGNVSYT
VNGVPSALPAGITMNGNALTIDSANTAYNGMAQGEQKTIVVTYKLLDSYA
NAEISFATKVDYAVGSGPVGTTSADVNGDGELDLIVANFQSDTVSVLKNN
GDGIFATKVDYPTGSCPQSVTSSDVNGDGKLDLIATNWGSDTVSVLENNG
EGTFATKVDYATGYSPWPVTSADVNGDGKFDLIVANFYSNTVSVLKNNGD
GTFVTKVDYPTGLSPLSVTSADVNGDSELDLIVANMYSDTISVLNNNGDG
TFATQVDYPTGSFPYSVTSSDVNGDGKLDLIVVNYYSNTVSVLNNNGDGT
FATQVDYPTGTWPSSVTSADVNGDGKLDLIVANAQSMVSVLKNNGDGTFA
TKVDYPTGLSPYSVTSSDVNGDGKPDLIIANRDSATVSVLINNSTGFSSV
YPTTTATITITGANDAPIVDVTDVIGTVTKPVPPIGNLTDSGTIHFTDVD
LSDSHSISSVTPSAGVLGTLTSTITADTTGTGLGGAITWHYSVAASAVEY
LAEGEHKVETFTFSLLDGHGGSVERTVGVTITSPNADTILPTLSNSTPAD
AATAVAVDSNITLTFSENVQASTGNIVITNGSDTRTIDVTDNTQVTFSGN
TVTINPTADLQAGHHYHVEMANGAITDEAGNAFAGISDVTALDFTTKGNV
APHIFAPVSLSFADKIDYVTGAQPNSVTAADINSDGNVDLIVANWGGNTV
SVLNNNGDGTFANKVDYTTGSGVISVTNADVDGDGSVDLIFANSISNTIS
VLKNNGDGIFSPKVDYSVGKNPWSIISTDIDNDGMPDLIVGTNGHDMPWD
MGLAQICNAISVFKNNGDGSFASKVDYAIENAFFSVASADVNGDGQTDLI
GANWTTGGLSILQNNGDGSFASKVDYAIPNSFYSVASTDLNSDGKPDIFG
SNLAVNGVSILQNNGDGTFASKVDYATGSNPWIVNSCDINGDGFSDISVV
NTGSNTVSVLINKGDGTFLDKKDYSTGNMPFGLSSADLNKDGKSDLIVVN
SSDSNTVSVFLNSTSSVLTHFTEQTPVAVCSDISSSDPDGDASWNGGALT
IQVTANAEATDTLSIATTNPGGNGVWLDTTIGYKLMAGTTEIGSANAALV
SNGSTLSFSFNANSTNAMVQDVARSVTFNNSSDTPSELERTVTFTVTDNF
GASASVEQIITVTAVNDPIDSISPVLTSSTPSDNTTGVAVSSNITLTFSE
NVHAGTGNITITDGTDTHTIAVANTTQITFNGKTVIINPTKNLQEGHHYH
VEVANGAIKDLAGNAFAGINDATTLNFTTVKSDSDHHDLTGTITFWKTGQ
ALSDVHVNLMPTASTGTAHLIDFRNIELHADGSRTVEIWKTATPTEAENV
DFELELQAGSTATWQSMLPNGWLSADGADGNLFSVQAAGLNAPLSANAVQ
LGVLTLTQPTDTQTFSLSLVDGIVGTQEATPFTLHSEQTISDAAGNYFFT
NLQESNYTISANKEYNTLHNAVTSADALAALKIAVGLTPNEDGSAILPYQ
YLAADVTHDGRVRSTDALTILKMAVGYEGAPENAWIFVAEEDVLATSMSR
KAVDWSIEKIDVPLENDTQVDLIGIVKGDIDGSWGMVG
>Cag_0548 DNA primase
MSMIPPAIIDEVRQAADIVDVVSDYVALQPSGRNYKALSPFTQEKTPSFI
VSPDKQIYKCFSTGKAGNVFSFIMEMEKVPFMEALKLVAQRAGIDISRYT
EPKGKQEGEEEQGSGAALRWAARMFHSLLKQPAGAEGWRYFVEERGLREE
TINRFGLGYAPESWDFLLREARREGIKSEQLVELGLLVSHREKQSLYDSF
RHRVIFPIFSRGGQVVGFGGRALVSDERSPKYLNSPESAMFAKSKLLYGL
HFAKNEIRRQERAILVEGYMDVLALHQAGLTNAVASCGTALTRYQAKMLR
HYSEHVLFVYDADKAGQKSMMSGIDILVSEQMVPQVLMLPEGDDPDSFVR
REGRQGFLQYAESHTMGFQDFQLAFFEAAGAFSTPEQKAEALRVMVRTIA
LIPKRAQRELYAQELSKKVGLTVTALRELLGNATSAVAKQQSCTPSKASA
TAPTSSSATNATSAPTIPHAPNLPNAQALPPLSVLEKTFLKALLESTQYG
TAVLGFAASHQSMLELRHPLAQEIFAHLIHRYHNIAADPEATIDMVSEIS
AFTNPETRDLASTLLLDPPISPKWQQQNDLFSEQARRCLAMFLDAFKNLV
LEPLLDEKNKLMEQIRVEENVEREIELSRQKIVLDKKIREENRSLQQMIK
AILDSTQQVG
>Cag_1662 3-oxoacyl-(acyl-carrier-protein) reductase
MLTGKIAVVTGAARGIGQAIATNLAARGADIVLCDIKAEWLTETADKVEA
IGSKAFCVELDVSNAASVQEVFNKIAEETGRIDILVNNAGITRDGLLMRM
SEEDWDAVLTVNLKGTFACTKAVSRIMMKQRSGSIINIASIIGLMGNAGQ
ANYAASKGGVISFTKSIAKELSSRNIRVNAVAPGFIASKMTDALSEEVKG
KMLEAIPLARFGEPEDVANVVSFLAGDESSYITGEVINISGGMVM
>Cag_0793 3-beta hydroxysteroid dehydrogenase/isomerase family protein
MSKNILVTGATGFIGSNLVRKLVTTTEHRISILVRKNSDISALADVRDRI
QLVYGDITQRSSLDAAMQGVHHVYHSAGLTYMGDKKNSLLYKINVDGTHN
MLDAAIAAHVDRFIHVSSITAVGIAFDKKPVNEATPWNFHALGLEYARTK
HLSEKEVAKAIQRGLDCVIVNPAFVFGAGDINFNAGRIIKDIYNRRLPFY
PLGGICVVDVEIVSDAIITAMAKGKTGERYIIGGDNVSYKQLSDTISRVT
GAPKVLLPLPFWTARILYSLLDLYHNRRRLSKLFNLSMFKVASHFLYFDS
SKAARELNMRYEPHEQSIRNAYEWYRERNML
>Cag_1669 6-phosphogluconate dehydrogenase related protein
MNIGFIGLGKMGFNMVLHLLENAHSLTVFDLSRDAMEAACRHGATAATSV
EDVAMQLSAPRLIWLMLPAGSVVDSTIERLLPNLTAGDCIIDGGNSHFHD
TMRRAAQLREKGIALLDVGTSGGLDGARNGACMMVGGEPSAYGMAEPLFR
ALCVENGYAYTGASGAGHYAKMVHNGIEYGMMQAIGEGFHLLESSEFNFH
LPDVAKVWAHGSVIRSWLMDLMVQALEKDPTLSYLSGAIADSGEGRWTVE
AALEQNVATPVIAAALFARYRSRSEGNLSDRTVAALRHEFGGHGFSEKK
>Cag_0752 Ribonuclease R
MAKKRSRRKSKDSAYKGKKPSTRVPQGDRVSPNDHLLINEFLFKRGETSL
AAEIVTFFADHEGERFRSVELARLLGYTESRQLPGFWYVLHKLHEDGTLD
KDSNRSYGMVGIDDASYQDHLVHVKPFPLLGKKQYKAGELYSGRLITHPN
GYGFLEVEGFDDDIFIKAGDLGAALHGDEVEVLVSSTPETYTAKASPHQR
CEGVVQRVINRHLTMIVGTLTKVARKFQLKPDDRKILPEIFVALKHARKA
KDGQKVLVGELDFTQLGEINGKVVEILGTAGDSAVEVSAIARSKGIDETF
DDALLQQVQSVSDTISEADIKDRLDIRDKVVFTIDPIDAKDFDDALSVEM
LEDGLCRVGVHIADVSHYVQENSPLDREAQKRATSVYLVDRVIPMLPARL
SEQICSLNPGVDRMAYSIFFTLTPEAEVRNYEFHKTVIHSKRRFTYEDVE
EILHKGEGDYLQELQLLDRYSTILREKRFRDGGLDFETEEVRFKLGKEGE
PLEVIKKERLGSHRLIEEFMLLANRKVAKHLTKTFRENRKTPLPVLYRVH
GSPQEEKVVMVANFVKRLGYDLKINRSKEGIIVSAQSLRQLLQQVKGTNV
EFLVSELVLRCMSKAVYTGDNIGHYGLGFEHYTHFTSPIRRYPDLLIHRM
LFEYERMRKKRRKMSETRLGELETTMQRVCQISNEREKSAAEAERESIKL
KQVEYMANHVGKSYAGIISGATDYGIYVRMVDFAIEGMVHMRNLTDDYYE
YDEKTYSLIGKRRKRRLQIGQRVTVKINNVDIHRRTIDLVMELE
>Cag_1602 bacterioferritin comigratory protein, thiol peroxidase, putative
MATSLLSVGTAAPEFSALDQNGQSVSLTEYRGRKVILYFYPKDDTPGCTK
EACAFRDNFPTFNAIGVEVLGVSIDDGAKHKKFVEKYQLPFRLVADPDKT
IVQAYGVWGLKKFMGKEYMGTNRVTYLINEEGVIEKVWPKVSPEKHAQEL
IAYLQPSA
>Cag_1024 conserved hypothetical protein
MKPLPVGIQTFSKIIEDNYLYIDKTDIAKSIIEKYQYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKDLLIYNQWNWAVTYPVIKISFSGGIHSKADLEE
DLIQILKANEKRLDLKCENRSKAKYFFAELIQQASEKYQQSVVILIDEYD
KPILDNIENIPEALIIRDGMRDFYTKIKESDEYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNSDFGNICGYTQNDVDTTFAPYFEGVDMEEVKRWYNGY
NFLGDKVYNPFDILLFIKNQKMFRNYWFETGTPRFLIELIKKNNYFVPNL
NKLRINESLANSFNLENLNLETILFQAGYLTIKRLISTNKGVSYELGFPN
KEVQISFNDYLLQELTTVSENELICDDLFELFNNGDIANLEPVIKRLFVS
IAYNNFTNNYIESYEGFYASVLYAYFASLGFDMIAEDITNKGRIDLILKT
FDKTYIFEFKVIAEEPLEQIKKMKYYEKYDGERYLIGIVFDPKARNVSQF
AWERV
>Cag_0920 conserved hypothetical protein
MKLLWYLATRFALKQRSSSKPTFVVVLAVAGIAAGTAALLLTLAIVNGFA
ATISQKLVSFNAHLQLRHSSQEFFQEERSERLQLTSHPEITHFTPFLEQR
FVLRCRNSAQQGRWSSKAVLVKAMPSAERQNFIKRYSVRGEEKECRVAEG
MVGIYLGRSLAEELNCKVGQKVMLIRDSNQASQRLLADATSLPELLASLA
IEPAVVCGIYSTGLQEGFDDYLLLADLSALEQSRKGFISGYEANVRHIER
LPTTVQELTALTNNRLYGYTLFQRYANLFEWIKLQQNIMPLLIVTITVVA
VFNIMATLLVLIIEKTGEIGMLSALGLAPQRIQGLFMLQALLLSITGITL
GNLLALGLALFEQHYHLIRLPEKSYFITEVQLLLNPMDNVVVSLSVLALC
LLFAWLPSTIAASLKPARALDA
>Cag_1864 conserved hypothetical protein
MNTAIVNPLFGEIRELINQSRRQVAVEVNSAITMLYWQIGKRINEEILQN
KRAVYGKEVIVTLSRELTTEYGNGWSTKHLRHCLRLAEMFPDFEIVSTLW
RQFSWSHIKELMYIEEPLKRAFYLEICKLEKWSVRTLRERINAMLYERTA
ISKKPELTIQNDLELLKNEQRLNPDLVFRDPYFLDFLGLSDTYSEKELES
AILVELQKFIIELGSDFAFMARQKRITIDHDDYYLDLLFFHRRLKCLVAI
ELKLGKFEASFKGQMELYLRWLEKNEMVEGENPPIGLILCAYKNQEHIEL
LQLENSNIRVAEYLTKLPDLKLLEQKLHHAIEIARHKSEFDN
>Cag_0458 Folylpolyglutamate synthetase
MTYQEALNFLYPLHRFGIKPGLERVQALLQTHGNPHKRLGRVVHLAGTNG
KGTTAAALAAMFQASGYKTALYTSPHLVDFTERIRINGQPISQEFVAHYC
AIMQSTIQATNATFFEATTALAFSWFADEAVEVAVIETGLGGRLDATNVV
EPEFVVIPTIGRDHVEWLGVTLPAIAAEKAAIIKQGCSVFTAATQPEVLA
VIEQQAQACNAALFLAGRDVHYEVVASEPGLLGLHVQTATQRYAELYLPL
TGTFHAATIALSVQVAECAGLSAHIIKHGLQQLLQTGYRARLEFVNNAPA
IFLDVSHNPDGMKATVDALLTYRERYKRTFVLLGLASDKDALAVIRELQR
LNPLLVAVNIPSERSVAAEQLGALCQQEGIEFIIQGDSVAGLRFIEQQAG
ERDMVLITGSFFLAGELLAHGFFKAVMQ
>Cag_1008 CRISPR-associated protein, CT1133
MSWMQRLCETYDNAHNKVGDYNDDAILLPLYHTTMTCNIEVTLNENGEFV
QAKPSEKKKIIIIPCTESSAGRSGDSPKAHPLSDKLQYIAGDFSHYGGEV
TSGFKNDPEEPFRQLYQQLTEWSEASPDKYKLRAVKRYLEKKRLIEDLIE
AGVLHLDEDRKLLKKWDSKGKKKTDKPPIFESITNVNDATVGWHVEKKGE
PTEPLWKDKDIHKAWQAYYESRKINPKLCFISGRSDVAPAEQHPKKILQG
ASNAKLISSNDKKGFTFRGRFTTAEEACTISAVASQKMHNALSWLVERQG
YNKGTLNIVAWAVSGGNIPDPMKETAPYDYDDLGDDYNAAQAFGLAFKKR
IAGYRARISSTDSILVLAFDAATSGRASLTYYRELTGSDFLDRLERWYAR
HEWLYSKPEKKRGFFIQVRSPEQIAKDIYEHNKTEDKEKKTDDIIRSVVQ
RLLPCIIDGQKVPFDLVVAARNRASKPMSFKKYKEDWKNREDWENTLSTA
CALFRGYYYTNFQEEYSMSLDPNRTTRDYLYGRLLAVAESLEKSALGLAD
EGRSSTAERYMQQFAERPFNTWKTIELSLSPYIARLQSNAPGLKKFYTDK
LDEIHCLFNPDDFENNEPLTGEFLLGYHCQRLKNYEGSSKKD
>Cag_0721 conserved hypothetical protein
MSNEFDKQLFPNLVQIIEQGKKQLAVQVNSTIVLTYWQVGKTINEHILNN
ERAGYAKEIVATVATQLVEQFGKSFETKNLYRMMQFAELFHDFEIVVPLA
RQLSWSHFLALLPLKSNDARIFYAQKAIEANWGKRELRHQIDRKAYERQE
IVNTQLQNTSEFTDATGVFKDPYFLDFLGLKDGYLEKDLESAIIKELENF
ILELGKGFTFVERQKRMIIDGEDFYLDLLFYHRKLQRLVAIELKYGKFKA
SYKGQMELYLKWLDKYERHDNENSPIGLILCAGKSNEQVELLEMHKDGIM
VADYWTELPSKAQLENKLHQLLIEARNRIEQRKALEE
>Cag_0060 4-hydroxybutyrate coenzyme A transferase
MALPLLTAEEAVTAIKSGDRVFLHTAAATPQRLIEAMVARASELRHVELV
SLHTEGDAPYVQPCYRESFRLNGLFVGRNVRSAVQQGDADAIPVFLSDVP
LLFLNGVLPLDVALVHVSPPDRHGYCSLGVSVDASLAAVHSAKLVIAQVN
PNMPRTHGESLLHISKIHAMVEVNDPLPETSRHPLTEAEQRIGQHIASIV
EDGATLQMGIGAIPDATLAALTNHKDLGIHSEMFSDGVIDLVERGVINGR
KKHTHPEIIVASFLVGTKRLYDFVDDNPLVAMMPSDYVNHTNEISKNPKV
TAINSALEIDMSGQVCADSIGQKFYSGVGGQMDFIRGAALSRGGKPIIAL
PSTTRRGISRIVPQLKLGAGVVTTRAHVHYVVTEYGVANLHGKNLRQRAE
ALAAIAHPDFREELCRQAYAVYGAKSCETL
>Cag_0531 Hydrogenase expression/formation protein (HUPF/HYPC)
MCLAIPGKLIERIEAHGEHGLAMGVIDINGAAVRACLAYVPDIAVGQYTI
VHAGFALKILDEEEAMESLKLWQELSDKGAFQPLDDANSSTQSCL
>Cag_1775 metal dependent phosphohydrolase
MIAQIDKEHYQKIHEILALARRNLKNVDESLIQRAFFMCYRAHEGEKRAS
GEPFFYHPVEVATILLNELPLDGVSVAAALLHDVIEDSGYTYDDIVAELG
VEVADIVEGLTKISGITINRETTQAEGFRKMLLSMVKDIRVILIKFCDRL
HNMRTLESLPEHRRLKIALETRDIFAPLAHRFGLGKMKVEFENLALKYID
PEMYRFLEQKLRMNKGDRVSYLSKMIQPIKSDLEQQGFKVEVQGRAKHLF
SIYNKMRNKNKSFEDIHDLYGIRIIIDTERIADCFSVYGFITQQYPPLPQ
YFKDYISIPKHNGYQSLHSAIIGPKGNMVEIQIRTRRMHEFAELGVAAHW
RYKEKISQDDAHVDSFLRWARELIKDADSATEFMEGFKLNLYHEEIYVFT
PKGDMKTLPAGATPIDFAYAIHTEVGHGCIGAKVNGKIVRLNAILKSGDR
VEIITSKTSKPKADWLNIVVTQRAQMKIRSAINEDRRLQIEKGRNIWDKL
VSNTKKLLTDNEIVQQAKKYGIKTPSDFFSALANQEINGEEVLAQASAAK
KAEQLTKSVMDEARELDEYLQAARQEAEPEQQKIASRRDEVVIAGMSNIA
YSYAKCCQPVPGDDVIGFVSTDGVVRIHRKNCINVSNDSLVKSERVVAVA
WNRKIDTEFLAGIRIIGEDRVGIMNQITTVISKLDVNIRTISLNANDGMF
VGIMTIFVRNADKLSSLMEKLKKVQGVFTVERLIS
>Cag_0576 conserved hypothetical protein
MKTITVQLPDEVHKGMTMVAATQHISISKLYEYISQNMLRGYAAEVRFRE
RVAQGSRKKGLAVLDKLDECYGE
>Cag_1593 glycosyl transferase
MKPLPLTIIIPTYNEEDGIRNSIEQLLKLIEQEDGVEIIVSDASSDATLS
IVQQLPVRWCQSQKGRAQQMNHAARLASGNILYFLHADTLPPKGFIADIR
QAVQDGKQAGCFQMRFDDEHPLMQFFGWCTRFPALICRGGDQSLFITREL
FEKIGGFTETMELMEDYEIIQRIEAYTSLHILEKCVTTSARKYHQNGILP
LQYHFGMLHLMYASGVSQRDLVAYYHANIR
>Cag_1167 chaperonin, 10 kDa
MSHLVNITDKFIVVGDRVLIKPKSVSERTKAGIYLPPGVQEKEKIQSGYI
IKTGPGYPIGPPPESDEPWKERTNSAQYIPLQAQVGDLAIFLQNSAHELE
YEEQRYIIVPNAAILLLIRDDDDLGYYLT
>Cag_1075 carbon-nitrogen hydrolase family protein
MQNATLRIAQIDCTLANFQENLATHCTLIEAAIADGMDAIAFPELSLTGY
NLQDAAQDIAMHINDERFAPLCELSRHITIICGGVELSNEYGVYNAAFMF
EDGRGETIHRKIYLPTYGMFEELRYFSAGKQIRAITSRRLGRIGVAICED
FWHISVPYLLAHQGAQLLLVLMSSPMRLKPGSGEPAIVQQWRSIAATCSF
LFSGYVACVNRVGNEDSFTFWGNSSVTNPEGTIIGAAPLMQPHMLDVSLE
AAAIKRARLHSSHFLDEDVRLLSSGLREIM
>Cag_1431 conserved hypothetical protein
MGCITALTLSSALHAASSSPQKIGILWWNVENLFDTQDDPAKRDEDFTPN
GKLQWSEKKLYLKQMRIRDLLGALAADKQMGSLPDIIGFAEVENKTVFEQ
TLQGVKTGSYKSVYYNSRDPRGIDVALAYNSATLKLQHSKAYSVPLKHPT
RPVVVASFMVGRHPLHLLLNHWPSRAFDAELSEPNRIAAATIARHIVDSL
LTANPKADVVVMGDMNDEATNRSLANTLGSSMDGVQVKAAKGKLLYNCWS
GYNGIGSYYYRSKWQKIDHMLLTHGMLDRTGFYVTKEAFRCIDYPALLKS
SGKGTWSTYEKRVYKGGYADHLPLYLKVSVE
>Cag_1913 hypothetical protein
MYQTLINRGALKYISSIERYAMEKGWSEGMERGKEQGKAEGLEEGLLRGR
LEVAERLVASGMSKSEAAVLAGVSVDMLE
>Cag_0898 Dihydroorotase multifunctional complex type
MNYIFQNAHLLNPLEKLDAVGTLTVTSDGTIAAVTLGNEAPPITADDQLI
NCEGKMIVPGLFDMHCHFREPGQEYKETLESGAEAALAGGFTGVALMPNT
RPVIDSPLGVAYIRHHSTTLPVDLEVIGAMTVESKGEHLAPYGKFSSYGV
TAISDDGAAIQSSQMMRLALEYASNFDLLIIQHCEDRSLSAGGVMNEGLY
STRLGLKGIPEVAEAITLGRDLMLLRYLEEHKLHTPLRRPRYHVAHISTR
QAIELVRQAKMEGLQVTCEITPHHFTLCDQELFEAERKGNFIMKPPLASQ
ATREHLLAALADGTIDAIATDHAPHALHEKECPPDQASFGIIGLETSLAL
TITELVQKEVISMARAIELLSVNPRAIMRLKPIRFAAGEAANFTIIDPNA
EWVVTAEHIRSKSSNTPFIGRTLRGKSLGTFHKGALRMTVEE
>Cag_0837 hydrolase, alpha/beta hydrolase fold family
MNKVDNNNHSEWPAEAISQFATINGFNVHYRIAGKGEPLVMLLHGSFLSI
RSWRLVFGELAKHTTVVAFDRPAFGKSSKPRPSTTTGANYSPEAQSDLVI
ALMRHVGFQKAMLVGNSTGGTLALLAALRHPNNVAAIALAGAMVYSGYAT
SGIPAPLKPLFKAASPLFARLMGKMITKLYDRTMYGFWHNKERLSPDVVA
AFRNDFMQGEWARGFWELFLETHHLHFEERLKGIVVPSLVITGDNDLTVK
TAESERLANELPGAALAVIANCGHLPQEEQPEAFVQALLPFIEKVRLHL
>Cag_1478 glycosyl transferase
MREYLISVIIAVYNPNAIFLQKAIQSVLNQSFPVLELILVNDGGNEEFRN
LLPTDSRIKVFRKVNEGVALARNYAIQQSQGEYIAFLDQDDYWFPHKLEK
QISMIPSDQPQCFVVSPIQIIDSVGSVVDKNNLAATSLYKNNLSLVNPFL
GLCYGNYIYSSTPLIHKKVFEIVGLFDVAAQPHDDWDMYLRILYAGVPFF
RYTDSALSVWRIHDSNESHKIKAMLLSKCYVEKKLLELNLVAPVREVVTI
NLLFDNVELAHLFYKENNTPEFRFLMKRYLPSLIRVFFVRFNKTFELDKI
LFRRIRKIILKSFRRYIVSFLRCNG
>Cag_0540 Phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase
MLIIPAIDIKDGKCVRLTRGDFSQKKIYLDNPSDMAIIWRKQNAKMLHIV
DLDAALTGEMVNIAAISDIVANVDIPVQVGGGIRSVDAVKSYLDIGVARV
IIGSAAVTNPKLVEELMHIYRPSQIVVGIDAENGVPKIKGWLESATMQDY
ELALRMKEMGIERIIYTDITRDGMMQGIGYESTKRFAEKAGMKVTASGGV
TNSSDLHKLEGLRRYGVDSVIVGKALYECNFPCQELWYSFEDEISIDHNF
STARQKS
>Cag_1104 conserved hypothetical protein
MKNNINYTDEPVGELVVVKDFLPSPDQLILKEDNVKITIALKKSSIDFFK
NEAKKHHTSYQKMIRELVDWYAVNNAKNA
>Cag_1189 RecR protein
MRFPSVALDTLIDEFAKLPGIGRKTAQRLAMYILHEPKIEAEQLAKALLD
VKEKVVRCTICQNITDVGTDPCAICASKARDRTVICVVESPVDMLAFEKT
GHYKGLYHVLHGVISPLDGVGPDDIKVRELLARIPVGEASGVREVVLALN
PTIEGETTSLYLARLLKPLGIAVTKIARGIPVGAELEYVDEATLSRAMEG
RTVV
>Cag_1031 hydrogenase, iron-sulfur binding protein, putative
MAMSIQEEYKNKAKELLSNGTVKMVIGYKAGTTANRRRPHFMRTPEECDT
MLLDSNCIANLSGYLLTEGLLSDEKKVAIFLKPDGIRSINILAAEAQLQS
PQVVIFGFDIQGNDVVEVKGHSVADFGSLMAATKEKAPTSLQEPTIEALQ
SKNAVERFSFWKKEFERCIKCYACRQACPMCYCRRCVVDNNQPQWVNTSS
HTLGNFEWNLVRAFHLAGRCVGCGNCDRACPVNIPLRLLNTRMAQEVLNA
FDHVAGMSDTQPPVLASFKKDDSETFIL
>Cag_0314 Excinuclease ABC, B subunit
MENRTDNEYQLVSPYQPAGDQPKAIEALVQGVRDGRHWQTLLGVTGSGKT
FTISNVIAQLNRPVLVMSHNKTLAAQLYGELKQFFPHNAVEYFISYYDFY
QPEAYLPSLDKYIAKDLRINDEIERLRLRATSALLSGRKDVIVVSSVSCI
YGLGSPEEWKAQIIKLRAGMEKDRDEFLRELISLHYLRDDVQPTSGRFRV
RGDTIDLVPAHEELALRIEFFGSEIESLQTFDIQTGEILGDDEYAFIYPA
RQFVADEEKLQVAMLAIENELAGRLNLLRSENRFVEARRLEERTRYDLEM
MKELGYCSGIENYSRHISGRPAGERPICLLDYFPEDYMVVVDESHVTLPQ
IRGMYGGDRSRKTVLVEHGFRLPSALDNRPLRFEEYEEMVPQVICISATP
GEHELMRSGGEVVELLVRPTGLLDPPVEVRPVKGQIDNLLAEIRHHISIG
HKALVMTLTKRMSEDLHDFFRKAGIRCRYLHSEIKSLERMQILRELRAGD
IDVLVGVNLLREGLDLPEVSLVAILDADKEGFLRNTRSLMQIAGRAARNL
DGFVVLYADVITRSIQEVLDETARRRAIQQRYNEEHGITPRSIVKSVDQI
LDTTGVADAEERYRRRRFGLEPKPERVLSGYADNLTPEKGYAIVEGLRLE
MQEAAEHMEYEKAAYLRDEITKMEQVLKKDG
>Cag_0174 indole-3-glycerol phosphate synthase
MTYLDRILEHKQIEVAALKKEQPRQRYEELQAELEAPRNFSASLKRPANK
GVRLIAEIKKASPSRGLIVQDFDPLAMAQRYQELGAAAFSVLTDQQFFQG
SNDYLRQVKGAFKLPVLRKDFIVDALQIFEARLLGADAILLIVAALESSQ
LRDYLQLSAELGLSALVEVHDGAELDEALQQGATILGVNNRNLKDFSVDI
NTSIKLRPSIPSDMIAVAESGLKRAADIDAVNAAGFDAVLIGEGLHISTE
LRSLIWT
>Cag_0056 UDP-N-acetylmuramate--alanine ligase
MELGKTQRVHIVGIGGAGMSAIAELLLKSGFSVSGSDLSTGDVTDRLTAH
GAVIYKGHQEGQVADSDVVVYSSAIRSEENVELRAALKAGIPVIKRDEML
GELMRYKSGICISGTHGKTTTTAMIATMLLEAGESPTVMIGGISDYLKGS
TVVGEGKYMVIEADEYDRAFLKLTPTIAILNSLESEHMDTYGTLEELKQA
FITFANKVPFYGRVICCVDWAEIRKIIPSLNRRYITFGIEEPADVMATDI
VLLEGSTTFTIRAFGIEYPNVRIHVPGKHNVLNALAAFSTGLELGISPER
LIAGLGCYSGMRRRFQVKYSGNNGLMVVDDYAHHPSEVKATVKAAKDGWQ
HSKVVAVFQPHLFSRTRDFADEYGWALSRADEVYIADIYPAREKAADHPG
VTGELVANAVRKAGGKQVHFVNGMEELYTALQTHVAPQTLLLCMGAGDIT
HLATKVAVFCKEHNADH
>Cag_0817 Phosphomethylpyrimidine kinase
MKRYTTVLTIAGSDGSGGAGIQADLKTFAALRCYGVSVITAITAQNTQGV
SGVFPVENHCLQAQFEALQKDITFDAIKIGMLGSASTITTLAALLRALPT
PRPPIILDTVLASSSGMALLPPSAIACMVSKLFPLATLITPNIPECALLA
GKSAVPQTAQEIEAVAKELQAQGAASVLIKGGHGKSNECHDCLLWQERCQ
WFSAPMIATHNTHGTGCTLSSAIAAFMAKGYPLDNAVLQAKNYLNSALQA
GAAYQLGMSANGHGPLFHLWELEDE
>Cag_1015 conserved hypothetical protein
MINNLLLPNQRDDYDSPWKEAIELYFPEFMAFFYPNAFLAIDWSKPYHFL
DQELRSILPEAENGKRIVDKLVQVHLLGGKERCLYIQIEVQGNREADFPR
RIFICNYRIFDKYGKPVASFVILTDSDSSWRPTTYSYEFAGSKMTLEFDM
VKLLDFEPRIKELLASDNAFALVTAAHLLTQKTREKSFERLDAKSQLIRL
LYNKQWTKERVKELFRVIDWFMELPKELEQQLQTEIYNIEEEQKMKYISS
IERYAMEKGWSEGMERGILEGMEKGLMEGMERGMAKGKEIGAEQTKLDIA
RRLVASGISKAEAALLAGVSLETL
>Cag_1792 hypothetical protein
MISAWINLSPLKNIFKKELNEGLPAHYGLRTLGQGIATYNKQKRVMAAFD
FGTTVPPEQFSHFRHHPVQLDIQRALREGWELFRNNPSEFLGFTLVCFAA
WLVGLLFDGGSSIIFSSIAAPLYSGYTMAVFRIAKGESLEFSDFFKGFNY
LLPLFLASLACTLFVSFGLMLFILPGIYLAVSYMFTTFFIIDYKMNFWQA
METSRTIITRSWFSFFAFAIVLFIINVAGAFLFGIGMIVSAPVTACAAAV
AYRNCMGVQISSIDEE
>Cag_0892 internalin-related protein
MALRLFFSIFKELKSSAVPQSPLVPLSSDGIFWYQKALLPIAIKKMEVRQ
STNQLVISPTLNALLIVEQHTYNQCPVCGFPLSMNSAICPRCGNDILEDI
SSLDQQSLERYHKHLENKKAEWYARCLTDQITGGDNPPLSAEHQECPAGR
QKPHALFNSDDELAFFTSLNRADILRDTNLRKKWWQSITADWQDVVRFTL
KINHDPSDSDLLAFFDSTNLRCDDRRIHSLLPIRVLEKLQQLRCDESPIE
SLEPLAHLTLLQRLYAFDCDFTSLEPLRNLTHLKLLWISSTEITSLEPIS
NLINLEELYCSETDITDLEPLRKLINLEKLSCYKTSITSLEPLAELENLI
ELGINHSDINDLTPLAGLINLEYLRCSKTAISSLEPLRNMVELRELSIAH
TNVDSLEGLQGLENLEELDITNTLVSSIEPLMGLEYIEKLELSVGTIPDE
ELERFVELHPDCNVVAK
>Cag_1956 Sulfite reductase, dissimilatory-type alpha subunit
MEGAQKAASDEVKCCCSNGCGGGNGTFLNPTPMLDKLEDGPWPSFVSGFK
QLAERTENTMLRGVMDQLEYSYNTKMGYWKGGLVTVDGYGAGVITRYSMI
KDKFPEAVEFHTMRIQPAPGLHYSTEMLRELCDIWEKYGSGIITLHGQTG
DIMLQGIEQDKVQACFDELNQRGWDLGGAGAGMRTAVSCIGQGRCENACY
DNLKAHLKILKHFVGPLHRPEWNYKLKFKFSACPNDCTNAIMRSDFAVIG
TWKDAIQIDKEEAKAWIAERGMDVFVNQVLNLCPTHAISLKDGEMAIESR
DCVRCMHCLNAMPKALSPGKERGISLLMGGKNTLKVGVNMGSLIVPFMKF
ETDEDMDEFIELIEEIIDWWDDHGLDHERIGETIERVGLKQFIEGVGLDL
DINMVSRPRDNPYFKAEY
>Cag_1468 putative PTS IIA-like nitrogen-regulatory protein PtsN
MKIESLLSESYVALNLELVQKSQVIATMLSIVERHPAVSDNVKLRADVLK
REQEMSTGIGKSLALPHAKSAAVSQPVIAFATLKKALDFDSIDGEPVQIV
FLLATPEAMLAEHLKLLGRITRLAGREEVRHKLLQLRTPAEVIALFKEEE
KDFPEI
>Cag_0662 hypothetical protein
MIFFVSFGGIWASGGGCSMWLAERDELVTMFQRDVVEVKPLRNRRLVLTF
RDGLVATLCLDDIVHHYNGVFLPLLDVAYFNQVAINRDLGTIVWPNGADV
CPDVLYAVASGKPIVCE
>Cag_1227 conserved hypothetical protein
MIRLHVTAEGQKYMEHDQPIKNLLQMVGEQNPELINDGWETAPSKRIINE
IPEYDKVSSGVLVTEKIGLSILRKKCRHFHEWLIRLEQLGETM
>Cag_1270 thiamine biosynthesis protein ThiF
MPLNPQQRQRYARHLALPEIGEDGQERLLASKVLIVGVGGLGSPAAFYLA
AAGVGTLGLMDGDVVDESNLQRQILHTSASVGELKVASAAERLQALDPAL
HLITYPFTLTTDNAEAIIAEYDFVIDATDNFRSKFLIADGCHRTVTPYSH
GGIKAFYGHTITVHPRKTTCYRCLFHNEGAVDNNEPQGPLGALPGIIGSL
QATEALKHLLNIGTPLTNTLLTCDILTMNIRKIPVSRNPHCPLCGSSL
>Cag_0675 putative glycosyl transferase
MNKILSVITVVKDDYSGLIETAKSVSLIVPSELVEYIIWINESSFEIIHN
IDLVKKLASKVIVGTDYGIFDAMNKALNYAAGDYVLFLNAKDLVIQGFNI
EKMLRPCLLKVQYIDYFGRLRKVVRNHKIDFGIPYCHQGMILPRKGYLFD
ENLKYGADYLALINMNLNWPLPISDSGLIHYDTTGVSTVNRWESDKWTAS
IIKRKFGIFYSLRYLFYCLIKLSIKRLYDVKIILTKKLNIFYVHR
>Cag_0122 LipD protein, putative
MKLSLADALSRAREQNYTVKAARSRIAQAEGQITQSRQSLLPKVTLSETF
MVTNDPGAALVYKLQHNTIEQSDFMPSKLNNADVIDDFHTSVQVMQPIYN
ADAKKGRSMALVAKKGQEFMAERTAETIALHVSKAYYGLLLARKNSEAID
GSLAIMQGYNAETARGFNVGMLSRSDKLSTEVRLAELQEQKMMMEDEIKN
ATDALRVLLNLDPTVTIVPTTDLNVDGSMPSVKDGGALEQRSDLQAMEVF
RQVASLQAEMADASRLPRVNAFAQGNLHGATPLEGGSSWALGVNVQWNIL
DAKVSEGQMQEAKAKKLEAMYSYEAAKSSGTAEINRALRSLKTAKARLAI
ASKSLEGAKVSFDHIGKQYKTGMAMTMELLMREQAFTYAKMRLNQAAFDY
NVAKSELEYYKGN
>Cag_0985 putative plasmid maintenance system antidote protein, XRE family
MMILMYLKNKGGARNSCLIFLTFHYAEAFGTSPEFWLNLQATYDLSLHKP
TKHIQPLVAVSAQHEIQERSCIVGLFLLCVLG
>Cag_0775 Ornithine carbamoyltransferase
MTQQSQESKKRDFLGFASLDANKVIELFDYSLYIKQARKNHNLDFKPLLN
KTVAMIFSKPSLRTRVSFELGIHELGGYAITLDGQSIGLGSREAVGDIAQ
LLSRYNDAIVARLHEHAVIEGLAEHATIPVVNALTDLSHPCQILADAFTL
YEKSLWHDGIKVVFVGDGNNVANSWIELAGLLPFHFVLACPEGYMPNHQL
LEAARSKGISRIEVIHDPREAVTNADVLYTDVWTSMGQEQERAERLRVFK
PFQINAELMALAKPSAVVMHCMPAHRGEEISAEVMDGKQSIILDEAENRL
HVQKALLVKLLNHEEYRKFHLTHRLQNAAKKVKS
>Cag_1780 phosphofructokinase
MKKIGILTSGGDCGGLNAVLKGAALMAQSKGLELYIIPNGYAGLYNLVDQ
ERIVRLDRIRLDQFDASFAGSEAGHSRVKIKAISNPEKYNRIKAGMKKFE
LDALIISGGDDSGSVMVDLNHQGIQCIHAPKTMDLDLQTYSVGGDSTINR
IAQFVHDLKTTGKTHNRVLVTEVFGRYAGHTAFRSGIAAEADCILIPEIP
ADWDVVYEHLKERFTRRIKDSDVHSGTYTIVVAEGLKNADGTDIVDESAG
VDAFGHKKLAGAGKFTCQQIQKRMKADPAMPLFMKETGMFVEGVYEIPEV
REIHPGHLVRSGNSSAYDINFGFEAGAAAVLLLLDGKTGVTVSRVKGRKV
EYIESSKAIVQRHVDLEQVALHESLGICFGRAAHAYEPILREVEGVYDRI
Y
>Cag_0993 hypothetical protein
MGISSRDVSKLNKTIQHTYNGGSTMTREEIIKRLDEIFQQKDIRHGVEVR
KLHVELFGVEPVFTGYYYECEKGALEWMIESMLDGKPFVEPELEDGEFT
>Cag_1881 hypothetical protein
MNIINSLIEKLKGAEIKMAHEKGAFDLFALFMLDVIPEQWDVVAAASWIT
DENYDASLRYIINCIQPLLSSKELFSISGVVLIDQYNPGLDAVLEAIHVE
HGLVEVRDTTFFGLDIKHGFVITSCTRHGCTAKSA
>Cag_0994 hypothetical protein
MDDLLIDLVINEHIAVFGVEPVFTGRSAFLSQEEIIANIDAAIRKGEPYV
EDDVPDDVDI
>Cag_0785 conserved hypothetical protein
MHQNSDENLQADIVSEQSLQVTKSLQAFSGPLPHPDLLREYENILPGCAE
RILVMAELEQTKRHEITERESKGLLDHLARGQHYAFAAVIITFASAIYLA
MNGHEFIASTIVTLDIIGLVVAFITGKALSLKQTKHNES
>Cag_0626 ATPase
MTESAIQPDLFGFSTPSSSVTSTTEKSSRFVPLAERVRPRMLDEVAGQQH
LVGANAPLRRFLESGQMPSVIFWGAPGCGKTTLAEICASTLQCHFEQLSA
VDAGVKEVRKALDIATRVRQAGQRCLLFIDEIHRFNKSQQDTLLHALEQG
LILLIGATTENPSFEVNGALLSRMQVYTLKPLTAEELEQVIRRALATDAL
FRERSIELADLEVLWHYCAGDARKALNAIEAAFALFPTNQSSVQLTREHF
EAALQQKAPLYDKSGENHYDVISAFIKSMRGSDPDAALFWLARMIEGGED
AKFIARRMVIFASEDVGNADPYALTLALSVFQAVSVIGLPEARINLAQGV
TYLASAPKSNASYQAINEAMAEVKSTTATTVPLHLRNAPTKFMKNEGYGA
GYCYPHNYPSHFVEQHYFPEGMEPKAYYRPTAEGREKMAQERLHQLWKER
YRK
>Cag_0831 regulatory proteins, AsnC family
MLNTSTIHITPEMLSLIAHIDEFKGAWRALGVLAPERLSALRRVATIESI
GSSTRIEGSKLSDQEVERLLSNLTINTFETRDEQEVAGYAELMELLFTSW
QYIPFNENHIKQFHQLLLSHSSKDARHRGTYKTTSNSVAAFDENGKQLGI
VFQTASPFDTPFLMQELIAWVNQEREAKQLHPLLIIAIFVVVFLEIHPFQ
DGNGRLSRALTTLLLLQAGYAYVPYSSLESVIESNKEAYYVALRQTQGTI
RSEVPNWQAWLLFFLRSLVEQVHRLQNKIEREHVVLAALPELALQIVEFV
HQHGRITIGEAVKLTDANRNTLKVHFRKLVEQGYLKQQGSGRGVWYERG
>Cag_0090 CTP synthase
MARPKNVKHIFVTGGVISSLGKGILSASLGMLLKSRGLRVAIQKYDPYIN
VDPGTMSPYQHGEVYVTDDGAETDLDLGHYERFLDESTSQTSNLTMGRVY
KSVIDKERNGDYLGATVQVVPHVIDEIKERMAEQAKNSNLDILITEIGGT
IGDIESLPFLEAMRELKLDMGDGNLINIHLTYVPYIKAASELKTKPTQHS
VKMLLGVGIQPDILVCRSEKQLSRDIKNKVGHFCNLNDLDVIGLSDCATI
YEVPLMLLQEELDSRVLKKLGIKGYQEPALTYWRDFCNKVKFPQEGEITI
GICGKYTEYPDAYKSILEAFVHAGASNNVRVNVKLLRAESAEEPTFDFAK
ELAGIHAILVAPGFGDRGIEGKIRYIQYAREQNIPFFGICLGMQCATIEF
ARNVCDLQDANSTEFNKRARFAIIDLMEHQKKVKEKGGTMRLGSYPCIIT
DGSKAHMAYQKFLVNERHRHRYEFNNSFRTLFEERGMLFSGTSPNGELIE
IVEIKNHRWFVGVQFHPELKSRVQKVHPLFHSFVAAAKDYARGVQQMDMA
IEMPSFMPILNEEGESKSE
>Cag_2013 thiol:disulfide interchange protein, thioredoxin family protein
MKLLRSLTATIVCSLLFLSVPAVMQAAPKAPAFSAVSVTGKATDTKQLAG
KTYIVNFFASWCPPCRAEIPDMVALQNKYSKKGFTFVGIAVNEDAENIRS
FITKNNINYPVVMATPELVDSYNRYVQGGLRGIPTSFVVNSSGEITQVIS
GGRSYAEFEKIILGAMKKSTVAAPKK
>Cag_0059 Cell division protein FtsZ
MAFELDPGLFDSDQGKGVTIKIVGVGGCGGNAVNNMIDRKISGVEYIVFN
TDRQALLNSKAPLRVQIGKKATSGLGAGADPAKGRQAADDDREIIAAQLR
GADMVFIAAGMGKGTGTGATPVVASIARNMGILTIGVVTRPFSFEGQVKA
RIADGGIAELRKYIDTLIVVENEKILSITEEGVSATEAFNKANDVLYRAA
KGIADIITRHGHVNVDFADVRSIMAGAGDAVMGSAAAAGERRAMKAAADA
INSPLLEGVSIKGAKGVLVNITGEVTMRDMSDAMNFIEEQVGSDAKIING
YVDEPQLSGEIRVTVIVTGFKRKESEESKPAATNRHPIVQTAGVKAGQIP
ISRQPVSFTPEHQEEDLRIPAYIRRQLSLQEPDEMSARKVPHSNNASVPV
NRQEHEDKIQKGMTDTPAYLRRRHNDQ
>Cag_1345 Quinate/Shikimate 5-dehydrogenase
MNSSTRIFALLGRAVDYSYSPLIHNTAFQALGLPYHYTIFNIAEAALVGD
ALRGARALGLGGFSVTIPYKQTVVPFLDELSEEATTIQAVNCIVNKNGKL
IGYNTDILGFASPLFAYREALHGATIALFGSGGAALAAIEAFQRYFTPKQ
VLLFARDSQKAKSQLRSSLALERYTNLAIVPLSDYERVRECRLVINATPL
GTKGRADGSAIIPLESNLLHSEHIVYDMVYNPTITPLLQAAQAVGASTIF
GIEMLIGQAEQAFTLWTGEKMPTELVRQTVMAKLQEL
>Cag_1872 exodeoxyribonuclease, small subunit
MTSSQPNEPSLEELLQRLDEITHTLENPDTGIERSIKLYEQGLLIAERCR
KRLEYARSKIEKLKPNSSSSLPQFPLTDDLFN
>Cag_0858 conserved hypothetical protein
MLKSERFFTKFNCRRLIMTTVEKLYKTVQELPEPVISEILDFAEFLSKKR
PVINKGSNISLLDLVGGLENSIAFNGNPLAIQKQLRNEWD
>Cag_0189 conserved hypothetical protein
MKQKALLFFYGAALALLLVLFVVVSQQFLSLFAFLSALHPYVGMGFLALS
GIILLFTLVTALLFFARPSEPSLPDNDVSPAMAAYVRYRVARVPTHPKHP
EGSNAPKDQRWLRTNLKLLDGDAMEITREIATKNFFVGAFAQNTSYGTTT
SLLNNIRMLWRIYTLHYRQHHFREFVALARDVYETLPLSDFRKEELPEHI
KPIIQCSFSNTLASLLPGGNLLTPFFMNLFLSGSTNSYITCLTGIAATRY
VQASTQEERHEVMQQSMFEASFMLKEVVRECNPILSVTISKAVKKAGMDS
LDTMQQPSASSGVAQDIVAHLANSLRTILRDDG
>Cag_1892 heat shock protein, HSP20 family
MLLKLSKDPMKLFDDIWSGAQMPSVPAFKVDISEDEAAFHIDAELSGLTK
ENINLHIEDDVLTIQAERKLETEENKKNYHRVERATGTFSRSFNLGETID
QENIQADFENGILHITLPKATAVSKKKEISIN
>Cag_1856 Ribosomal protein S12
MPTIQQLIRLGRSVKVSKTASPALEKCPQKRGVCTRVYTTTPKKPNSALR
KVARVRLSNKVEVTAYIPGEGHNLQEHSIVLIRGGRVKDLPGVRYHIVRG
SLDTSGVADRKQSRSKYGAKQPKAGAPAAPVKGKGKK
>Cag_1578 conserved hypothetical protein
MINQKNIGSSFDEFLEEEALLDEATAVAVKRVIAWQIAQEMKAKHLTKSL
MASKMQTSRAALNRLLDATDTSLTLTTLSSAASALGKKFRIELVS
>Cag_1368 hypothetical protein
MVFQRSMKVFTTFALFAGMMMASSNLHAVTVDDSIHEKACSVVAGERTVT
LSIDPKPVKHMKELTFTVSVTPCDKLPDMLLLDLSMPGMQMGKNQVTLKK
ISSCKWQGNGIIVRCMSGRKLWQATVLSNELNNPAFAFNVRD
>Cag_0315 hypothetical protein
MQKLPLPRTVWMLLHPTIFIINNTMKQMLSVAGMRCSNCEVLVKEALEEL
AGVSSAQASQREGSVTVEYDEQQVLLDTLKIRYHNKDLRYRACRNRFGLL
IATRCCTACFEEK
>Cag_1352 hypothetical protein
MVSIAGGAVTVTQNRGFVNGRVFTGYNFNKNIAVELGYLQTGDTTANIAG
VSGTLVAYTGELKASVSGIDYSILIRPISNNEWNGLFIKAGGHYLEMDQK
GSLTFAGIGTNTIKESVNGSGFLIGIGYDTPITDNIDLRSAYTYYGKIAG
DSDSEANFFTIGLLAKF
>Cag_1870 acetyltransferase, GNAT family
MAADDVVVRYARLDDAPVIAAITAKYARQDIMLERTADSVVEHIRNFFVA
ECNNEVIGCCAVAFYTQKLAEIRSLAVCNSFKLQGIGRILVEQAETVLHE
EGVTEVFVLTLSPIFFSRLGYTEIRKEYFPEKIWKDCVHCPKLKACDEMA
MVKHLAEGVRVQPIE
>Cag_1906 Dihydroxy-acid dehydratase
MRSDTIKSGFEKAPHRSLLKATGAIRSSSDYRKPFIGICNSYNELIPGHT
HLQELGRIAKEAVREAGGVPFEFNTIGVCDGIAMGHIGMRYSLASRELIA
DSVETVAEAHRLDGLVCIPNCDKITPGMMMAALRINIPVIFVSGGPMKAG
HTPEGKTVDLISVFEAVGQCSNGSITEGELQNIEEHACPGCGSCSGMFTA
NSMNCLSEALGFALPGNGTIVAEDPRRLELVKAASRRIVDLVENNVRPRD
ILTRQALLNAFALDFAMGGSTNTILHTLAIANEAGLSFDFSELNALSAKT
PYICQVSPATMAVHIEDVDRAGGISAILKELSSIDGLLDLSAITVTGKTL
GENIANAEVLDRSVIRSISDPYSATGGLAVLYGNLAPQGAVVKTGAVSPQ
MMQHSGPAKVYNAQDDAIKGIMEGDVKAGDVVVIRYEGPKGGPGMPEMLS
PTSAIMGRGLGDSVALITDGRFSGGSRGACIGHVSPEAAERGPIAALQNG
DIITIDIPARTMSVALSESTIKERLAQLPPFEPKIKRGYLARYAQLVTSA
NTGAILGHL
>Cag_0111 magnesium chelatase, subunit I, putative
MQLAAELEALGAEIVDASRFLVRVEEELSHRIVGQREVVRRVFIALLVNG
HILLEGVPGLAKTLIVSSFAEAMALKFQRIQFTPDMLPADLVGTLIYNPK
DLTFFPRKGPLFTNIVLADEINRSPAKVQSALLEAMQEHQVTIGDESYQL
PAPFLVLATQNPIEHEGTYLLPEAQMDRFMMKVEVDYPSYDEELEIMLRS
ATNAPRPSIQAVAQPEDIERARTLIDRIYVDPRVQRYIVDLVVATRSPAQ
YGMENLNGMIECGASPRASIYMLLAAKAHAFLQQRPYITPEDVKAVVYDV
LRHRIRPGYEAEAENMRSTDIIRQILQHVQVP
>Cag_1512 Filamentous haemagglutinin-like
MKTHPLFFPLHGRDVFVVALCVTQLLLVVPQAQALPTGGAVVAGSANVTL
PSATTMQIEQASQKAIINWQSFGAERGERVQIVQPESSSVLLNRVIGNNP
TSFFGQLQANGQVFLVNPNGIYFAPTSQLNTGGLVASTLSLNDRDFLAGN
YAFVAQGAMGALLNEGTLQGGFVALLGSNVENRGAIVTTRGTAALAAGEA
MTLNLDASGLVALTVDQAAYNAHIRNSGILEAEGGTVVLNAGAAEDVLAG
VVNNSGRVVATSVSERNGAIVIEGGSLVQTGEVVAPTINVAVNRMVDAGS
WRAEQGNITIHAATTIEQTAASHISASGKQGGSVRLEAGKQLYLSGAIES
NGTDGQSGSGGTIAVTSPTTTIAGATLSANGGTDGGMVLIGGGWQGSEPN
LPNAATTTVTASSSISANASTVGNGGTVVVWSEQATTFAGTIAANGGSES
GNGGAVEVSGHEQLAMSGTVSTSAHHGEAGFLLLDPRNITIEQPLLLSQF
QFQLISLLDPNATAGNQHGSGAILELLNGNLLVTSPLDDVGGSDAGALRL
YRPDGTLLSTLTGSATGDLSGGTITPLQGNSNAVFLASNWSNGTAAKAGA
VTWIDGTNGVSGTISEGNSFVGTHANDGMDAEVIALSNGNYVAHLPSWQH
DEVLNAGAVAFGNGTSGSAGTISEANSLVGTKANDSDSAKVVALTNGNYV
VASPLWDNGSTTNVGAVTWGNGQTGKVGAISGSNSLIGTKSGDNVGLQVT
SLANGNYVIGSPNWDNGSTANVGAVTWADGNLSIHGALSATNSLVGAKSG
DYVGSSVTALTNSNYVVVSQSWSSDTATDVGAVTLGHGDAGTTGVVTADN
SLVGSSTGDGEKLSATALANGNFVVVAPKWDGDATNMDVGAVVLGNGVTG
SVGQISATNALVGTTANDLESATVTPLTNGNYVVAATKWDNGVVADAGAV
IVGSGTTSITGTISAANSLVGSVSNDLLSATITPLTNGNYVVAASKWDNG
AVLDAGAVAWGNGQAGTVGSISESNSLVGNKKDDFSGLTITALHNGNYVV
SASLWDNGSITNVGAVTWGNGQTGTVGTINSTNSLIGAKSGDKVGAVTVA
LSDGNYATASGECDNGSLANAGAVTFGNGGGGTVGVVSSANSVMGSEKDG
KIGSGGLTPLRVGSVSGGVVVSSPLAQASNGNVTLFAPSTANEAGMLSAD
YTYAADGSSNVTVTPTQLATLLNNGTSVRLQASNTITLNTLLTANASSST
TLELHAGKSILLNNSIVTGNGNLTLIANDSAEHGVDNTLRESGAAVISMA
SGTAINAGTGQVVVELRDGGERANNASGDITLGSVTAGTISVANNGSSNS
SGVVLAGAALTANESNGSTIVLSGQHFTNSANATLNTEPEARWLIYSSSP
EATQKGGLTSSFRSYNVLPATYAAAAVTEQGHGFLYASAPSQLGVNITLN
NGSASSVYGNEPNATLGYSLHGFADNEESANTIGLEGSMQVSGMPNTTSS
VGTYNVAYAGGLTSSKGFTFTAGTPLALTVEPQPITVNPDDQEKTYDDTD
PDLTWQVEAQGVGRGLLVGDVFSGELGREAGEDVGSYAITLNTLHNDNYA
ISFIPGTFTITQRPLTLSATSTQKVYGEADPTLAVTITSGSLASTLRQDA
LSDVVGTLNREVGNNVGSYDVVLGSGSRSSNYNITFAADNNAFTIAQRPL
TVTASPLTKTYGDADAALAWQAEAASSGRGLLANDTLHGELAREAGEDVG
NYAILQHTLGNNNYAISYQGSNLSITQRSLTLSATPTQKVYGEADPTLAV
TITSGSLASSSVQDALGDVTGLLSRQVGNNVGSYDLQLGSGSRASNYNIT
FTANNNAFTIMPRPVVVAANNFSKVYGDADPALTWQAESSDPALAAENLA
LLRSLELFNNTNNLSGITGAPSSNESLLSSTENNAQSSSDTASTSSTNNE
DEEMVGIRSPMGNIYISFPLAEYDFKVEWCQGSHILHGTKPAFIASERL
>Cag_1614 DNA/pantothenate metabolism flavoprotein
MPSPLYNKRILLGVSGGIAAYKIPHLVRLLKKAGAEVQVVMTEHAKEFVS
ELTLATLSGKTPYSAIVPQVESRTHDYTAHISLGEWADALLIAPTTANTL
AKLAAGICDTMLGACFITLRPSKPILLFPAMDGEMYRSPSVQRNLATLTA
DGCTVVPPESGALASGLCGAGRMPEPDTIVAHLEQALAQVAQPSTLCGKS
VVITAGPTREKIDGVRFISNYSSGKMGFALARAAAQRGARVTLISGSVNL
ATPLGVERFNVESAVEMYAAAEPFFASADIFIGAAAVADYRPVEVVQGKR
KKDGTALQLTLVENPDILKAFGLQKKGHQLACGFALEGENGVEYARKKLQ
EKALDLIAYNSFDGATCGFEVETNILTLIAQNGEAVALPLMSKDDAAAHY
LDALEALMRSCV
>Cag_1092 conserved hypothetical protein
MKIRRFELNAFGPFSGNVLDFNSPTPGLHIVYGANEAGKSSAMRALYAWF
FGYPLRTTDDFLHKKSNLLLSGTLENKQGEVLTFSRRKRKERDLFDGNDQ
PLEAQTLEHWLLGMDRELFQALYAMSHESLALGGQGILDEEGEIGKALFA
AGAGLASLRPMLAHLQSEADELFRPQGARQQLNEALARHRTLQQQLREAT
LSGSVWQEKKEALEQAEAKRNALQVRKQELETEKHRLERLQHALPELADR
KHVVEQRAALGKVPLLPADFAAQREALQKQLHLAQHNYEREQERITALQQ
SISSHHVNHALLEQAAVLDELHQRLGEYRKGKNDLPQRQSQRAAALQAAM
DILRPLWADLAGSEEAMDATDDSPTLMQRLQKALLKKKEVQRLATHFEAL
VSSGKSARQQVQESEQALEQLQRDLAALPMQGDSNQLEQTLRIAERNAAL
DRDIAELEQSLRHSEQECHAMLQRLTLWHGTLEQVPTLPLPLPETISRFD
EAFQRLQSDTLALRAQAEGVEKRLQEITTELEQLAAESHVPSVEELQHSR
AERNKGWELLKRQWLQQEDVTAESNAYSAAHPLHEAYEIMVNAADQLADQ
LYREVERVRRHTALTAEAKKLHHQHTHLHERLATLATEEAALHTAWQEQW
RTTEIEALPPREMVAWVATFEALRQHVRERDKLLLERNVRHKRRQEAHEQ
LHQAVEAVAPPFPIKNNELAPLLQYAQQQLSRMQAVEKRGENLLNRQRDI
THNLESSRQLLNRAEEEHREWRKEWIAVTSALELTGQAQPMEIVDSVEAM
QQALTKLKEAEEFRKRIEGIERDMRQFELDVATATATLAPDAQESDGAKR
VAMLHERLDEARREQTLLQREKDEVTQHKEALRRHAATLQEGELQLTAMC
QQAECATPTDLPIAEARSQQAQELHEKLMAVETRLVRIAGSASPEALEAL
ETEAATVERDALPSHIETITTEIHQEIEPEIDQLNELRGRLRNELKQMEK
EDGNAADLADAAQRELARIRRLTNRYIRLRLAETMVRNATERYRSSSERP
VLSLASTYFATLTLQSFVALDTESDDNGHIALMGVRTNGNRIGVEAMSSG
TRDQLYLALRLATLQWRMQQSEPMPIIADDILITFDDARSRSTLQALAKL
GESCQIILFTHHRTIADMASHRAFKGTVFLHTLGTTNESEHNNAETSAQP
PKPENLTLFG
>Cag_1600 ATPase
MRIEHLIVKNFKGFVSKEFTFHPNFNLIVGMNGTGKTSMLDALAVAIGSW
FLGFYVDSLKMRQIRHDDVLLKYIQHSWEHIYPCEVEAYGVVMDRHIKWS
RELNTINGRTTYGNALAIKELALQATRSMLNGDDIILPLISYYGTGRLWQ
EPREAFKVSDPRKVANKETQSRRTGYFNSIEPRLSVNQLTQWIAQQSWIA
YQEQGQVFPVFNTVQDAIIGCIEDAKKLYFDAKLGEVIVEFSSGTQPFSN
LSDGQRCMLAMVGDIAHKAAKLNPHLGSDVLKETNGVVLIDELDLHLHPR
WQRRVIEDLRNVFPKIQFICTTHSPFLIQSLRSGEELVMLDGQPFATLGN
LSLEEIAHGIQQVKNPEVSLRYESMKATAKSFLTMLDEASLAPKEKLKQL
ADKLRPYADNPAFQAFLEMERIAKLGE
>Cag_0864 Ribosomal protein L21
MQALITISDKQYLVKQGDKLFVPRQQAAIGDKLTIASLAQIDGANTTLQN
SNSVQAVVLEHVKDEKVIVFKKKRRKRYQSRNGHRQQMTHIEVLSL
>Cag_1499 membrane protein
MPSLRSISRYPFAIPLLSILAALLLSSLIIVAAGRDPLMIFQKMLRSVAG
SPYGMGQVLFRTTTLVCVGLAVALPFHLKLFNIGGEGQLLMGTFAAAMAG
LFLPQTVPPMVAIALCTLAAMAAGSLWALTAALLKVRFGVNEVIGTIMLN
FIAQGITGYLLTWHFAVPSTVHTAPIIDSATIPTFSVLTGWFASSPANPS
IIFVLLVALMLHLLLYHSRMGYEMRAAGLQPDAARYGGINATMHTLTAFA
LGGAIAALGATNMVLGYKHYFESGMSGGLGFTGIAVALLAGAHPLWLLLS
ALFFATLEYGGLTVNIWIPKDIFMIIQALTILIFISLSALGKRAN
>Cag_1733 glycine dehydrogenase subunit 1
MPFIATTESERTEMLQAIGVNSFDELIADIPYSVRLQRALELLPSLDEPQ
VRRLLERMAASNRCTAEYVSFLGGGAYDHFIPSAIKTIISRSEFYTAYTP
YQAEVSQGTLQAIYEYQSLMCRLYGMDVANASMYDGATALAEAVLMAMNV
TGRDQVVVAGKLHPHTTAVLKTYLEASGHQAIIQNALVDGRSDIAALKAL
VNQQVAAVVVQQPNFYGCLEEVEAIGAITHEQGAIFVVSADPLSLGVLAA
PGSYGADIAVGEGQPLGSSQSFGGPYLGIFTVKQQLVRKIPGRLVGMTKD
RDGEDGFILTLQTREQHIRREKATSNICSNQALNALQAAVYLSLLGKQGL
QQVAAQSAQKAHYLANAIAALPGFSLKFTAPFFREFVVETPMPAAHLIEQ
MVEQRMFAGYDLATHGESGLLIAVTEQRTKEELDAFVAALSALKA
>Cag_1509 conserved hypothetical protein
MANNLLLPNQRDDYDSPWKEAIELYFPEFMAWYFPNAYAAIDWSKPYHFL
DQELRSILPEAENGKRIVDKLVQVHLLDGKERCLYIQIEVQGNRETDFPR
RIFICNYRIFDKYGKPVASFVILTDSDSSWRPTAYSYEFAGSKMTLEFDM
VKLLDFEPRMKELLASDNAFALVTAAHLLTQKTREKSLERLDAKSQLIRL
LYNKQWTKERVRELFRVIDWFLELPKELEQQLRTEIYNIEEEQKMKYISS
IERYAMEKGILEGMERGMVAGKEVGVLEGMERGLEEGLLKGRLEVAQRLV
ASGMSKAEAASFAGVSVEML
>Cag_1046 conserved hypothetical protein
MKPLPVGIQTFSKIIEDDYLYIDKTDIAKSIIEKYQYVFLSRPRRFGKSL
FLDTLKNIFLGNKELFQNLHIYNQWNWNITYPVIKISFSGGIRNNESLRK
NLFYILKDNQKRLNITCEENDEPNLCFAELIQQAFEKYQQKVVILIDEYD
KPILDNIENIPEALVIRDGMRDFYTKIKENDEYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPNFGNICGYTQHDVDTVFAPYLEGVAMEKVKRWYNGY
NFLGDNVYNPFDILLFIKNQKTFKNYWFETGTPTFLMKLFAKERYFLPNL
EHLEVGDEILDSFDIEKIQLATLLFQTGYLTIEKRFETFERLRYQLKIPN
QEVRLALSDHFINVYTEQPNELKYAQQNRFYTYLTQVDMLGFQQTLQALF
AGIPWNNFINNSLPEFEGYYASVLYAFFISLNATVIPEDTTNQGQVDLTI
MVENKVYIIEIKRDTVKSYEISQQNIALQQIQRKGYATKYKGQGKTIIQI
GMIFNIYQRNLVQMDWEVVG
>Cag_1346 Peptidase A8, signal peptidase II
MKLFFSLALFVVAADQFSKYVALRFLRDANQSISIIPNFFSFTYAENRGI
AFGLEPAPPALLLLFTMMISAAVLWYVLRSNNRRLIFLLPFSLILGGGVG
NMIDRMVRGYVVDFIYFNLYNGYVGNIYLSLWPIFNIADSAITIGGTMLL
LFHRTLFPDDPIA
>Cag_0163 CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase
MEGRIINLPNALSFLRILLIPWFLYAYHHGHLTTAIVVMIVAILSDWFDG
RVARWTGDVSDMGKILDPLADKLCLASVAIYFLWVGELPLWFVLFALLRD
VVIFLGAGYVKLRHAVVTTSMWPGKWAVGFVSMMFIVMVWPHPIFKAYPL
KEIFLYLSTLTLLYSFVLYTIRFYRIHKGLDFKA
>Cag_0504 Phosphoribosylglycinamide formyltransferase
MQSTKTRIAVFCSGNGSNFKALYHAIAHKQLPASIELCISNRSQCGAMEF
AQEHGIASAHISEKQFASYDDFVTAMLHELQRHQIDVVLLAGYMRKIPER
VVAAFSGRMLNIHPALLPKFGGEGMYGIHVHSAVIAAGEKESGATIHFVS
EEYDKGGILLQRSVPVLPTDTPETLAERVLACEHTLYPDALELLLNELRK
>Cag_0384 conserved hypothetical protein
MKKVLVAGATGYLGRYAVEAFKKRGYWVRALVRNLDKAKQPGPYFAPEIA
SLADEIVVGDATLPATIATVCDGIDVVFSSLGMIKPDFVHTIFEVDYQAN
MNLLDLALKAKVKKFIYVSVYDAHRMMNIPNVQAHEKFVRELKAAKIDST
IIRPNGFFSEIGQFVARAHKGFMLLVGDGYQRSNPIHGADLAEVCVDAVD
RSDKEIGVGGPEIFTYQEMMDLAIEIAQNQPFIFPLPLWAADTLVAATGL
VNRDVHDVALFATTLSRIDVVSPEYGTHRLRDFFMQCKAAL
>Cag_0140 ATP synthase F1, alpha subunit
MSTTVRPDEVSSILRKQLANFESEADVYDVGTVLQVGDGIARVYGLTKVA
AGELLEFPNNVMGMALNLEEDNVGAVLFGESTMVKEGDTVKRSGILASIP
VGEAMLGRVINPLGEPIDGKGPIDAKLRLPLERRAPGVIYRKSVHEPLQT
GLKAIDAMIPVGRGQRELIIGDRQTGKTAVALDTIINQKGKGVFCIYVAI
GLKGSTIAQVVSTLEKYDALSYTTVIAATASDPAPLQFIAPFAGATLGEY
FRDTGRHALVIYDDLSKQAVSYRQVSLLLRRPPGREAYPGDVFYLHSRLL
ERAAKITDDVEVAKKMNDLPDALKPLVKGGGSLTALPIIETQAGDVSAYI
PTNVISITDGQIFLESNLFNSGQRPAINVGISVSRVGGAAQIKAMKKIAG
TLRLDLAQFRELEAFSKFGSDLDKTTKAQLDRGARLVEILKQGQYVPMPV
EKQVAIIFVGTQGLLDSVDLKFIRKCEEEFLAMLEMKHADILSGIAEKGT
LEADVASKLKDIATKFIATFKEKNKA
>Cag_0191 chlorosome envelope protein B
MSNGSNDLSGAISNLIETMGKLAQQQVELINTGLKTAAQMAEPLSKTATD
LLGNMVSTMNQMLQTISSAIAPKQY
>Cag_1595 DeoxyUTP pyrophosphatase subfamily 1
MLNIKIVRLHPLATLPAYATAHAAGMDVAACLEAPVSVAPFSTALIPTGF
ALELPIGYEAQLRPRSGLALRSMISLPNTPATIDADYRGEVKVILINYGK
EPFMVQHGDRIAQMVIARVEQVQFEEVAELSATVRGEGGFGHTGIASVQ
>Cag_0465 Glutamyl-tRNA(Gln) amidotransferase A subunit
MQLSSYTELREQLLAGSLRCEEVVRSYLERIDAAREDNIFITVFHERALE
RARMLDRKLAEGGTVGKLFGLPMAIKDNIAMKGERLTCASKILENYESVF
DATVILRLEAEDAIFLGKTNMDEFAMGSSNENSAFGNVPNPFDKSRVPGG
SSGGSAAAVAANLALVALGSDTGGSVRQPAGFCDIVGLKPTYGRISRFGL
VAFASSFDQIGVLARTCGDAALVLEVMAGKDERDATSSAHQVDSYHSMME
RVSPEGLKIGVPQEFFTDALNADVARLVLATLDDLRNRGAELVDITLPDS
AYAIAAYYILVTAEASSNLARFDGARYGFRTSEAADLAAMYVNSRTEGFG
REVKRRIMLGTYVLSAGYYDTYYKKAQQVRRYFQDQYRAALQHVDVIAGP
TSPFPPFGLGDKTGDPLEMYLADVFTVPASIVGMPAVSVPLGFDSQKLPV
GMHLVGNFFEEGKLLGIARMMQRA
>Cag_1185 Transglutaminase-like
MILKVEHATLFEYDNPIYETATEVRLHPSNTSAAPQRCASFKLQVDPSAL
LFEYTDFYGNTVHHFNLLQSHKRVNIVATSIVETGEGKSAAGDDEDIFLM
DFLGQSRYVHFDQAIRDFTAQFAPATDNFALAETICRHINSSFIYEPGIT
DVHSTSTVVMALGRGVCQDFAHIMLAACRYLNVPARYVSGYLYGGSTPDG
RDEASHAWCEIYCGKEQGWIGMDPTHKTIFVDERYIKIGSGRDYSDVPPV
RGTFKGNATEKLTVSVRVSAME
>Cag_0924 oxidoreductase, short-chain dehydrogenase/reductase family
MHDKVFLITGASTGIGEATARRAVEAGFRVILVARSTDRLANLVAELGAT
HAHAIPCNVAEWQEQEQMVVQALERFRRIDVVFANAGFSKGSPFFGGENK
LEEWKEMVLVNMFGAAATARLTLPELVKNKGHFLVTGSVAGRSTSIRNLY
SATKWGVSGMAYAIRNEMAEHGVRVTLVQPGVVDTPFWDNLQKLGTPELQ
ADDVARAVLYAVSQPPHVDVNEVVIRPVGQPH
>Cag_1760 conserved hypothetical protein
MTDLSHETLLRLHSELAAEYELAESSYQFGGKPFSFLHVRDSYALLDRIS
PEEFVKDEQMPYWAEIWPSASALSTFFMDEVALEGKHLLELGAGIGVVSI
VAAWRGAQVVATDYSIEALRFIRYNSLKNSVALTAERLDWRQVQRSDRFD
YVVAADVLYERVNLLPVVLALDKLLKADGVAFIADPRRRMAEQFVELATE
NGFVVTTHARRCQIGEAPVMVNIHQISRL
>Cag_0981 conserved hypothetical protein
MNFIIKQATVSNRKDFLKILQCWNMQNGFLHDETELDYSNFFIAEVNNQV
VGMAGFMPIDGERYRTRLLAVYPEFRGTEIGKALQDRRLEEMYKRGAKIV
ETSVDNLEMKHWYKKHYGYTEIYKTKKEYEISFIDVDVVDVLYLNLIEYM
KNKIAFDSKKLRYMEKYEPHPLSPYPPLIINVALTGVIPTKTLTKYIPIS
VNEIIEEAINVYDAGASIVHLHAKDENGKACSDAKYYEKIISGIKKERPE
LICCATTSGRDGQSVEQRAEVLSLTGNAKPDMASLTLGSLNFLSGASINS
IDTVTELAYIMKEKGIKPELEIFDTGMVNLAQYLERHNIINGKKYFNILL
GNLNTAGATIKDLSHIYTSLPDNSIWAAAGLGHFQLPMNMASIVAGGHVR
VGLEDNIYYDLNKTKLATNITLVNRIKKIANELERPISTAQKTREILGI
>Cag_0098 YgfB and YecA
MNAPDPMMQPLTLQEFTILEEFLVSERTPEEALSSLEMLDGYMTAAIIGP
QAFEPKDWYALMWDKNKQLEPQFSSADEADMISELIVRHNNSIEAVFLED
PESFVPLFDRVAYENEEIHKLAVEEWCMGFLIGMELAYEAWQPLFDNEDA
AVMTMGFFMLSKVSDEFAHMTEREIEEITSTVGDAVIGIYLYWHGDDEMD
EEDDDELFRE
>Cag_1696 conserved hypothetical protein
MEKIAEELLKLIPFSKLFELFGLSKEISLALSTIFSIGIVALLWYGVKMI
YLRFQIAINAKDIKPYFNNANDIEKKLSLFIETWGQDKLPAREEEPIYTH
ESALNKSRLISHFIKKVFSADKSGEKFFLVLGDSGMGKTTFMVNLYVRCQ
SFINFRRKNKVKVKFFPFGYKGEILDKIKEIPQDEKINTILLLDAFDEYY
KLLPPDIPDGLSDDKRFRKVLDEVIDVVQDFRKVVITGRTQYFPGEDDKS
YILEIPTFDDNGFHKLNKFYISPFTEDDIRHYLYKKYGYIRFWNFKKREK
ALKLIFENLKETKFLLVRPMLLSYIDLFVNSNQIYKNIWDIYEALINEWI
EREGNKRKHDSIACQQLKENLHNYSQKIAVTIYENRKGMQIVSLTKEEAT
ENINDALKHYEVTGQSLLTRDAENKWKFAHRSILEFFIAKEAVKNQEFAN
KLDVTNLDMAKKFCIEKGLGYLFDYVPIKGGEFTMGSPDGEVDRSSTETQ
HQVKLHNFYIAKYVVTVAEFRKFIEECGYKTDAEKANSSRIWTGKEWKYK
AGVNWRCGVGGQLRLQNEENHPVIHVSWNDAKAYCDWLSKKTGKKYRLPT
EAEWEYACRAGTTTPFNTGDNLTTAQANYDGNYPYNGNAKGKYRQTTVPV
DSFAPNAWGLYNMHGNVWEWCSDWYNDKYYEVCKAKGVVENPECTEEQSY
RVLRGGSWGNDARSCRSAIRNLLRPRPPLQQRWLPPGFRPVASGVAHSTC
F
>Cag_0844 conserved hypothetical protein
MSWTTPAELKRQVQKLWDRGMLLATFCNGKALFPRRLMLKAPDARQLSTS
FPEVREWIAQLSNAAKHYRIVWRTINHRILGANELPAEIWIDSLDNALLL
IGKQREAQQFAAMVTLTRTMQPALLPWLEKRPLRALELAPEWHRLLSIVA
WRITHPKPAIYLRQIDLPGIHSKFIEQHRGVLGELFDLVLPPEEIDTTAI
GVGGFCRRYGFQDKPLRVRFRILDPALALLPTVSDHDITVTQATFACLEI
AVTKVFITENEINFLAFPNVPQAMVIFGAGYGFENLASVKWLHDCAIHYW
GDLDTHGFAILNQLRRFFPHATSFLMDSKTLMEHQALWGIEPSPETGELT
RLTAEESALYDQLRQNELGHHIRLEQERIGFEWLVGALGRGTEKAAV
>Cag_0253 hypothetical protein
MKDTVLFQQALCLPAPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLTSNDVVNGISILCTIPSPLINLAKNFAERCNSCK
LFLLRSSVVETDDCLTSNTDCEILLSSEHRHNREKLRDCLLLYAIEPTNH
FPHNAIPRFYAKPRNICHALYCSVLKAHYGKRNNTPYSPPLTVNTLKGQS
REMV
>Cag_0966 conserved hypothetical protein
MKLTRYIHRVVLLAVLGLQLAGCSGNEASRLRKDDIRFAAFYTDYLLLSG
VEPATKDEQLVALSPAMVDQLLEKHHLTRQSMSSLVANYRRNPEQWQVVL
EQVRGNLRNKAREANGE
>Cag_0847 Helicase RecD/TraA
MSVQDGQESYYYSPKERLSGAVERVTFHSQKNGFSVLRIKVKGRRDLVTV
VGATPSIAPGEFVECLGEWHNDSTYGLQFRATELTVVPPETIDGIEKYLA
SGMVKGIGPHFAKTLVYAFREDVFTVIEEEPERLLELPGIGQKRMEMVTS
AWADQKVIRDIMVFLQSHGLGTSRAVRIFKTYGNESILRVKENPYRLVLD
IYGVGFKTADALAMQLGIAPDSLIRAQAGVHHVLQEIASSGHCAAPREQL
VAEASRLLSIPEERTHEAIDAELRAGNLVREELRGVETLYLLSLHRAELG
VATSLMRLLEGEIPWRHLAIEEALPWVEAQNNITLSPSQKEALHTALTNK
VTVITGGPGVGKTTLVKSILLILQAQKVRVALCAPTGRAAKRLSESTGLE
AKTIHRLLEFDPLTGGFKHQRDNPLECDLVVVDESSMVDVVLMNRLLAAV
PEKAALLLIGDVDQLPSVGAGAVLADIIRSETIPTIRLTEIFRQAASSRI
IMNAHRINKGELPLRDESNTLSDFYLIAANTPEEIYNRLLTVITERIPAR
FGLHPVRDVQVLTPMNRGGLGARALNVELQKVLNGQVEPSVTRFGTRYAA
GDKVIQMVNNYDKEVFNGDIGHISAVEREDGAVLVDFDGTLVSYEFGELD
ELSLAYATSIHKSQGSEYPAVVIPLAMQHYNLLERNLIYTAVTRGKKLVV
IIGETRALAMAVKNHKAMRRLTGLAERLSALARYEANL
>Cag_1734 Dihydrodipicolinate synthase subfamily
MPTPYLSGSAVALVTPFKSDSSIDFEAIARLTEFHVAAGTNIIIPCGTTG
ESPTLAEEEQVAIIKTVVEAAQGKLMVAAGAGTNNTHHAVELAKNAEKAG
AAAILSVAPYYNKPSQEGFYQHYRHIAEAVAVPIIIYNVPGRTGCNVAAS
TILRLARDFDNVLAVKEASENFTQISELLEERPSNFAVLTGEDSLILPFM
AMGGNGVISVAANEIPAQIRQLVESAGSGDLVTARTLYSRYRKLLKLNFI
ESNPVPVKYALARMGMIEENYRLPLVPLSAESKRAMDEELTLLGLV
>Cag_0940 conserved hypothetical protein
MPTTYHMPPEWALHQATWLSWPHKLESWPGKFEPVPAVFAEIAAWLSTSE
EVHINVLDEAMANEARRLISKVQGVPLRMERIVLHTIANNDAWCRDYGPN
YVFHEKDGKRDKVIIKWKYNAWGGKYEPYDADDNVAFVIAAMQQIPLFET
AMVLEGGSIDVNGEGLLLTTEACLLNPNRNPFLNKEQIEKILGCYLGVQK
VLWLGDGIVGDDTDGHIDDLARFVNANTVVITVEDDPADENYPILQENYE
RLCTFTDLEGKPLNVVKLPMPSTVFYNNERLPATYANFYIANSVVLVPTY
RCANDAKAIEILQRYFPTRQVIGIDCTDLIWGLGAIHCISHEEPVL
>Cag_1758 photosystem P840 reaction center cytochrome c-551
MGSLLLGLSFLTGYKPPAENISHLLGPLGSFAGWLALILFASLTIMGLGK
MSGKISANWFLSFPFGIIIIVAVMFASLWFKPSGMLMSGAGRTTMDDGRY
IRSVEEYKDYLSKPVVSANVPKAPEGFDFDAAKSLFDAKCNVCHTFDSTK
DAFKTKYKKTGKIDVVVKRMQAVPGSGITTEDADKIMMYLNEKFQ
>Cag_0221 Anion-transporting ATPase
MRILTFTGKGGVGKTSVSAATAVRLSQLGYRTLVLSTDPAHSLSDSYNLP
LGAEPTKIKDNLDAIEVNPYVDLKQNWHSVQKYYTKVFMAQGVSGVMADE
MTILPGMEELFSLLRIKRYKTSGKYDVLVLDTAPTGETLRLLSLPDTLSW
GMKAVKNVNKYIIRPLSKPLSKMSDKIADFIPPTDAIDSVDQVFEELEDI
RNILTDTKKSTVRLVMNAEKMSIKETMRALTYLNLYGFNVDMVLVNRLLD
TQENSGYLENWKAIQQKYLGEIEEGFAPLPVKKLKMYDQEIVGLKSLEVF
AHDMYGESDPSVTMHDELPIKFVRRENVYEVQLKLMFVNPVDIDIWVTGD
ELFVQIGNQRKIITLPLSLTGLEPGEAIFKDKWLHIPFDLDKNNHTRSQQ
EQQAASRR
>Cag_0312 hypothetical protein
MNTHTILIDFPSDILLALNETEAELKLRIKTTLAMRLYQLQKLTIGKAAQ
LSGLSRIEFETLLSENEIPISNLTIDDVIDDYKKLK
>Cag_1930 ferrous iron transport protein A
MRLSELRVGDAGEIATVKSQGAVRRRIMDMGLIKGTPFRVVRVAPLGDPM
EIAFKGLHLALRKSEAQEIFVVKH
>Cag_1663 Malonyl CoA-acyl carrier protein transacylase
MKAFVFPGQGSQYCGMGRDLFESFPEARALMEQANEILGYSITDIMFTGS
EEELRQTKFTQPAIFLHSMVLATLLGNNGVSMCAGHSLGEYSALCFAGAM
SFEDALRIVAKRGNLMQEAGTQNPGTMAAIIGMADEALEGLLAEAQSDGI
VQAANFNSPGQIVISGDVAAVKKAVALAPSKGARMAKELVVSGAFHSPLM
KPAEEELAAALAQVEIKEARIPVCMNAVAKAVTSAAEIRCNLVLQLTSSV
LWTQSIEAMAQAGVTTFVEVGPQKVLQGLIKRIVKNVTIQTVDTAAELQG
MLP
>Cag_0857 conserved hypothetical protein
MSGIKYLLDTNIILGLLKATSTVLEAIGFRSIQAAECGYSAITRISNCYV
LFVS
>Cag_0737 nucleotidyltransferases
MIDSIEIVKKQIVDALMPLNPEIIILFGSYAYGVPNKNSDLDICIVEKEY
SNKWKEKEKIRKLLNKIDMPMDILNPKLDEFEFYKNEINSVYYDADKKGI
WLWKKNS
>Cag_1257 O-acetylhomoserine/O-acetylserine sulfhydrylase
MTYRFETLALHAAQPVDGTLSRGVPVYRTTSYLFKSTEHAANLFALKELG
NIYTRLMNPTTEVLEARMTALEGGVASVVVASGTAAIFNTIITLAEAGDH
IVSANNLYGGTYTQFDAILPKLGITTTFVDPKEPANFEAAITDKTRALYI
ETIGNPVLDFTDVKAIADVAHRNGLPLIVDGTFTTPYLLRTIELGADIVI
NSLTKWLGGHGAAIGGSITDAGRFNWAAGKHPLFTEPDENYHGLRWALDL
PEALAPMAFALRTRTVPLRNLGACIAPDNSWLLLQGIETLPVRMERHCSN
ALTVAQFLSQHPTVAWVRYPGLPNDPTYATASQYLTRGFGGMVVFGVKGG
YDAAVKIIDTIDLFSHLANVGDAKSLILHPASTSHSQLTQEQRIASGLSD
DLIRLSIGLEHPDDLIEALDKALQ
>Cag_0718 hypothetical protein
MIKPRSRNVAPVTPSLDDFIRQPEQPAARELEPNASRKFKTVSLPMNEYE
YSQLHATCKKTGRSEKNLLRYAMMLYAKEVLAE
>Cag_1223 Methylmalonyl-CoA mutase-like
MKPDFSRINLTTPSFTLPKKEATEASWLTAEGIALQATYHADDITDQHAL
HFAPGFAPYSGGPYSTMYTTRPWTIRQYAGFSTAEESNRFYRQNLAAGQK
GLSVAFDLPTHRGYDSDHPRVVGDVGKAGVAVDSVEDMKILFDQIPLGDI
SVSMTMNGAVLPVMAFYIVAAEEQGVPSEKLSGTIQNDILKEFMVRNTYI
YPPEPSMRIIADIFRYTSANMAKFNSISISGYHMQEAGATAEQELAFTLA
DGLEYLRTGLKAGLAIDDFAPRLSFFWGIGMNYFMEIAKLRAARMLWAKL
VKQFNPKNPKSLMLRSHCQTSGWSLTEQDPFNNITRTALEAMAAALGHTQ
SLHTNALDEAIALPTPFSARIARNTQLYLQEETAITKSIDPWAGSYYVER
LTTELAEKAWGIIEEIEAAGGMVQAIEQGLPKLQIEAAATRRQARIDSGK
EILVGVNYYRTEQKTDIELLEVDNSAVLKQQVERLATIKATRNNPACTEA
LQALEQCAASGEGNLLELAVNAARHRATLGEISFACEKTFGRYRSTTRLN
SAMYRDEMKDNQSFQEAQRLADHVAELEGRRPRIMIAKVGQDGHDRGAKV
IAAAFADIGFDVDISPLFQTPDEVVQQALDNDVHLIGISSLAAGHKTLIP
AIVEGLQKEGRSDIIVVAGGVIPERDYQFLYDHGVAAVFGPGTVIAEAAI
TILKKMMTNDTI
>Cag_1641 conserved hypothetical protein
MAEYDKDKQAKEYYLDTPVNNYGFFLKGAHSLDWGMKNRLSRIFRPDTGR
TVMFAIDHGYFQGPTTGLERPDVNIVPLMPYADAIMLTRGILRTTVPPSL
TKAVVMRCSGGPSILKELSDEELAVDIDDAIRMNVAAITLQIFVGGEYET
RSIKNMTRLIDMGLRYGIPTMAVTAVGKDMVRDAKYFRLACRMAAELGAQ
IVKTYYVGEDFETVTASCPVPIVMAGGKKLPELDALTMSWNAINEGAAGV
DMGRNIFQSDAPLAMMKAVNKVVHENLTPQEAFDYFNSIKAEG
>Cag_1599 hypothetical protein
MRPIRRGTSPIAGDYQDYDKAKPELISRLGSYCSYCERRIATQLSVEHIQ
PKALPKYAHFEGRWTNFLLSCVNCNSTKYTKDFLFTDVLLPDRDNTFAAF
TYLSDGTVQPSILASEKGLQAVAQKTLELTGLHKEAINTPDINGKQVALD
RVSQRQEVWAIAERQKSLIRTNPNNLGFRECVIDNALANGFFSVWMTVFE
DDTDILSRLIAAFPSTKNSGCFTENGLPLSPALNPDNLEHGGKL
>Cag_0500 Phosphate ABC transporter, permease protein PstC
MVLTTTVIYGILAGLIPLSWIAYRIGSGKAIGFEKTATLHSRPAMHGWYV
VATTALPSMLVLVVFALLHVTKLYQAPAILLFAALFVVVAGGFFLGVRTL
SPSLRARNHVESVAEWVLIAASCISILTTVGIVFSIVFEAIHFFQLVGFW
NFLTGTSWNPDTAFLEGAGRSGDESAKAQFGSVPVFAGTFLITFLALLVA
VPVGLYSAIYLSAYASYHFRQVVKPVLEILAGIPTVVYGFFATITVSPLI
VEAAEFFGLEASYTNALSPGIVMGIMIIPLVSSLSDDVINAVPQSLREGS
LALGMTVSETMKNVILPAALPGVVSAVLLAMSRAIGETMIVVMAAGLDAN
LTLNPLEGVTTVTVRIVDALTGDQEFNSPATLSAYGLGFVLLIVTLILNV
GSSVVVRRFRERYE
>Cag_0723 oxidoreductase, Gfo/Idh/MocA family
MKIGVIGVGKLGEFHTKLLTELAHERTDLHVAGIFDLNTQRAEEMAQKYN
VPRFNSVEELAKTCDAAVLATTTSTHFALASALLNEGLHLFIEKPITTTV
EEADELIRLEAENNVRIQVGHIERFNPALRTVEQWIGRPMYIQAERLSGF
SLRVTDVSVVLDLMIHDIDLVLSLIQSDIKHIAASGVKVFSNELDMATAR
IDFVNGATANVTASRLSRSKMRKLRFFCTEPKSYASLDLTSGKSEIYRLV
PPDMASSKNPLKSFAARKILEQFGEIQESLNGKVLDYIHPEVPKVNALRD
ELEYFINAVRDNAPTVVSALDGRRALFVAGKITDEINASTALLHD
>Cag_1163 oxidoreductase, FAD-binding
MIVSTKQETIAGYLNDTSHIKAGHSPAIYFPESVEEIAELLSNATTDGRR
FMLSGNGTGTTGGRIPFGDCILSLQKLNHIGRVTEQGEGKATITVQAGAL
LGDIQKVVEQQGWLYPPDPTEKLCFIGSTIANNSSGARTFRYGATRNHIE
RLLIVLPTGDMLNLRRGTCFADADGFFNLSLLSGKELRLQRPNYVMPNTS
KHNAGYFTKEGLDALDLFIGSEGTLGVIAEAELRLIALPEAIISSLVYFS
HIDDLFTFIHELRHPEDGVLPRAIELFEKNALHFLAQRYPEVPAESAGAI
FLEQEITAATEERHLDRLLALMERCNAMSDHSWLALDPQEQARLREFRHS
LPLLVNDWLNQQQESKISADMAVPEANFRELYNYYCSACEGEGFVYIIFG
HVGNDHVHLNILPRNHEEFVRAKALYRVFIKKVMELGGTLSAEHGIGKLK
AEYLSSMYSAEAIREMVRIKRFFDPHLTLNIGNVLLKEYLIDEKSSVM
>Cag_0597 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_1572 conserved hypothetical protein
MQKATISILGCGWLGLPLAKTLIAQGYNVKGSTTSEAKLDVLQEAGIEPY
LVTFEPEIEAEDAVSFFQSDILIVNIPPGRREDIVEYHIMQFSSLIDALG
QSPVRSLLMVSSTSVYPSLNQEVIEEDAVDPESPSGQALLMVEEMLMQES
GFQTSIVRFGGLVGYDRTPARYLSALKEITNPNHPMNLIHQDDCVGIISE
IIRLEQWGEVFNACSPIHPLRSEYYNRAADDAGVARLPLGAVDDSMGYKI
VSSEKVVKALNYTFKHSDPIG
>Cag_0724 conserved hypothetical protein
MQQIPFGIQTFKKIRQNNLVYIDKTADIANLVAQHNAVFLSRPRRFGKSL
LIDTIQDLFEGNKTLFEGLYIADKWNWTTTYPVIKIDFAAGVLHSVDDLK
SRIRKILFDNKQRLQITCEFLDDRDLAGCFADLISKAHEKYQQPVVVLVD
EYDKPILDNIENVDIAIQMREGLKNIYSVLKAQDAHLRFVMLTGVSKFSK
VSLFSGLNNLDDITLDATYATICGYRQVDLETSFAEHLEEVDWERLKLWY
NGYNFLGESVYNPFDILNFIKKQHTYRSYWFETGTPTFLMKLFAKERYFL
PNLENVEVGDEILDSFDVEDIQLETLLFQTGYLTIKQRVEMFGNLRYQLK
IPNQEVRVALNNHFINVYTAQASVQKYAQQKRFYTYLMAVDMLGLQQALQ
ALFAGIPWNNFTNNDLPQFEGYYASVLYAFFCSLNAMVIAEDTTNQGQVD
LTIIFDTLIYIIEIKRDTSETYQVSPENVALQQLLQKRYFEKYQGQGKEI
VQVGMIFNTVQRNLVQLDWAKP
>Cag_0643 Proton-translocating NADH-quinone oxidoreductase, chain M
MLSLIVFLPLIAGLIILAVPASQKQVIRIVSLLAALVQMVLAVMIWRDYD
PSLAGITAGAGGTLAGSFQFVERLPWISLDLGSFGPLTIEYFLGVDGLSI
TMIILTALVSAIGVLSSWTIQKQVKGYFILYNILATAMMGCFVALDFFLF
YVFWEVMLLPMYFLIGIWGGPNREYAAIKFFLYTLFGSVFMLLVMIGLYF
SVIDPLTGNHTFSLVAMASQENYVKGAILGPDSVFWRYAAFIVLFVGFAI
KVPMFPFHTWLPDAHVEAPTPISVILAGVLLKLGTYGMMRINFPLFPEVF
QASLYVIGIFGAINIIYGAFCALAQKDLKKMVAYSSISHMGYVLLGLAAG
NSEGMLGALYQMFNHGTITAMLFLLVGVIYDRAHSRQIEKFGGLATYMPV
YAAFVTVAWFASLGLPGLSGFISEAFVFVGAFSAEVTRPIAIVSVLGIVF
GAAYLLWSLQRMFLGQRRADALYDVVEDEHGHKHIHFHDWNGKLDLDARE
LTMLVPLVIITIFLGVYPMPIMGLLTSSINKLVQVLSPVVLSQM
>Cag_0171 Phosphoribosylglycinamide synthetase
MKVLIVGSGAREHALAWAVAQSAEVAQVFVAPGNGGTAQMGGKVCNCSIK
ATAINELLAFVQQESIGLTVVGSEQPLELGIVDQFRSAGLAIVGPTQYAA
QLESSKVFAKAFMQRHAIPTAGYQRFCDVASAQTYLQQPQLPFPQVIKAS
GLCAGKGVIVAMNKAEALAAVSDMLEDRIFGDAANEVVIEAFLQGEEASV
FALTDGVSYKLFLPAQDHKRVGNGDTGKNTGGMGAYAPAPIVTPEVMQKV
EERIIRPTLQGMAAEGSPYTGFLYVGLMIDKGEPSVVEFNARLGDPETQV
VLPLLKSDFFAALRASVDGTLESAPFEMYAKSATTVVVASQGYPDSYTTG
KPITIAPEAATMEEAIIFHAGTALQGNSLVTAGGRVFSATALGNSLQESI
TRSYALVQHITFEGAFYRNDIGVKGLRA
>Cag_0776 arginine repressor
MNKQARHFKIREILQRHSVENQHDLLQLLREQGMSVAQATLSRDCAELGL
MRVRVNGSYRMVVPDDNAGRIIKGLVGIEVLSINANETTVIIRTLPGRAH
GVGSYLDQLKSPLVLGTIAGDDTVLVIPASVHNISSLIAYIHSNLSKT
>Cag_0244 hydrogenase/sulfur reductase, delta subunit
MKRPAKLKIASFDFTCCEGCQLQLANSEASLADFLALLDIRNFREISSER
YDDYDIALIEGSVTRQDEVERLLAIREQAKVVVAFGSCACFGGVNSLKNY
HTISDCVHTVYGNMAVETLPVRKISDVIAVDFSIPGCPVSKAEVERIVVS
FAMGSLPTLPTYPVCVECKQQLNTCLFDLGEICLGPISRAGCNAPCPTAQ
SGCLGCRGAADDINMAAFVALVQERGLSLNKLREKLAFYNAFDSLPQ
>Cag_0495 GreA/GreB family elongation factor
MNDRIYLTNDGYNRLKEELYVLVHQTRKEVLEKIAEARAHGDLSENAEYD
AAREEQSQIEARIGDLESKLSAATILDPKQIKTDKVYILTSVKLRNLDNP
DEIIEYTLVSSEEADTDAGKISVRSPVGRSLIGKTVGAKVTIVVPKGELH
YEILEIFVKP
>Cag_0781 Sigma-54 factor
MADFTLQQKQVQHLSAQQILGSQLLQLPMQQLEERIYQEVQENPMLELVE
APRDGQIDGVVAEPSNGAVGEMFDSIDRFSRASLNSRVHSGSPSSGQGGS
DDSKERFFQAVQHDSLAELLCRQLALQEHIGEREMAIAEEILGNLDSDGY
FTESIELIVASLQQAEVIVSNAEVEAVLHSIHFLDPAGIAVRNVQQRLMV
QLQVAAHRYPAATYNVAMRLLGDYYDDFLNRRLDMLLKKLGVPKAELEAA
VTAIIALDLHPGVFYDEGGHYISPDVIVTYENGELTAALNDRSALSVKVT
DRYRELLANRKAPKEEKQFIRHNIQRAQDFATALAMRRQTLLKVMEALLK
QQYAFFVSGPEHVVPLGMKSVAEETGLDISTISRAVNGKYVQTRFGVFEL
RYFFGSALSTDEGEELSSKIIRQHLAEIIKAEDSAHPLSDDTLAEMLVSK
GIRIARRTVAKYREQMQIPVARLRKKIF
>Cag_0797 citrate lyase, subunit 2
MSIVANRDTRAVIIGGVAGVNAAKRMAQFDFLVNRPLTVQAFVYPPEEGQ
QKEIYRGGELNNVTVYATIEKALAENPEINTALIYLGASRATQAAKECLT
ASNIKLVTMITEGVPEKDAKILRKMANDQGKMLNGPSSIGIMSAGECRLG
VIGGEFKNLKLCNLYRQGSFGVITKSGGLSNETMWLCAQNGDGVTSAVAI
GGDAYPGTDFVTYLQMFENDPATKAVVMVGEVGGTLEEEAAEWYGREKRR
IRLIASIGGTCQEVLPQGMKFGHAGAKEGKKGLGSARSKINALREVGAVV
PDTFGGLSKAIKQVYEELKAVGIIQPEAEIDETVLPDLPLSVQEVMKQGE
VIVEPLIRTTISDDRGEEPRYVGYAASELCEKGYGIEDVLSLLWSKKLPT
REESEIIKRIVMISADHGPAVSGAFGSIIAACAGIDMPQAVSAGMTMIGP
RFGGAVTNAGKYFKLGVKEYPNDIAGFLSWMKQNVGPVPGIGHRVKSVKN
PDKRVKYLVDYVKNHTSLHTPCLDYALEVEKITTAKKGNLILNVDGTIGC
ILVDLDFPEYSLNGFFVLARTIGMIGHWIDQNNQNARLIRLYDYLINYAV
KEERPVPDKK
>Cag_1007 CRISPR-associated protein TM1801
MSTLNQKIDFAIIMRVTNANPNGDPLNGNRPRTDLDGHGEMTDVCLKRKI
RNRIMELKDKEQKYQFDIFVQPDDSKRDSHTSLKARFESEIGKNVKDKDD
AAKKACKKWFDVRAFGQLFAFDGEESSGLSIPVRGPVSIHSAFSVEPVNV
SSIQITKSVSGNEGKNGKRSSDTMGMKHRVDYGIYVTYGSMNPQLAERTG
FSDEDAKVIMEILPKLFENDASSARPDGSMEVVSVIWWKHGSKAGKHSSA
KVHKSLHVNEDGTYRLDDLEGLTPECINGF
>Cag_1771 conserved hypothetical protein
MTFADFSEKKFPIKIQTMKHPEEKFLTQAERQRIEERIAAAEKRTSGEIV
VKVVAESYHYPLEAMQGSLLLAIMVGIAVALLISEETMWLFFTFFGFSFF
SAPHTESMWFFLAIFSVSFMVFHELLKRVSFLKRIFVRKANMREEVEEGA
TYSFFRRNIHHTVNRTGILIYISLFEHRVRVVADQGINEKVTQSDWDEVV
TIIINGIKTGKQADAIATAVDRCANILAAHFPITTGDRNELSNKVILGRN
G
>Cag_0947 D-alanine-D-alanine ligase and related ATP-grasp enzymes-like
MNNSVGFRIECKSTHIVSGYLLGMTQPSLVAQLQFGETISYKALIDRLVL
CLSNYLPPQQLKELSFAQNDAQSFAKVIVLIVTGLQESVGLPVLGIAKVM
NQSNKALKELEKPVYLFQLFFPSFEPQAAKLLLEWLLNTLNHLKEKQTAL
SETQHKALQQLFQKLRVMAPGGTNNIRFIRAAHTLGIPLLSLPGGVFQYG
WGCNARWFDSSITDATPAIGVKLARNKLVTNALLKIGGLPVPEGRRVSSL
EDALLYAKQLGYPVVIKPADLDQGAGVYADLRNSDEVREAFAHARKLSPT
IMLERHSVGKVFRITVFRGEAVAVVERLPAGVLGDGVSTVQALVEKVNKD
PRRSKTSFSLMKPIVIDNEAQMMLDREQMSLNTVPLAGQFIALRRAANVS
TGGDVILLSPHEYDASYANLARRAAALLRLDVAAVDFIAHDIGKPWQQAF
ATVIEVNAQPQMGGVRTNLNKQLLASYVHGDGTIPAVMVLGADSATIVRQ
VRERYGNELSGLGSVSSDGVFIGNNAIGNGGKNIITEIQALIVDSTITAL
LFAGEHNNLLSQGLPLPFCHYLILSDCSSEVNDIARQLDIFHKHIKNEVW
LVQGHVLQGHVERLFGQDKIRLFVSIMEIVTAIKQTFDVNIMSAKIN
>Cag_0169 Tetraacyldisaccharide-1-P 4'-kinase
MSNPLRLAFRPFALLYEAIVQTRNQLFNRAVLRAWESPMPVVSVGNLSAG
GTGKTPMVDWVVKYYLSIGFKPAIISRGYKRQSKGVQLVSDGNNVLLSSR
EAGDETAMLAWNNPDAIVVVASKRKQGVKLITKRFAQRLPSVIILDDAFQ
HRQIARSLDIVLVNAEEPFVEAAMLPEGRLREPKKNLLRADVVVLNKITD
LEAATPSIKALEEMGRPLVKARLSTGELICFSGDATTLDEPATAHHLNAF
AFAGIAKPESFVTSLQHEGVNVGATRFVRDHAPYSAKMLRAIRRQAEEQG
LCLITTEKDYFRLLGQPELLSIITALPCYYLKIAPDIFDGKALLQEKLNA
VVHYVPKPEPPKKIEEPYRRW
>Cag_1267 ThiG protein
MDSLHIGSYSFSSRLILGTGKFSNSKVMLEAIRASGSQLVTVALRRFNRE
QAEDDLFGPLSALEAVTLMPNTSGASTAAEAIRAAHIARQLSGSPFIKVE
IHPNPQHLMPDPIETFEAAKVLVNDGFLVMPYIAADPVLAKRLEEIGCAS
VMPLGSAIGSGQGLSTFEMLKIIIRESTIPVIVDAGLRSPSEAAHAMEMG
CSAVLVNSAIAAAGNPPAMAEAFSEAVIAGRKAFQAGLMEQSGEAVATSP
LTFLGA
>Cag_1154 UDP-3-O-(3-hydroxymyristoyl) glucosamine N-acyltransferase, LpxD
MMTIQEIYEYLSRFFTPVELIGNGEELIHAPAKIESAQAGEVTFVANKKY
LRFLALTEASLVIVERSLAVEEYVGKHSFLKVNDPYSAFVFLLQRFIPPR
RIAKQGIAATASIGSNVTIGENVSIGEYAVIGEHCSIGNNTVIAAHSVLL
DHVTIGSDVVLFPHVTCYDGTRIGNRVVIHSGAVIGADGFGFAPQQDGSY
IKIPQIGIVEIGDDVEIGANTTIDRATLGSTVIESGVKLDNLVQVAHNCR
IGAHTVIAAQAGVSGSTTLGNHCIVGGQVGFAGHIEVSDHIQVAAKAGVS
KSFMQSGIALRGYPAQPMREQLKYEAQLRTVGDLHAKLKALEQELKALRG
SEMPLQNTL
>Cag_1082 ferredoxin, 4Fe-4S
MALFITEECTYCGACEPECPVSAITAGDDIYVIDAAGCTECVGYADAPAC
VAVCPAECIVQG
>Cag_1954 iron-sulfur cluster-binding protein, GltD family
MKVESNPILDFAINYQFPPFEELTGTHKIVAFGDHSHKCPVYVRQTPPCQ
AECPAGENIRGYHRFLNGIDKSEDEWKSAWETLVEINPFPAVMGRICPHP
CQSACNRQYHDESVAINAVEQAIGNYGIQAGLQLPEPAPATGKRVAVIGG
GPAGLSCAYQLRRRGHAVTLYDANEKLGGMVLYGIMGYRVDRKVLEAEIQ
RIINLGIETKMGVRVGSDVTLDELEQEFDAVFIGIGAQAGRSLPVAGAAE
TQGVTNAIEFLRSYEVEGDNITIGKKVLVIGDGNVAMDVARLALRLGSEA
AVVAGVPREEMACFKEEFDDADHEGAVMHFMSGALELLKNDDGSVRGLRC
AKMVKKAKGEEGWNSPIPFFRYKNSDETFDIEADTVVAAIGQTTNMQGFE
AITNGAPWLKVDRSFRIPGREKLFGGGDALKVDLITTAVGHGRKAAEAID
AFLKGEPMPDQGYREVTKVSRQDVLYFPVTPPAKRDTIKIQEVVGNHDEL
LVALTPEQAKAESGRCMSCGLCFDCKQCVSFCPQEAISRFRDNPVGEVVY
TNYDKCVGCHLCSLVCPSGYIQMGMGDGL
>Cag_0353 Ribosomal protein L11
MAKKVIGFIKLQIPAGGANPAPPVGPALGQKGVNIMEFCKQFNAKTQADA
GTIIPVVITVYSDKSFTFITKTPPAPVLLLKEANQKKGSGEPNRNKVGTV
TGEQVRKIAELKMPDLNAVDLAGAEAMIRGTARSMGIVVQD
>Cag_1678 hypothetical protein
MFISVHPDAVLDDVDEAFDEPNYNNYNTSPEPPSPYADLEKKYRKNRFNR
QRQHSNSNPLKDVTAVNGNRITSTQKANGSKPQKPKAYKPKSKPSNSTVA
AAPKSAQATYSKHSAHPTKSTSPAKASTNGKTFAKVERTPLSPRAAHTPS
ASHTPHKPHASNSPHKTHTTHSPKTAHAEKSSQLTNQAKSAPTNQQAVRP
SRSPQLSQLPPLPKSSQSSQSPQSPQSPQSSKTSKSPHSTTKPHSSQTTH
SPRQSRPQQNKKRPV
>Cag_0808 specificity determinant HsdS-like
MSQIELTTLGKSCEFFNGKAHEKSIDENGQYIVVNSKFISSEGKSFKRTN
EQMFPLYKGDIVMVMSDVPNGKALAKCFIIDKDDTYSLNQRICCIRSNKF
DTKYLYYQLNRHEHFLAFNNSENQTNLRKDDILACPLIKPSMEEQQRIVS
ILDEAFAAIDQAKANAEQNLKNAKELFDGYLQSVFENQGDDWEEKKLGEV
IKLEYGKPLDETKRKSNGKYPMYGANGIKGRTDEYYHDKKSIIVGRKGSA
GEINLTENKFWPLDVTYFVTFDEKIYDLMFLYFLLSRFDLPKLAKGVKPG
INRNEVYEIQALFPSLEEQQTIVRQLDTLRAKTQKLEEIYQRKIADLEEL
KKSMLQKAFAGEL
>Cag_0567 conserved hypothetical protein
MDKFDLFFDQLGDIQQMVNEKKRLQAEVAQMEQECAAKMQPLKDELNAIY
RQLEARIPKFMNQGGSTPRTSSRIPRGKLGESIKNLLRSNPEKAFKPREI
AEALDIKGTAVSLWLNKSGQEDPELKRIPTGPEGKRFVYTVN
>Cag_1149 hypothetical protein
MLRKKLPLLLCAAALLSPMNSLYAGKPTAQAKSSVADGAAAASSNQASMS
AIVPVADHTKGIYLVLTEADAMTQMMALVLATQHLEQGKTVQVLLCGPAA
KLAVKNCKEEPTIFKPINKSPQMLLAVLLSRGVQVEVCPLFLPNSNMTQE
QLIAGVTVAKPPVVASQMRADGIKTMNF
>Cag_0248 protoporphyrinogen oxidase, putative
MMQNDVVVVGGGISGLSLAYYAVKSGLKTTLLEKNETLGGSFASPRYKSG
GKEFWMELGAHTCYSSYQNLLSIVEECGIMDTIIPRAKVPFTLLIDGEIK
SLMSQLHIMELLTNGPSIFSLKKEGQTVRSYYEKIVGKRNYSEVISHFFN
AVPSQKTDDFPAEMMFKSRPKRKDVLKNYTFKHGLQGVAETIAATRGLNV
FTGQDIVNISMDNGNYVVSASNGVSYAAKTLVMATPSAISAKLLQEVNGD
LAAHLSKLKAATVDSIGVVVPKDAINIKPVAAIISPNDIFFSVVARDTVP
DDNYRGFAFHFRPGLDDRTKRDRIAAVLGINFSKVEHTVSRLNTVPSLRL
GHNAWLEQTNALLKNNRSLLLTGNYFGGMAIEDCVSRSRSEFERLSGK
>Cag_0545 hypothetical protein
MSTINIQLPNSLHIKMQEVARQNGVSLDQFIATAIAEKLAALMTVNYLRE
RTERSSQEDFERALSEIPDVAPEEFDKL
>Cag_0508 Rab family protein
MKDLDLIKLIEDKVNIKLALFENLSSINTGYKLNAQGDVSELSLSKCKIS
SLYLFIDILSQFKHLQILYLNDNNLSDVALLSKLTQLKALVLLNNPINSI
PFILTNLPLEFEWKNTDYGHIGFITLYNNPLTDPPPEIVAQGKEAVRSYL
LTRQKAEAEGQQMQVLHEVKVHLVGDGMAGKTSLLKQLQGLAFDKNESQT
HGINVVSLQAPQLKGCKINNELKESIFHFWDFGGQEIMHASHQFFMSSRS
IYILLLDSRTDSNKYYWLRHIEKYGAKSPIIVVMNKIDENPSYNIQQQSI
NQQFPAINNRFHRVSCNSGEGLDGVVLSLIAVLNDEGCLYGAEFPPAWLA
VKKALVTATAQERHISRNRYEELCSEQGINDAHERDTLLGYLDNLGIVFY
FKESHFTNNIYVLDPHWVTVGVYRIVNSAKTANGIFKLADLEYILNQEEI
SSSSYTPAQKKHFIYKGDEQRYLVDIMKQFELCYEEENGRFILPSNLSKE
PQTALPNLEDETPLRFIMQYDYLPAPIIPRLMIAMKDDVVEELRWRYGMV
LQSKNLEGVQASVIANQEKKEITIIVKGPDRYRRSYFTIIWNHLRTINQR
FENLQVKEFIPLPGYPNEFVEYEELLGHARNGRDNYFSGKLGQAFSVSTI
LNSIISTEERYRKSENMEVQITFSPIIKVEPPVHTQNVTVQLELHQHIEE
LQGRFRILKEDLLDEVEIELEDPKEQKRLFNELNKVENAITEMAEAANSE
QPKRLSSAVRERLDGFLESLHNPDSRLNKVLQAVEKGAERVKTLSDIYQK
CKPFFEQFPIS
>Cag_1040 hypothetical protein
MIDIFFTNPMRLIRSCTSAVEDDSHHSKQHVNRRKTMKKTIWLAAGVAGM
LLGNPTVNAQAETRDIIQPRSDHSFVIDARPSFIYLPDQGFAVSVDSPYD
IISGDDHYYMNQKGSWYRSSSYRGPWKLRKEKNLPSKIKKHRLEDIRMYR
DAEYNKIINQRNSQQQRMDNNRPR
>Cag_0823 aminopeptidase
MKRCCNISLLKELTLYYPLFMNIQVTVEALANVQAELLALPCSRQEIKEQ
GETLLASFGIDAAVLSDFKGDAGEMVMLYGLSGGYGAKRVALLGMGDNAK
LDDFRKAAVAFASRATDLKVENAALDASRFAAFAESLGESVSSLASIVVE
SAFRGTYRFNRLKSGKTDKKADDEAASQKPKELATLQLCSTEASAHAELE
RGVAEGVVLGSCQRMARELVNLPGNYLNAEQLAEAAVASGKRAGYSVKVL
HKADIEALSMGGLLGVNKGSHNAPTFSILDYTPKGEAQAVIALVGKGVTF
DSGGISIKPSENMGEMKSDMAGAATVIAAVEAAARLELPLRIVGFVPATD
NMPSGTAQQPGDVLTTMSGITVEVSNTDAEGRLILADALTYAVKNYQPDA
VIDLATLTGACIVALGYSVAGLFSNNDALANKLFTAGERMGEKVWRLPLW
DLYDEQIKSDVADVNNTGGRGAGTITAAKFLEKFIDGHSQWAHIDIAGPS
FVPKSGGKANGGTGFGVRLLVDAFKHWA
>Cag_1862 polysaccharide efflux transporter, putative
MSRNSLVAGQAGFAFAGLLFGQLMRFGYNLVVARLLGVEALGIYALAIAV
MQVAEVVALAGCDASLLRFVNLYHNDAARQRQVIGFAAKSSLLFSLAVMA
LLMLFANQLSALFHGNELLTLALSCYAAALPFNVLTQVTAHALQAFQHLK
PKIIATQLLSPLLLLLFTLLFYYTVGIQAALLMPFLLSACGALLWILLPF
ATTTGIRFIDIVRARHDNAMLTYALPLMAVSLFSMLSHWLDVMMLGIFSD
AVTVGLYHPAARTAGLLRSVLLAFAGIAAPLFAELHAQGNKAEMARLYKL
VTRWSVILLIPPLLIFMVLPQQVLSLFGAHFADSGAVALQLLSAAYFVQC
VFGIASTLLAMSGYAQLSLINAVVALALQAGLNWLFIPTMGLQGAAVASL
VLFLLLSALRWLEVRLLLQMNPLSTMLWKPLVAGAVTFLLLMLMHSWLLM
LPSLLALGVGTVIAFSCYVALMLMLKLEVDEKEIIFKYLPFMRKDG
>Cag_1570 virulence-associated protein D
MFAIAFDLVVADTSQNHPKGVAQAYSDIGSTLAAFGFTRVQGSLYTTNSE
DLANLFSAISALKALPWFSASVRDIRAFRVDQWSDFTNFIKL
>Cag_1880 hypothetical protein
MELSYEWLLTSITTFFTEQFYLLPEEYQVGLNIFFLIVAFLAASGLLLIA
LKKLWSFIRTVLTQRKVESGYRLQFPGSYSWGNALMRLPLVLIGTDIFCF
AQLSQKGKLLRYIKSASVLIPTLWSVFGLVVFMASTGRAYQTHELYLAAL
PVLLVGGTIFFLDLSIISAGGTIKAKSIRFFLAMVTGYIFSSVPLNYYFN
ADISSYMLKHDDQIAAVESSYGKRIAAIEQSGWYQQYYSLLRQEEALRQD
LMDERKGIEGAGLSGKPNALNPTLNVHYNAIESELQQLQQTRADYVRQYQ
PKLDELQQLITLKSKKVVEIAKNNEGSHIKRHHALWEYALSSTSTALFFL
AAMILFWAIDSLSILCSYIDESEYNFLCQERNEEMREFFGSRTVRQRFNP
VQTSELR
>Cag_0622 deoxycytidylate deaminase, putative
MSNQPLEGSSSPIASVPPQRLSWDEYFMSVAHLVSRRATCTRAHIGAVIV
RENNILSTGYNGAPTGLPHCHDDNCRIYRCTHPDGTVEENCVNTIHAEIN
AIAQAAKHGISIRDSDIYITASPCIHCLKVLINVGIKTIYYDKPYKIEHI
DELLRLSNIRLVQVQMADSLKDL
>Cag_1375 conserved hypothetical protein
MAIEIRHISNSDKKARKEFIKFAWQIYRNNPELNRNWVPPVIEDYMKTLD
TTIFPLYDHADLAMFTAWQDGKMVGTIAAIENRRHNQVHNDKVGFWGFFE
CVNNQQVANALFGAAAAWLRKKGLNAMRGPVSPSMNDQCGMLVRGYDSPP
VFLMLYNPPYYNDLVRNYGHRIGQELLAWYIDQTLIDIERLRRIAAHVMK
REELTVRILDMKHFDRDIEIVRNIYNKAWEKNWGFVPMTDKEFDMLAKSL
KPIANPHYVYFVEDRNKRTVGFSLSLPDVNQALKHVNGNPFSPIGLLKYL
WYSRNITMVRTIVMGVLPEYRNKGIDSIMNVQIADYGGQHGVFASEMSWV
LKANEAMSKLAQVIGGKPYKEYIIYEADI
>Cag_0866 hypothetical protein
MSANLLIIGDTKARALYDVDADALYLPISGRHLESAKALSLPLVMVDDEG
IFLAPSHWKRLLPESSSTIDTIERGLLKMGRDARQEL
>Cag_0009 Peptidase M50, putative membrane-associated zinc metallopeptidase
MDTTFYFIIAIFILVTAHELGHFLTAKLFGMRVEKFYIGFDFWNLRLWSK
QIGETEYGIGLIPLGGYVKISGMVDESFDTDFQGKPPQPWEFRAKPVWKR
LIVLAGGVAMNMLLAAAIFVGVTMSIGESRTSVSTPAYVEQGSVFADMGM
QTGDLIQAVNGKAVESWEEALDPEFFTASTLTYTLLRNGQEVTVTAPSNI
MSLINDQKGLGIRPVMPPLIGEVLPDMPARAAGIQPNSVIVAINGKSVVD
WHEVVGTISANAGKPLQITWKHLAFADGKEPSVADIRASGEMFVATIVPT
EAGKIGMALQQTIASERRKLGIGESLTSGVQQTWKATVMTVQGFGKILTG
KEDLSKSVGGPLKIAEIAGQSARQGVLGFLFFLAMLSISLAVINILPIPA
LDGGQFVLNAIEGIIGRELPFELKMRIQQIGVALLMSFFAFIFINDILNF
FKR
>Cag_1138 putative metal-dependent hydrolase
MLQIGNYRIAALLVQEFALDGGAMFGVVPKVFWQQQAPADALNRVTLAAR
LLLISGAGRNMLVDVGLGDAWNDKQRSIYAISPFRLREELQRFQLTADDI
TDIILTHLHFDHIAGAFAVENGGLVSLFPAATFHVQERNLQVACQPHIKE
KGSYLSPYIDALMQQCNVSLKQGECELCEGVSLLVSNGHTQAQQLVKISD
GKQTLLHGGDLLPSAAHLPLAWITSYDVEPLQAINEKTALLETAMDEEWL
LFFGHDPRYAAARIRRGEKGAEVAEYFEEL
>Cag_2007 DNA-directed DNA polymerase B
MENLINHIVTNNLLFGKDKEERIVGAYQLSDTHIRLFNRNGDTVTFHDEP
FYPYFFLSDSSLLETFVPENQEKFWLVPLAGSNYYTALAIFKSSRNHKNA
VDFLNRKWNGNQAAQGEAAGKNSMESNPFMYNKGDTITQYLMQSGKTMFK
GMLFDDIYRMQLDIETNYNGEKKGFYDDEIIIISLSDNRGWEQPLHSKGR
NEKELLQELIAVIQEKDPDVIEGHNIFNFDLPYIQRRCERHSIPFTIGRN
QTIPRTYPSSIRFGERTIDFPYCDIPGRHVIDTLFLVQGYDVAKRSIESY
GLKNVARHFGFASANRTYIEYKDIARLWQEEPNTLLAYALDDVRETQALS
SLLSGSNFYMTQMLPYSYAMTARLGQAAKIEALFVREYLREKHSLPKPTS
GQQQSGGYTEVFLKGILGPIVYADVESLYPSIMLSYNVCPKSDALRVFPN
VLRSLKELRFKAKDQAQQELQAGNKRNADNFDAMQASFKIIINAMYGYLG
YSGGIFNDYGEADRVTTTGQGIARKMIAEFEKRGCKIIEVDTDGIFFIPP
ASIASEQEEKALVEEVSQQMPDGINIGFDGRFKKMISYMKKNYALLSYNN
VMKLKGSSLNSRSAEKFGREFIRRGFQMLLAEDIKGLHLLFAEYKEKILN
HQLSIEEFSRSESLKQTKEQYLEDVASAKRSKSITYELAIRKGMEIRKGD
KISYYITGSGSSNFSWDKGKLAAEWDPNKPDENSAFYLKRLDEYSQKFLP
FFKPQDYSMIFSTGSLFAFSEEGIELLKEIPNTDSQTE
>Cag_1643 conserved hypothetical protein
MNYSFKTLWNRAFYFISPLWCLLVWIIWSTDQLQDPADKIVFISIVIPGF
FAVYVSGFLIEKWHNNKQQKLK
>Cag_1688 Leucyl-tRNA synthetase class Ia
MKYDFTSIEKKWQTRWQAEATFATGTDHSKPKYYVLDMFPYPSGSGLHVG
HLEGYTASDIIARYKRSSGYNVLHPMGWDAFGLPAEQFAIKTGTHPRITT
EANVKNFKGTLQAMGFSYDWEREINTTDSGYFKWTQWIFLQLYDRGLAYM
SEVDVNWCEELKTVLANEEVDEKLADGYTVVRRPLRQWVLKITAYAERLL
ADLEELDWPENVKQMQRNWIGRSEGVEIDFELRCHRTTLKAYTTRPDTLF
GATYLVIAPEHPMAEKLATAPQLLVVKEYITKAKLKSDLERTGLQKEKSG
VFTGSYAINPATGQPLPIWISDFVLISYGTGAIMSVPAHDSRDWAFAKQY
NLPIIEVIKSPHDVQEAVFEEKNSTCVNSANDEISLNGLDFATAFERMAT
WLESKKVGKRKVNYKLRDWIFSRQRYWGEPIPIKHYEDGTIRPETNLPLE
LPAVEAYHPTSTGESPLANITEWLIGNDEHGAFRRETNTMPQWAGSCWYY
LRFIDPHNHAQVVDGNNERYWMNVDLYIGGAEHAVLHLLYSRFWHKVLYD
LGVVSTKEPFQKLFNQGMILGEDNEKMSKSRGNVIPADHVLQRYGADAVR
LYEMFLGPLEQVKPWNTNGIEGISRFLGKVWRFVYPEHSEAATQPSNEPL
PDELLRRMHKTIKKVGDDTSSLKFNTAIAEMMVFVNELTKTGCNNREAIE
TLLKLLAPYAPHMTEELWEALGHTNSISHEPFPTFNPALVEENMAIIAVQ
VNGKLRGTFTVPAKSPKEMLLEEARKVESVAKFLEGKTIVKEIVVPDKLV
NFAVK
>Cag_0846 cobalamin biosynthesis protein CobP
MKKIIYVTGGARSGKSSFALQQALTYQNRVFLATAEPFDDEMRQRISKHQ
QERLEHFTTIEEPLNLEQALKMLPSDTEVVLLDCLTVWTGNLMHHLGTES
ADGEKAIEEKLQQLLDVLRQPPCTIILVSNEVGMGIVPENRMARHFRDLA
GTINQRVAAIATEAWLLCSGLPIRLK
>Cag_0891 metacaspase
MAQRALLVGINDYAPIGPGGPDLRGCVNDVQDMANTLSVLGIIPASPVNM
RILTDGRATKAAILDGLQWLTAGASPGDTLVFHYAGHGSQVLDISDDEPD
GKDETICPHDFATAGMILDDDLAAILGTVPTGVNFDVIIDACHSGTGARE
LSALTALSDDEAVAYRFIEPPIDWGFFLDSAPSLPVRGILKRNTTRGKAK
ATAAKNEDQGVGQLNHILWAGCQSNQTSAEATVNGQKRGLFTATFCKILR
SANGNITRKNLEVQVSRNIRAMGYSQIPQLEGASTHLKKKAFT
>Cag_1724 conserved hypothetical protein
MQTIYADGVANIALIDGIIRFDLVNITKMEKENVNLRPVAPVAMSVTGLL
RMHDQLSQAINKMVEDGILKKNEQPPVVIDGGQ
>Cag_1026 C-type lectin
MSIIQEHESNDLIANATSLTLVAESLGSTTSVADGVGTQSNPTQYNSWSD
PDYWRIELLAGYQISFSILTPSSSLQPYAELRDAANNTVAYATADASGES
VHLVPYTLTASGSYYVVVGKNYYTSDGGDYELHVETAPFIKPEDDDNNTI
DKATELVLSENPASSGLLLGVGMGEQDPATMYNHWSDPDYWRIEVLAGDL
VSITVQTPDSELNPYVELRNAADGNLVGSNDEGAGNDAFISSYEIKDSGS
YFVLVGKDYYSGGGTYSVQVDVARGIQMESDANYDNGNLTQANALHLSAE
ITNTDRYQVATVAGAIMLPEYLTVDVDVFALGRLNAENTVELTSLLPSTS
NLTPLVTLLDSAGNVVTDSDGNSADGNFSATLTKDDDYYVQVERGYQYNG
HTYLLTNNGMNWTAAQEYAESLGGHLVTIDDASEQQWLFSQFGSTNSWIG
LNDEVTESIWQWGNGATSTYRYWGDGHPYGGEYYNYAYIATDGKWYSGNE
TWGYYALIEIENTASAANSASSSTSNYLLDVRIEDSVAPRVESITLPANG
SSVDDPIGATITVTMSEKLEPATVKAGLREVWVRDGHYYTVTDAAVSWTD
ANTAATALGGQLVNIESAEEQAWLLSMLDGRYGDVWLGLSDTATEGTWVS
ANGETVWVYGAETNSAYANWGDSQPYYQWDENYDYAAMNGSGKWYASYGS
NAMRGVIEIVDNDSDSDGLSDALDPYDNDPLNAWDLREAGADGVFDTPDD
VIHRLLLKEAYNGGTAVNLLIEDGTLGAGSYRFTANTTLTDIVGNVLDGD
SAKIGSEPYVHYFTITHPAGVTAEGGRNNILQNATTLSLNEDPAGKGLWL
AHGVGNQDPATIYNHWSDPDYWKLELQQGDLVSIYVNTPDSSLDAYVELR
NANDGYVASSNDDGASNDSFISRYVVTESGTYYVLVGKGYYSGGGAYELQ
VDVAKSIQMESDANYSNNPLSGANLLSFVADGNNQVATVAGAIMESEGQP
DIDVYALGRFNAGNTITLTANLPSTSGLSPIVTLLDEAGHLLLDADAHYA
DGTCTITLEDNGNYYAQVEKGYQYNEYTYIVSSTTMSWEAAHIYADMIGG
HIVTINDADEQQWLTEQFGWTSSWIGMNDAALDGTWVWDDGTTVEYQNWG
SGHPYTWSNDYNYGYLATDGKWYSSYNGYTYRALIEIEIPHTAEASTNSL
NTSYLLDVSIEDDVAPRVESTTLPTNNSTVDNLVGATFSVTMSEKLDAAT
VKAGLREVWVRDGHYYTVTDAAMSWQDAAIAAQALGGQLVNIESADEQAW
VQSMLDGRYGDVWLGLNDAATENTWVSADGSTTWIYGAETNSAYTNWGDS
QPYYQWDENYDYAAMNSSGKWYATYNNSSYNLMRGVIEIVGSDSDNDGMP
DALDPYDNDLYNAWDLREAGADGVFDTTDDIIHRLLLNGTYVDSTTVNLL
IEDGSLNAGSYRFTVNTTVTDIVDNTLNGNVLNGDGDSTAGDRYEHFFTI
APPAGVTTEGGRNNISPNATALTFSEDPSGKGLWLAYGMGNQDPATMYNH
WSDPDYWKIELQAGDYVSVYVNTPESDLDPYVELLNTNNGGVAWSNDDGA
HEDSLISRYAVTESGTYYVLVGKGYYSGGGSYELQVDVARGIDMESDANY
SNSSFGNANRVTLVDAGVEQRATVAGNIMAPESTFDRDIYALGRLNAGNR
VELNTAMPSSSDLMPVVTLWQADGTMVADSDGNYTDGQFNALLAADSDYY
TQVERGYTFGEHTYLLSSSNMTWSAAKTYAESLGGHLVTIETAEEQAWIN
ELLGSSTSWIGIYDAADNGTWVLLDGTQPTYTNWEASQPSTWDNYNYGHI
NYNQLWYAGLESWGLPVLVEIDTVGTLPSSSALGSEYILDITVTDGVAPR
VESTTLPTNNSTTDNLVGATFSLTMSEKLDAATVKAGAFEVWKYNGNYYA
LTESAMTWVQAEAAAVALGGHLASVLDANEQAWVQSMLDGRYGNVWLGLN
DAATEGSWVYSDNNLALYTNWASNEPYYQWDGNYDYAYMHTNGEWRNTDG
SSTMRGLIKLNDTDSDKDGLPDAFDLYLTDSRNAWDLREAGADGVFDTAD
DIIHRVLLNGAYENGTTVNLLIEDGSLGAGSYRFTANATLTDIVGNALDG
NGDGTAGDYYQQFFTIAPPTGITVENGRDNISTNATALALHEDPSGKGLW
LAHGMGNQDPATMYNHWSDPDYWKIELQAGDLLSVYVNTPESNLNPYVEL
RNATDSQISYNNDDGAKEDAFISRHLIEQSGTYYVLVGKDYYSYGGSYEL
LVDVARGIDMEYDPQYRNDSLGGSQSLGFSAAQNYQLSTVAGAIMGYEYG
YDYDVYNLGRFNAGNTIELTITQPSTSSLVPLVTLYNAAGTPVTDANWNP
ADGTFNATLTHNDDYYAKVEHGYTFAGHTYVLTHDNMYWSDAEAYAEALG
GHLVTINNAFEQQWLANTFSWANPWIGISDSSNTNEWHWSDGSQSLYRNW
GDSQPDNYYDYGYLNPNGYWYTGANNWNYRALIEFDSMGIIPAATDPTVT
NYLLDVRVEDSVAPRVEMVSLPANNSTIEHLVGASITVTMSEKLAPATVQ
AGMFEVWEHNGHYYTLTDRAKSWQDAEASAVALGGHLVSINDATEQAWLQ
SMLDGRYGNVWIGLSDAAAEATWQLTDGTTSTYANWASNEPYYQWESNYD
YAYMNSNGQWGASYNTNSMRGVIELVGTDSDNDGMPDQLDTYRNDPYNAW
DLREAGADGTFDTADDVIHRLILNKGYSSGSTFVNLLIEDGSLNAGSYRF
TANATLTDIVGNHLDGNANSIGGDAYMHYFTIAPPAGVIAEGGRNNTMQN
ATVLPLTQDPAGRGLWLGHGIGNQDPGFAYEYGIDVDYWKVELQKDDLVS
ISVNTPDSNLDPNIALYDANGTYFDYSNNEGPDYDAFISRYVVTTTGTYY
IKVDKDYYSGPGSYELQVDVARGIQMEYDRNYDNDPLSGANVLTFTQAGT
QRIATVAGNMMSAGDGQVDDDTYALGAIEAGNTILLGITIPDMGDLRPVV
EIYNAQEQLVGLDPNPSSGVARFDVITTGTYFARVLPFTGSGSFGHYLLD
AAITPTVEAQFADLAIDSKSVIVPATPQSGSTITIAWSVGNYGKIATEQS
TWQDRVVLSLNSRLGDADDLLLATVEHNGVLDPATSYNASVDVVLPTLLE
GNARIFVTSDVADVVEESFFEINNTAEKEIVVSLTPYADLHVAEASMPST
LQADTTFIVTATIANNGTGAPGTGIPNESVNTWVDKLVLSGNAVLGDADD
EILETLAHTGGLDAGSSYEVSFDVTLSAEQLQDHLFIVSDSGDAVFEAYN
SGMNERRVNHLPEGSVVINGNALQSTTLTVTQTINDADGMGDLLYQWYAD
DEAIAGATQTTLWLDTSLIDKQISVVAHYEDGYGVEEAVESTPTEAVVAD
TTAPTVVSFAPTDNATEVGLNSSITVEFSEAIERGSGTISLHTGSPQGTL
VESYDVSSSYNLHIAGSTLTITPNNRLDDSTHYYVVFEEGSMQDMAGNDY
AGTVEYDFTTVVNHAPIIRIPHELSFADKVDYATGDEPYDISTGYFNNDE
WLDVAVVNSRSKTFAIYNNNGNGSFTLGESYNTPHMSFSIASGEVTGDDY
VDLIVSNYYNNTVSVWSNSEIGTFTETSSCVTANAPSDVRLADTDGDTDL
DIITLHQDSNSISIIKNNGDNTFANYVVYATGNHPSSLAVSDLNNDGFVD
LMVTNTVGDSVSVLINDTYGAFSEKVDYSIVNPSIVISRDVDADGDADMV
VGRALFGYVSVLKNNGDGTFTAQADYRLADNPASLNSVDVDGDGMLDIIV
GYRDELSTISVLKNNGDGTFGTPIDYPAGTKSYSIASGDFNNDEQSDLVV
VHYDTDTFTLHLNNSVEKTATAFTEQTPVAVSSNITINDPDGDASWNGGC
LQVQITANAESLDRLLLPTVAGNGVWLDSANNNALMAGELRIGEANVVGV
QGSAAWHFSFNEHATNALVQEVTRSIMFNNNSNTPSELERTITFTVTDTF
GDWAAVDQQITVTADDDPAPITHDLYGNITFWKSGNPLNDVQPTLASKPM
ENHDQEVAFRNLQQQSDGSYTVELWASTTQTDIHDFQLQLYFPESTTSVS
WQASSTLNGWISVLSDQTTGQVVLGGVSTAQTLQTGSVQIGTLSFSAPDI
PDNFTLTAASGWIGTENIVPTSILCTATDTTGDYRFEDVTDGWYRVAGES
NTNMLANAVTTEDALAALKMAVELNPNEPNASGLLDPVSPYQFLAADINR
DGKVRANDALNILKMAVNYANAPTDEWIYVRQDHQLADMDRSHVDWSFAE
RAIDIYGDMNVNFVGVVKGDIDGSWGMVP
>Cag_1575 conserved hypothetical protein
MDTLTKLRILSGAARYDASCASSGSNRSGASCGIGNTSQSGICHSWSDDG
RCISLLKILLSNDCCYNCAYCVNRATNPVERASFTAREVVDLTLDFYRRN
YIEGLFLSSAVMQSPDATMERMVAVAETLRSEERFGGYIHLKIIPGASSE
LVRKAGLYADRISVNIELPSQVSLERLAPQKHRAAILEPMALIGREINTS
LVERQHSHRAPRFAPAGQSTQMIIGATPESDFQILRLSQGLYKKMNLKRV
YYSAYVPVSEDNRLPVLAAPPLLREHRLYQADWLLRFYGFSAEEILSEEL
PHLDEQFDPKTAWALRHPEFFPVDINRADYATLLRVPGIGVTSAKRIVAA
RRFSLITFEGLKKIGVVIKRARYFITMQGRRVECTDFSPTLIRRQLLLSE
STEKPASRQLVLPGLEPILA
>Cag_1812 conserved hypothetical protein
MPSCVYSADEGLIDTDYGGGVIKQRIPRQGEGKSGGLRSIILYKKADKAF
FVYGFAKNAQQNIKDSEVKGFKKLAKNIFELTDTQLEKLIISKEFTRVQC
HEE
>Cag_1849 50S ribosomal protein L23
MANPLLCPLLTEKSTGLTEKKGQYVLVVRPDADKLTIKDAIEKKFGVSVT
SVRTINYLGKLRAQNTRRGRIIGKKSDWKKAIVTLAEGQSIDYYSGTAQK
SEG
>Cag_1939 lysophospholipase L2, putative
MPEPKPHTLAVLVLHGFSGTLESVKALREPLQALGLPVAMPLLAGHGEHS
PEALRGVTWETWLADAEEALLQLSNQALQVIVIGHSMGALLAVQLAYRYP
TLVDSLVLAAPALRIASIFAPGRLFHSIAPIVSRLVKNWRLQSEEDRRNA
VYGDLHYEWVPTKTVLSFFELVKKTERLLPHITHPALILHCRCDNTVLPE
SAEIAHSSLGSMPAAKSLVWFDKVGHQLFCATERDCVVLEIVGFVKSRFR
>Cag_1209 peptidyl-prolyl cis-trans isomerase, PpiC-type
MALLSSLRNKTHIILYVLLASFLALIVFEWGMESNFIKPQNVAGKVNGAS
ISYQEYDNAYKALSENVRRANPELELTPEMEHELQERAWNSLVDQRLLEE
QFRKFGITVQDREVLEAMNSPQPPMVIRQYFTDPATGKVDRQKLESARRD
PANRDMWLQLEKYVRMELQENKLLRAFQTFERVTDREISDMINRKVSLFT
ASFIPLPLSAAGDDKRFPITDDDIKKYYDEHKEQFRQEKPSRKVETVFFP
LVPSAKDSAAVRQELEALRADFSKSQNDVDFVKVQSDRSNSANVVLTRAD
FSPLAGNAIFNTSSLAAGSLIGPFADNGEFRLLKVVRVQPAVQPIARASH
ILLRFNPANKAEIEKVQQLTMQIGQQLQAGVPFETLAKQYSADPGSAENG
GDLGWFSPDRMVPEFSKAVFNSRPGAIIGPIQTQFGVHIIKVTGFDQRAL
VASEVVRTIRSSSESMESQRRRAMAFQVDAKEKGFAKSAATAGVKVEESN
EFTRRMAIRPLGYNDKVATFAFSAKEGDISDVIESKKGFYVARLTAKHDE
GYRSLDKDAKERIKQELLVEKKGAALQQKLTAAAKVPNATLEQIAARVGS
QVVHADGIRWADGFIPNYGVDRVLVEAISGMTAGKLFKPVKTSNGYALVR
LERKSIPEGFDPNAVKGMVAPQLLQAKHQQLFTEYFQSLRNVEDLRP
>Cag_0957 conserved hypothetical protein
MSTSYSVLWTKVAERDIKEIITFIANDNPSNALHVLEKIKDKAAALSMAP
ERGRIVPELHSKGIFIYRELIISPWRLLYRIANHEVYIMAVLDSHRNVED
ILFHRLIQS
>Cag_1053 Filamentous haemagglutinin-like
MNRVFNVIWSITREKWVVVSERVKSNGSVPKSSLVSIAFLSALLGGGSVA
QAVDANQLPTGGVIAAGSGSIAASGNSMTIQQSSQKMVANWSSFNVGSDA
SVRFQQPNASAAALNRIAGQSPSQILGSLSANGRVFLVNPSGIVFGKNAR
VDVGGLVASTLNISDNDFLAGNYAFRSTGSAGTLRNEGVINAMPNGVVAL
LSPSVVNNGTINAAGGTVALAAGNAMTLDFGGDGLMTVRVDEGAVNALVE
NNALIKADGGLVVMSAKAADELALSAVNSSGVVQAMSVVEKNGRILLDAE
GGQSTISGTLDASSVDGKGGQVVVTGKQVMVADGAHLNASGLTGGGEVLV
GGSWQGSDASVRQAVGTVVMPGALLQANATGNGNGGTVVVWSDVNNPLSV
TRAYGTFEAYGGLLGGNGGRIETSGHWLDVAGSRGGASAVNGNAGVWLFD
PWNVIIGPDPTTSGTSFTNPFNPTGDSTILASNINTLLNAGTSVSITTGT
GGTVGVGDISVNAPILKTTVTGLNTLTLSLIAEGNIFINNSIGNSSGTLN
LNLTTVNGAISGTGNITGNGNGDTIFTVGAGSGTYSGNLVDRRFVEKKGV
GTLIVSGDNNHDGETRISAGTLVVQSSTALGKTTNGTQVVDGATLQLEAN
IAAQELLYLAGDGVNSNGALKNIGGNHVYGGDIILLNNSRIMSDANTLTL
NGSVNGAYSLTVNSVGSTIFNGLIGNSAPLGAFIGTAGTPITFNGSSITT
VGAINAAGVVTASNPLTISAGAGNISLSNTGNNFNSVNITSAGTVSLVDT
NALALTGVNATGDVSIATRSGDLTIDGHLLTTSPTSSAMILNAEQAQIAG
NGNGGNLVFSSGTLTVGSGGIATLYTGSVAGSTSIASVVNAGHFRYNSDE
AINGTHYTDPLTAGLNLIYREQPTLLVAPAATPTPYGTAPSYTPSYSGAV
NNDPTVGTVAGTPQWAFDNATIPTKSLSGQDEVGTYNVKYVGGLTSTLGY
GFADNGGNGELTIAPKEIVFGNGLTGGVNNKVYDGTLTGTITPLVLYVVA
GDNVSLNSTGATATFSNKNVGVGKTVTVAGLALTGDDAGNYSIGNQTTTA
NIIQASLTVTAPGNLTKVYDGTVTAIGVATVTGLVSGDTVAGTVAIAYAD
KMAGSSKAVNPLSVMIVDGSDMNMTGNYNIAYVPTVNNTITQASLTLTSP
DNVSKFYDGLMSAPGAPMVTGLVPNDVVVTPAPLSYNDPEVGNNKTVSPN
PAGLVIHDANGGDMTPNYVITTIPRNDGVIVEKTFTPYKEWNDIDPSTPE
VPTAAPEVSGNRDLGDVELAADDGGTTATRSLAMVAMDETAIQSDIVVTL
LEPAAKNKQGVVKVFVPKEVLAKPAFLFPLPDDVATAINQTAVQERVFLQ
NGDALPGWLSYDRDKKIFTAKSAPAGSLPLTVMVQAGSMAWQVIIQQ
>Cag_1245 nitrogen regulatory protein P-II (GlnB, GlnK)
MLMIRTIVRPEKAYDVMQGLFDAGYPAVTKISVVGRGKQRGLRVGDTVYD
ELPKEMLFTVVADVDKDFVIRAIIDNAKSGADGKFGDGKIFVSAVEEVYT
ISTGERETDPLSPAKGV
>Cag_1843 Ribosomal protein L29
MKKYEIAALSTQELMDKIRQLEERLADIRFYQVIEQPQNPMVFRNSRRDI
ARMKHRLHQLASAEK
>Cag_0599 hypothetical protein
MENFEKIKKILTSNFEIELFEAALASLNDKSNRLRFNNFAYSIRELSRHF
LYSLSPELNIKNCRWYKTETNDDKPTRAQRVRYAIQGGISDELLEDWSFD
ILGLADTIKSVVSSINSLNKYTHINPEVFDLKDEEVKEKSILVLETFSKF
VETIKEYREELKKFLDGHIENHMINSVISNFFKNVDCLAPHHSLEYCEVS
DYHISEINDKKIVVNVTGDLHVVLQYGSSSDRREGDGLDLNENFPFETKI
RYEISEDFPSDNHEVDDYDVDTSKWYE
>Cag_0777 Argininosuccinate synthase
MSKEKIAIAYSGGLDTSVMIKWLKDKYDADIVAVTGNLGQQKEIENLESK
ALATGASSFHFVDLRTEFVEEYIWRALKAGALYEDVYPLATALGRPLLAK
ALVDVAVAENCTMLAHGCTGKGNDQVRFEVTFASLAPHMKILAPLRVWEF
TSREAEIAYAMEHNIPVSATKKNPYSVDENIWGISIECGVLEDPMIAPPE
DAYQITTSPEKAPDTPAVVELEFVEGVPVAMDGKKMNGLDIIVQLNTIGA
AHGIGRLDMIENRVVGIKSREIYEAPAATILHFAHRELERLTLEKSVFQY
KKNISQDYANIIYNGTWFSPLRTSLDAFVNETQKPVTGLVRLKLYKGNVT
LLGRTSPYSLYNEALATYTEADTFNHKAAEGFIHLYGLGLKTFNQIHKG
>Cag_1049 conserved hypothetical protein
MTIAELQEQPLAERLMLMEELWETLCNEKHHIQSPAWHQEILEERINLIN
SGEAEYLSIEELKKY
>Cag_1099 DNA modification methylase-like
MNELQDESVHLIVTSPPYWQLKDYGTENQIGFHDDYETYINHLNLTWQEC
YRVLHKGCRLCINIGDQFARSTYYGRYKIIPIHSEIIKFCEIIGFDFMGQ
IIWQKTTTMNTSGGASIMGSYPNPRNGIVKLDFEYILLFKKQGTSPKPTK
EQKDNSVMTNEEWNTYFNGHWYFSGAKQDQHLAMFPEELPRRIIKMFSFP
NETVLDPFMGSGTTALAARNLNRNSIGYEINPTFIPIIKNKIGMDDVFMK
VETSVIKQPEITIDFNECVNRLPYQFIDTHKLDKKIDVKKIQYGSKIDSE
STGKREDFFSVKEIISPELLKLNNGLIVRLIGIKQNPAINGKATEFLFNK
VRGKKVFLRYDAIKHDKENNLMVYLYLENKTFINAHLIKNGLVLVDNSID
FKYKAKFNSLTNG
>Cag_1490 Hybrid cluster protein
MGMSCNQCQESIHGTGCKARGVCGKDELTAKLQDVLVYATEGLALVAEAK
GGAVSRSVGQLISESLFVTVTNTNFDEDAIVVHIRKTLALRDELKNGLPT
VPAHDAATWSGSSKDDFLAKAPSVGVESLSGNEDLRSLKSIVLYGTKGLA
AYTDHAAVLGYHDDDIYAFYVKGLSAMRKELSADTLTALVLETGAVAVKA
MALLDKANTETYGHPEITSVKTGVGQNPGILISGHDLRDLEDLLKQTEGT
GVDVYTHCEMLPAHYYPAFKKYSHFVGNYGNSWWAQDKEFESFNGAILMT
TNCIVPVREAYRNRMFTTGMAGYPGLRHIPARAEGGVKDFSEVIAVAKSC
KAPIELENGTIVGGFAHNQVLALADKVVDAVKSGAVKRFVVMAGCDGRHA
SRRYYTDVAAALPNDTIILTAGCAKYRYNKLGLGDIGGIPRVLDAGQCND
SYSLAVVALKLKEVFGVADINDLPISFDIAWYEQKAVTVLLALLYLGVKG
IRLGPTLPEFLTPNITATLVQVFDIKPISTVEADVDAMMAGK
>Cag_0811 conserved hypothetical protein
MELAQILEGNWLFRVEHRGITIHSSTIQDSPIQGFKGEVELQVSLKRLLS
AFYDMENYKRWVHQLAELTVIDKPDPTEYIIRQVINTPWPLQQREVIMRS
RLEGVGENGVALSMQSEPDYLPLHAQCHRVRHAQGMWVFTPNGHGVVQVM
FIMHLDPGPDVPPPVSNAGLFEVPFYTLKNLKALLDDAKYQPMWPEELEH
YLAIVEEDNLDTL
>Cag_1406 hypothetical protein
MEMQENNIRQLRLHFDGLATVEHKLPASLLVQALSKFQRVVHLIAMADEG
REVLQRARITREIERRFPLICEVPQKGGYALPITIGGEADQLFDEQACEN
IAKKTREVIVAIDRSDVKELGNIIPDMFYRRSILEELKAMQPASHSCFFI
DIEDCYNQPILNGSTATEKIKTLLMPPTNETSSSDFGYVTGALIEMKFNE
RRLVMKLLGSNKQLSVTYAEDFEPMLLDNPRELIQIHGNIVWNDDGLPQS
ISDVDEVVAIDETPLDIHVVEFDTIFLQPKKTLQSEVVFDRESALFQASG
PFDIYLCAATRAELEEQLYNELAMLWQEYAKPPSSDLTLDAQELQKELLY
AFEEVIRGI
>Cag_1868 glycosyl transferase
MRIVLLSPFPPLKGGIAHCSGALHAALTAAGDAVVVLPFKKLYPSFPSFL
FSALSPPTPSNATLVLYNPLTWLSAVRRIREQKPELLVIAYWSGVLAPLA
LLFCRLSGTRMLLLLHNLTGHEAFWGESFLQRKLLSSVAGVVTLSHTVTR
QVQHVAPSLPTLTLFHPIEKLPAPSFSKLEARKALGLTSNAPVLLFFGYV
RRYKGLDLLLQALPHVVAQEPSLQVVVAGYFYEPLPRYQQIAETLGITHN
VTFHAGYVPSEKNATYFAAADGVVLPYRAATQSGVVPMAFAYGVPVIVTP
VGALSEMVQHGTTGWIAKAASPDAIAAALREWLANRERWSAMRSSIEAMR
DSVSWERFAAECQPFFASLIDKGRR
>Cag_1781 lipoprotein releasing system
MNFTDHLRIAFVHLRERKRQTILAALGVAVGSAMLITTIAIARGSSDSVI
AKVIDTAPHVLITAERVTPLVPDNLVPRSKQHITMLTKNITPDEPEIIKN
YSEVVQRISSIKELESVSPFVVSKLLARNKTRFTPCIARGVVPELEAEIA
GLKKNVLEPTALEELASTPNGIIVGELLAKKLALRYRDRMVLVTKKSEEF
PVTIVGRFSSGFNRKDESEAYINLALAQRMEGISSNSVSGIGLRTTSVDK
ASITADEVEKLSGYKSESWDETNRNVIEFYNRNALITLVLVGFVFIVAGL
GVSSVMTTVVLQKIKDIAILRSMGMMAKSITRIFMLEGLMIGILGVLVGS
PAGHIICHLIGTIRFEASTAGSIKSDRLTVSESPEVHLIVIVFGILIAVL
SSLSPARKATRYVPVNILRGNIGG
>Cag_0709 hypothetical protein
MFYRKNFLGIPEQVLHGDGFTIELSRNEVVLIDIYNADLLLSKLADEVLT
AKAS
>Cag_0383 oxidoreductase, short-chain dehydrogenase/reductase family
MTKRSYLHSISYSGYSYAIPSGLKTREKTIGNMQQRHNSLGIVITGGSKG
LGFALAARFLAEGDRVVLCARNGERLEAALAALRQQVPTGEVYGIACDVA
DTAAPPLLAQFAVAKLGNIDRWINNAGTAGLQKRPLWQLAGSDIAETCTT
NLAGTMAMCAEAVRVMQRQPSAPQACYHIFNMGFSAVGASFSRSAVPHKA
SKRGVAEITHFLARELHEAAIRSIGVHELSPGLVLTDLLLRDAPADTRRF
LQVVAQTPEAVAAVLAPKIRKVRGLNRTVRYEPLVAMVFRMVAGLPRLLR
SASATS
>Cag_0702 probable phage-related lysozyme
MQTSDNGLNIIRQYEGLRLKTYFCPAGKLTIGYGHTGTDVTSGMSITEAQ
ANELLQEDVKRFATSVNKMVTTEVTQGMFDALISFSYNIGAGNLQKSTLL
KKLNAGDKQGAADEFLKWNKSNGKPLAGLTARRTAERELFLA
>Cag_0105 conserved hypothetical protein
MQEGFNHYQQQRRAYTLYQQHSYLQAEQAFHTLAAQAPSPKEKASAHFNE
ACALAMQGNHTQALPLFTLSRKGTTLTEPLRLQALFNEGTLLAAQAKKSS
ARQEKMTLYQRSLHHFKQVLLQSPTDVDAKINYEIVRRHMAALQPKPPQS
PKQQPNRAAITPAGGIGNDVAQRLLEQAARNESSLMREMAQQGKSSTPRS
TKNLRDW
>Cag_1194 putative plasmid maintenance system antidote protein, XRE family
MKNNYKSKEDIAVAREIISCPGDTLAEHLEYMGMSQAELAERMGRPKKTI
NEIIQGKAQITPETALQLERVVGISATFWMNLEHNYRLLLAELDEAEKRI
VDAEWAKQFPLQEMIDKGWITVDNGCDNAINTILSFFKVATPQAYQNYCH
NQLYATAYRMSETCSKDPHAVAAWLRQGERQAEYLKAVLFDRKKFEEMLL
TIKKLIVQDDNFFEALQDCCLQAGVKVVHTPCLKKAPLNGSTRWINDSPL
IQLSNRFNRNDIFWFTFFHEAAHIIKHNKKDVFIEGMDYSFDGKKKEDEA
NMYAEEYLISRKEENELLASTSFQKDDIQHFAEKFSTHPAVVIGRLVNKG
KVKAELGHLYGFYKKVELH
>Cag_1495 conserved hypothetical protein
MLRLLLQWLINALAVYATAQILEGIHIRGFATAIAVALVFGLINTLVRPV
LLFFSFPVIVLTLGLFLLVINALLLQLAALLVGGFSIDGFWWAVAGSVVI
SAISWLLSSLFRVG
>Cag_1527 conserved hypothetical protein
MNHEKYNRRSIRLKGYDYWQVGAYFITICTQNKECLFGKITDGKMVLNDA
GNIIQEFNAITESHFKNIAISPFVVMPNHYHAIITVGAGSPRPNNPHNEN
DHICDDGRVRVDDGRVRVDDGRVRVDDGRGNPAPTLGQIVGYFKYQTTKR
INTICQTGGKKLWQRNYYDHIIRDEKSFHAISTYIINNPAQWAKDELYL
>Cag_0956 Prevent-host-death protein
MNITTDIKPVTYLKANAADLFAQINDTRRPVIITQNGEPKAVLQDPKSYE
ETRNVLGLLKLLAQGEEDIRKGKLRQQEEVFRDMEYLLKEQAL
>Cag_0978 ATPase, ParA family
MKTIALYSIKGGVGKTAAAVNLAWLAAQHNSPTLLCDLDPQGASSYYFRI
SASKKFSSSKFLKGNKKIYDNIKATDFEQLDLLPSDFSYRNLDIELAEEK
KPQKRLKKNIEELSNDYSLLFFDCPPNLTLLSETVFTASNIIVVPIIPTT
LSVRTFIQLKEFFVKNNLDETKIIAFFSMVEKQKKLHRDTMQELQQFPEL
LKSTIPYNADVEKMGIYRVPLNAMQPNSAAAKAYIKLWNEINDRLHTK
>Cag_0663 hypothetical protein
MVGEIYLAQIYFTDLSEYKIRPVLIVKELGDDCMCLQLTSQLNYDGILIT
NNDLFDGYLKKDSMILMPKNFTLHKSILKKYLARIKLDLIERIMNQLCKA
LGCV
>Cag_1003 hypothetical protein
MVLPESRPSRARGLKPRIGNLFTDGMESRPSRARGLKLGYADRAAIEVFE
VAPFAGAWIETAAEGLKKKPVQQVAPFAGAWIETPEDVDDYEKAASRPSR
ARGLKHVRCQSQSSALCRALRGRVD
>Cag_0280 conserved hypothetical protein
MDYLIDTHALIWFINGDTQLPDKAQKIIKNIDNKCYISIASIWEIAIKIS
LDKLDLNGGFDEISKIIVRYDFELLPISIEHIAEIIGLEFHHRDPFDRMI
IAQGLVENISIITKDKIFTNYNVKIIWD
>Cag_1374 hypothetical protein
METTMQTNANNNYTTDSIIREVRCLKEDNAAEYGFDIRMIAAAVQLKQRQ
HPERIVTRILSDVEQKYGKQPLTRLVTENEL
>Cag_0045 Aminodeoxychorismate lyase
MKSPRSFITRLILAVTLLIAAFPLGFLLIPGLNSKSKPTQLVVHREMRFS
DVLDKLQASGAIRERWQPELIARMVPKFRTIKAGRYTIPPNTSNFGLLWY
LRTHPLDEVRVTLPEGIDRRKMARILSRKLDFDSTQFMAATENPRLLAKY
GIRASHAEGYLLPGTYDFAWGSSPDEAASFLIRQFKKLYTTERQQRAAAL
GFNEHSLLTLASIVEAETPLDKEKPTVASVYLHRLRIGMRLQADPTVQYA
LGGTTRRLYYKDLAIASPYNTYRNKGLPPGPICNPGKASIIAVLNAPQSG
YLYFVATGTGGHYFGASLQEHHANVQKYKQARSSNE
>Cag_0008 1-deoxy-D-xylulose 5-phosphate reductoisomerase
MRALSILGSTGSIGLSTLDVVRQHRDRFTIVGLAEGHDVAALAAQIEEFK
PLAVSVRDAESAKKLQELLGAHKPEVYYGLDGAATIAALDGVDMVVSAIV
GAAGLRPTVAAIKAGKHIALANKETLVVAGELVSRLVAEHKVHLLPVDSE
HSAIFQSLAGHRAEDVERIILTASGGPFRKTSAEELKQVTLEQALKHPQW
SMGAKITIDSATLMNKGLEVIEAHWLFNMPAEKIGVVVHPQSIIHSMVEY
IDGCVIAQLGAPDMRAPIAYALSWPERCESGIHKLDLAKIATLTFEEPDM
ERFPALRLAFDALKAGGTYPAVLNAANEIAVAAFLERKIGFLDIAAMVEK
TMQAHEAFTPVELEEYLQVDRWARDTAKTFLP
>Cag_0967 superoxide dismutase
MAYQQPPLPYADTALEPHISANTLSFHYGKHHVTYITNYNNLVAGTPFES
MSLEEVIMQTANDAAKVAIFNNAAQAWNHTFYWNCLSPNGGGKPTGALAD
KIEADFGSFDKFKEELKNAAVTQFGSGWAWLVLEGETLKIVKTGNAQTPL
TSAQKPLLTIDVWEHAYYLDFQNRRPDYVSAVIENLLNWEFAAANFAQ
>Cag_1060 Molybdopterin-guanine dinucleotide biosynthesis MobB region
MTFHPFEIAFCGYSGSGKTTLLTKVVRSLAERYSVACYKHGCHHFSLDKE
GKDSWLMRQAGASAVMIGDPQQQALMAERGVFSLALERHAFAFADMLLVE
GLKELPLPKLLMIDAERRIVKLWQQGSISNVLGFISPDDPQHYTDYGLPV
MQRDNVEAIAHFVESILLERAKAHYPLNGLLLAGGQSSRMGSDKALLSYH
GDNQLQHTAALLREVCCNVYVSCRQEQAEHYCQFGIPLITDAYLGIGPMG
GLLSAQQAHPNAAWLVTACDMPLLTPKTLQQLAEERQPLRFATAFRHPQT
QRLEPLFAIYEPKTRIPLLLQHAEGNNSLASFLATARIAALMPDAAQSLL
NINTPDEQKSFEQNQGRA
>Cag_1732 Glycine cleavage H-protein
MTIPEDLRYTKDHEWMKLLDDGTALVGITDFAQSELGDIVFVELKADGTT
LKTHESFGTVEAVKTVADLFAPVAGEIVASNPELASAEVVNQDPYNAWLI
KMKVANPAEVEALLDAAAYRQLIGE
>Cag_1723 hypothetical protein
MLQFWDVKTRWCQRLHFFLRGGIAIAGSLLVYGTLFAADNTLAPQSISSP
WRIALGTQWSHEHPELSYTTLNDSTNPELSINQWSTNDFTPHVSRGTLEK
QLSSSSAFSFSYNQESATSLLVGKKNVLIPLPFLRFLLRRPAIMVRTTMQ
APVTLNIHQLRLRYEHAITRQAGIELGGALGMQALYTQAKAAFPVLGYSK
EEYFLLLPLVSLYARSEPLQRVRYTLRAEYLPIRIASTEGTVADVECTME
YKLNARFFVGVGGRYALKSFSREDENEHLDVSYPLLGAMFYTGIFL
>Cag_1514 uroporphyrinogen-III synthase
MKSVLVTRPKDQAASFVEQLAAYGLLSVVFPTIEIKPVGGWLVPTMAQFD
GVFFTSPNSVHAFMAQLLQEAPQELAALQRLKVWAVGKTTSADLAAHGVT
TQPVPKIADAVTLMEEIGDEAINNKSFLFLRGSLSLGVIPDVISKRGGRC
IELTVYENVPPPLEASAEVKAMLQDGKLDCLSFTSPSTAINFFEAIGTTA
IPDGVKVAAIGTTTAKALEKLGVRVDIIPEYFDGPSFAKAIAEALKG
>Cag_1879 chlorosome envelope protein X
MKITINNNSYEASVGQRILDVARVHHEHIGYFCGGNGMCQTCYITVLEGM
ENLTPLSREEKALLSDTLISENTRMACQTYLEKEGTIRIKSFVEDVKDMF
EQNPTALVGYAGKMGWEALVKFPDTITLQSQREWHLMQFISDVLNGIGDA
FLMVSKACGKKV
>Cag_0448 DegT/DnrJ/EryC1/StrS family protein
MCCRDSTEKNEMVVYFVRSYSFHLDRQHIMQFIDLLTQKERIKGALLRRF
EDILDRGQFIMGPEVTELEAQLAAYAGTRHCVSCSSGTDALLMPLLAKGI
GAGDAVITTPFTFVATAEVINLAGATPIFVDVLPNTFNINPELIGEAVAE
AKQKGLQPKAIIPVDLFGLLADYERLNAVANEHHLWILEDACQSFGGSFN
GGKAGSFGLVGATSFFPAKPLGGYGDGGAIFTDDSELDMLLRSVRVHGSG
ADKYSNDRIGINGRLDSLQAAVLLEKLTIFDEELATRQSIAEIYNERLDE
RLVVPTIPDGYRSAWAQYSVLAASAEERDMLMQALQQEGIPSMIYYKIPL
HLQKAYRSLGYNTSDFPVSEDLSRRIFSLPMHPYLKDEEIEQICTVLLQT
K
>Cag_0387 RNA polymerase sigma factor rpoD (Sigma-A)
MRQLKISKQITNRESLSLDRYLQEIGKYDLLTAEDEVRLTKAIKEGFDMP
VDTPEYKRAKRALDKLIKGNLRFVVSVAKQYQNQGLTLGDLINEGNLGLI
KAAKRFDETRGFKFISYAVWWIRQSILQALAEQSRIVRLPLNRVGTLNKI
SKAYSQLEQEYERDPNTRELASLLEMDSQDVADTLKIAGRHVSVDAPFAQ
GDDNRLLDVLQNDGHMPDYGLTRDSLTLEVERSLSVLAPREADVIRSYFG
IGMDNPLTLEEIGEKFRLTRERVRQIKEKAIRRLRQSAYSKVLKEYIGG
>Cag_1523 DEAD/DEAH box helicase-like
MTISEAQTRSQLINKLLAQSGWNVNDQTQVVAEFDIAISHTQHIAEPLTP
YHSHQFSDYVLLGKDGKPLAVIEAKKTSKNAALGREQAKQYCYHVQRQQG
GVLPFCFYTNGLETYFWDLENYPPRKVVGFPTRDDLERFHYIHRNKKPLT
QELINTAIAGRDYQIRAIRAVLEGIEQKRRDFLLVMATGTGKTRTCIALV
DALMRAGHAEKVLFLVDRIALREQALDAFKEHLPNEPRWPNKEETLIAKD
RRIYVATYPTMLNIIRDEAQPLSPHFFDFIVVDESHRSIYNTYGEVLDYF
KTLTLGLTATPTNVIDHNTFQLFHCEDGLPSFAYTYEEAVNNVPPYLCNF
QVMKIQTRFQMEGISKRTISLDDQKKLMLEGKEVEEINFEGTQLEKQVTN
KGTNTLIVKEFMEECIKDQHGVLPGKTIFFCSSTKHARRIEEIFNALYPE
YKGELAKVLVSDDSRVYGKGGLLDQFKTNDMPRIAISVDMLDTGIDVREI
VNLVFAKPVYSYTKFWQMIGRGTRLLETSKPKPWCTAKDVFLILDCWDNF
EYFKLNPKGKELPSQLPLPVRFVGLRIDKIEAAIDRNRVEIAEREISKLR
AQIAQLPQNSVVIKEAATALAQIEAEHFWDLLNHQTLEFLRTEIKPLFRT
LSDVDFKAMRFERDLLEYSLAALREEKEKAETLKEAIVEQISELPLSIPF
VKAEEELIRAAQTNYYWAKDDAIALEETLDKLNSRLGGLMQFREQTEERE
TVHLDLRDEIHRKEMVEFGPQHESVSISRYREMVEGMIAELTEHNPILQK
IKMGEKISAIEADELAAMLHAEHPHITEELLQQVYNNRKAHFIQFIRHIL
GIEQLKSFPETVSEAFEQFIQQHSNLSSRQLEFLNLLKGFIIEREKVEKK
DLINAPFTVIHPQGIRGVFKPSEINEILKLTEQLAA
>Cag_1960 Rhodanese-like
MIRYTDFVAQCLPHIKEVLPWVLVERMAANPELLILDVREPYEFERLHIK
NSINVPRGVLETACEWDYEETVPELVQAREREIVIVCRSGHRSVLAAYVM
QLMGYTNVFSLRSGLRGWNDYEEPLSTSAGAIVDPQYADDYFVSKLRPEQ
FAPKR
>Cag_0733 conserved hypothetical protein
MNSSALSVQASNGLSQWLHQQQMSLACTTYQSNRLFFVGCKKEGEQLALH
ERLFDKPMGLWWQPDMLLMGCRYQLWELDNRLPQGQRHEGGDRLYVPRCS
YITGDINAHDVAFDKQGRLLFVNTDFSCLAVYDPDYSFVPIWKPPFISKL
AAEDRCHLNGVALCDGEPTYMTACSQTDDAAGWRNCRRDGGVVLSIPDNA
VIATGLSMPHSPRWYRNRLWLLNSGTGELGYLDGAKFIPTTFCPGFVRGL
AFNGDYAFVGLSQLRSTSFGGLQLEERLAAMGKSAECGMMVIDLNSGNII
HWLFFTSVVSELFDIVVIPEALQPRALGLQEDAIERLVTFPESNGIVTTK
PTVKRPGQSVPLRVAGLPSATSEAEHVEQEIKYQRVFHLNPDNLLPYDAM
TYPSLKLRWQTEPPRGELLGVSASVNGEMVGFAIGEKFQAEGNVTAELIS
LYVLPTLRSLGIASRLFGELQRAIGQALTDPQRLAGLMMNELSIIH
>Cag_1839 50S ribosomal protein L5
MSGNNESASYASSLARLDTYYREKVVPALMERFQYKNIMEVPRLQKIAIN
IGVGDAAGEPKLLDVAVGELTQIAGQKPQIRKSKKAISNFKLREGQAIGC
RVTLRRKAMYEFFDRFVSLAVPRIRDFRGLSDTSFDGRGNYTAGVKEQII
FPEIDIDKVPRISGMDISFVTSAKSDEEAYVLLSELGMPFKKKNN
>Cag_1317 MutS 1 protein
MAKEQSGTKEHSPMMRQYLEVKERYPDYLLLFRVGDFYETFFDDAITVST
ALNIVLTKRTADIPMAGFPYHASEGYIAKLIKKGYKVAVCDQVEDPADAK
GIVRREITDIVTPGVTYSDKLLDDRHNNYLAGVAFLKEGKTLMAGVAFID
VTTAEFRITTLLPEELPHFLAGLHPSEILFSTQEKERTLLLKKSLPSETL
ISLLEPWMFSEEQSQTVLLRHFKTHSLKGFGIETAGGNRAALVAAGVILQ
YLEETRQNSLSYITRIGELHHTEFMSLDQQTKRNLEIISSMQDGSLSGSL
LQVMDRTRNPMGARLLRRWLQRPLKKLTNIQERHNAVEELVENRTLRESV
AEQLAAINDLERSLARIATLRTIPREVRQLGISLAAIPTLQALLSDVTAP
RLQALTAALQPLPKLAEQIESAIDPDAGATMRDGGYIRAGYNEELDDLRS
IASTAKDRLMQIQQEEREATAISSLKVSYNKVFGYYIEISRANSDKVPAY
YEKKQTLVNAERYTIPALKEYEEKILHAEEKSLLLEAELFRNLCQQIATE
AATVQANAALLAELDALCSFAECAVAFDYTKPTMHEGTTLSITAGRHPVL
ERLLGAEESYIPNDCHFDDKQTMLIITGPNMAGKSSYLRQIGLIVLLAQA
GSFVPAESASLGVVDRIFTRVGASDNLTSGESTFLVEMNEAANILNNATE
RSLLLLDEIGRGTSTFDGMSIAWSMCEYIVHTIGAKTLFATHYHELAELE
ERLKGVVNYNATVVETAERVIFLRKIVRGATDNSYGIEVAKMAGMPNDVI
SRAREILAGLEKRDVEIPRQKAPKVNTMQISLFEETDNQLRNAVEAVDVN
RLTPLEALLELQKLQEMARSGGY
>Cag_1467 Histidyl-tRNA synthetase, class IIa
MSSFQCVKGTRDILPDESLLWSFVSSHFHHVASLYGFREIRTPMFEYTDL
FQRGIGATTDIVGKEMFSFQPDPAGRSITLRPEMTAGVMRAALQNNLLAQ
APLHKLFYIGELFRKERPQAGRQRQFNQCGAELLGVSSPAAVAEVMSLMM
HFFGALGLTGLTLKVNTLGNAEERLAYREALQAYFAPHRAMLDASSQERL
EKNPLRILDSKNPALQELIAAAPRLYDYLQEASLRDFEKVLFYLTERRIS
YTIDYRLVRGLDYYCHTAFEVTSNELGAQDAIGGGGRYDALARELGSATD
IPAVGFAVGMERLLIVLEKQGLLGNRHARPPRLYVVVQQQEMLDHALQLV
WRLRNGGIRSELDLAGRSMKAQMREANKLGALYALFVGASECASGKYGLK
NLATSEQTDLSIEAVMQLLHDHVTE
>Cag_0930 Phenylacetic acid degradation-related protein
MPQTSVFFAPVSLEEINTTQIVDGQMARHLGIEMVKIGADCMIARMPVDH
RTIQRIGILHGGASLALAETVGSIAASYCVDRSTHYIVGQEINANHLRPT
KSGYVYATATPLHLGKSSQVWDIKIRNEEGKLTCVSRFTVAVLRKQPSA
>Cag_0841 hypothetical protein
MGMPVRIDDTLYGQARAQAKAEHRTIAGQIEFWAMIGRAALDNPDLPIDF
VRDLLIARREGEAHSTPFVPEGHRS
>Cag_0375 ATPase
MVLLTVEGLEKKYGLKHLFEDVSFGVDERDKIGIIGANGSGKSTLLKILA
GVEQPDKGKLMVANHKRIAYLPQDSPYNPNDTVLQAILASSGKVMDLIYE
YELVCKKLEEHQGDSVALMERMSTLAHELDVCGAWELESNAKTVLGRLGL
YDLTARMGTLSGGQRKRVALAHALVVPSDGLILDEPTNHLDADSVEWLEQ
YIRRYQGAVLLITHDRYFLDRVATRMLELDGRTATTFTGGYSSYLQQKAE
LEEQAVRDERKRQALVRQELDWMRSGCKARTTKQKARMQRAESLVYSPKQ
EKSKELEIGFGAGRLGDKIIEFHKVSKSFGDKLLLKNFEYHLQKGDRIGI
IGANGSGKTTLFEMIAARTTPDSGHIEIGKTVRLGYYDQESRELDDSKRV
IESIQEVAEQITLKDGVVLSASKMLERFLFPPSTQYSLVKTLSGGERRRL
YLLRQLIASPNVLLLDEPTNDLDIPTLRVLEDYLDNYQGCLLVVSHDRYF
LDRTVEYIFAFEENGQVRRYPGNYTVYLEMKASIASAGEPKVAKKTTEPP
KPVAVQSTKPAALSSKEKRELEKLEVAIAAAESRQAEIAVQLSAAGNDFA
MQQQLGTELQALQQQLELDMERWSELAEKAG
>Cag_0628 conserved hypothetical protein
MESTSCPICNTNSFTPWLHVVDRFEPSTLWNIVQAVDSGLLMLHPRPTEA
EMAPYYAHAGYEPFLNSNKKSSLAERTLLFARSLLLHYRAMLIAKAREHP
LCKAHILEVGCSNGELLHCLQQKHHIPTAQLLGVEPDAASAEYARKRFGL
QVVDGVEKLPTTLFDTIILWHTLEHIHRVNETLAMLRERLTVNGIMVIAL
PNPLSYSARHYREAWIAWDAPRHLYHFTPTTLAALLKKHKLHIVKQQPYL
PDTLFNTLYSEQLQRQHNNAPSTPLPFANALAQVTTAIKISTKELREPNN
TSGIMYVVTHDA
>Cag_1305 Co-chaperonin GroES (HSP10)-like
MWSRMNFISHVVNNYSLKNQNERTTMNLKPLADRVIVKPAAAEEKTKGGL
YIPDTGKEKPQYGEVVAVGAGKIADNGQAIAMQVKAGDKVLYGKYSGTEV
SVEGEDYLIMRESDIFAIL
>Cag_0286 1-(5-Phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR) carboxylase
MKQETTPIVGILMGSDSDFEIMKEAYSLLQEFGIAAEISVISAHRTPRDL
EEYASSAVQRGLKVIIAGAGGAAHLPGVTAAMTVLPVIGVPIFSKKLNGQ
DSLYSIVQMPAGIPVATVGIDNARNAGLLAVQMLALSDEVVMAKLLAFRQ
QLADASRKKTEKIRAQLHG
>Cag_0931 Elongator protein 3/MiaB/NifB
MESTEQSTLPLVADPNEAVLEQLVREQKTRKKWLLVQPKSSTSMMVDSGT
VSMPLNLIMVATLAGKYFDITFIDERLGENLPEDLSGYDVVAITSRTLNA
TKAYRIGDAALKQGKKVILGGVHPTMLNEEASEHCTSVVYGEVESIWTEL
AIDVLKGTMKRVYKAKELKPMGTMQHPNFSYALASKASKKYSALIPILAT
KGCPVGCNFCTTPTIYGKSFRYRELDLVLDEMRYHQNRLGKQKVNFSFMD
DNISFRPQYFMTLLEEMAKLGVHWNANISMNFLDKPEVAELAGRSGCDLL
SIGFESLNPETLKTVHKGSNRLGNYETVVSNLHKNGIAIQGYFMFGFDND
TEESFQLTYDFIMKNRIEFPVFSLVTPFPGTPYFDEMKPRIRHFDWDKYD
TYHYMFEPQKMSGEKLLENFVKLQKEIYKGSSIMKRMKGKPFNWVWFVNY
MMHRFTKKLSPEVYL
>Cag_0480 putative ABC transport system permease protein
MPLSFARTSINKLSAQAKEFFFTMQEFFLFSLRAFMAIPKMGRYWRDVFD
QATICGTDSIPIVLVSSISIGALLAVEVGNLLEDFGAKTMLGRSTALSVI
RELGPLLMGLMLSARFGSRNGAELGAMQISEQIDALRAFGTDPVAKLVMP
RLVAALVMFLPLTALSDFAGLQSAAYMAEHYHHIDAGIFWNAVYPRLVLK
DFVLGFLKAPVFAVIITLVSSFNGFNARGGTAGVGRATIKGIVVSSGLVL
IANFYVSKLVLENM
>Cag_1700 Menaquinone biosynthesis protein
MNNRQITTLWSTLLVEALIRQGVDFFCISPGSRSTPLTIAAARNPRARTK
LFADERSAAFFALGYARATEMVAPLICTSGTAVANYFPAVVEASMDFQPM
LIISADRPFELLECGANQTIRQEHIFGSYTRWSMQLPTPSVEVPLTSLLS
TVEYAVAKALNTPAGVVHLNQPFREPFEPESVAANNHAWWQSAQQWLASK
AAHTTTTVEKKHPNNASITLLRQHLTTAKQPLLIAGSMRCKDDAEAVAAL
ANNLKIPLYADFSSGLRMKSNLPPLQLLMQSPAWRAAFHPDVVLHFGGNV
VAKHLATALREWQPAHYMVVREEPMRFSPDHNVTHRLEASIAATAHALHN
SRSTPLWRESEADHFFAQAAQELEGDVVADQPITEISAARLISRHIGTEQ
ALFVSNSMAVRDMDMYAASLHEAGIPTAINRGASGIDGILSTAAGFACGH
GKSTTLLIGDIAFLHDLNALSLLGSLTVPLQIVLLNNNGGGIFSFLPIAA
CDDLFETYFATPQHYSIPLAAETFGLHYANPTTNSEFVAAYHQAQQSPQS
TIIEVKSSRTNNLQHHRLLNARLQAIAAKLFNG
>Cag_1888 carotenoid isomerase, putative
MADKRADVIVIGAGIGGLTTAALLQERGIQTVVFEKNRYAGGSCSAFRRE
GYTFDAGASVFYGFGDNASSGTLNLHTRIFRKFGIKVATVPDSVQIHYHL
PNGFSVAASHNRQQFLAALKARFPHEAEGIERFYEELTAVCDILRAMPAG
SLEDVVHLASVGAAHPLKTVALALKSFRSMGKTARRYIRDEELLRFIDIE
AYSWAVQDATATPLVNAGICLADRHYGGINYPIGGSGAIPEGLCKAFQQH
GGTLNYQAEVVEILLEAGEARGVRLADGTVHYAKVVISNATIWDTFNRMV
KDVRYRVEEDRFLRAPSWFQLFLGVDSRVIPEGFNVHHIIVDNWQSYQQL
GGTLYFSAPTILDPSLAPAGHHIIHAFVTDEVACWSNYERGSSAYRAAKE
EKAAALIARIERIVPELSSAIKLKVLATPLTHERYLNRYKGSYGALLKPG
QTILQKPQNTTPVRNLYAVGDSTFPGQGVIAVTYSGVSCASYVARRFGKP
LEELG
>Cag_0120 conserved hypothetical protein
MEVVAKSKAHVVLDACFQESGQFFTDHERLMRCNPFCSNVRYLPAHNIFQ
WIFQVDDPRNNPFMAIFFVRQQEHPFSLDDEYGKNFVEKRIKNRERYNGN
GKRIYWEPVQEPPADVVLPKPAENGRSFVGTASSDICLLHHHDNKTSVYF
DTNITMDFDISFPLNLMPEGVLRFMTEAVMSQVMQQATEAMLCKVQADMG
CASSGLLPTE
>Cag_1354 nucleic acid-binding protein contains PIN domain-like
MMKKFAVVPDTNIFLASEKSVHSTSPNKEFVARWKREEFEVLYSEDTLLE
YITKLRQKGISETSIKKLLATLFALGREIKIEFYHLLHYPLDPDDIAFLL
CAENGKATHIITYDRHLKAIESSYTFRVCKPVEFLLELRHQYGIQPKA
>Cag_1901 3-isopropylmalate dehydrogenase
MNYKIVSIPGDGIGTEVVAGAVAVLRQLEKKYGFTVEIEEHLFGGASYDV
HGEMLTDATLEACKNCDAVLLGAVGGPKWENLPHEHKPEAALLKIRKELG
LFANLRPAKVYDALVDASSLKADVVRGTDFVVFRELTGGIYFGQPRGYDE
NKGWNTMVYEKYEVERIARLAFEAARQRQGRVMSIDKANVLEVSQLWRNV
VHAVHADYQDVELSDMYVDNAAMQIVRNPKQFDVIVTGNLFGDILSDISG
MITGSLGMLPSASIGSKHALYEPIHGSAPDIAGQNKANPIATIASVAMMF
EHSFKRTDIARDIEQAIEAALATGVRTADIAAAGDTAVSTTAMTEAIISQ
LK
>Cag_1445 Isopentenyl-diphosphate delta-isomerase, FMN-dependent
MTNPSAGTLTIERKQSHVELCLHANVAFSGKTTGFERFYFEHNALPEIAF
AEIDCSTTFLGRHIGAPLMVSSMTGGYSEASTLNRQLAEAAEHFQIPLGV
GSMRQTLESPLHRESFAVTRKYAPTTLLFANIGAPEVAQGLSQSDVAMML
DLLRADGLIVHLNAAQELFQPEGNTNFHRVLEEIHNLCATTNVPIIVKEV
GNGIGAAVAEQLMEAGVQALDVAGAGGISWQKVEEYRFLQQFGHEHRFSS
NALDELLNWGIPTTNCLLDIAELKRLQPQFQQIEIIASGGVSSGMDVAKS
LAMGAQLAASARHLLHALHAGTLTATIEQWLNDLKAAMFLTGAATVDALR
TKSLLNH
>Cag_1349 peptidyl-prolyl cis-trans isomerase, cyclophilin-type
MATVTIKTSMGDITVRLYDDTPLHRDNFTKLVNEKYYDGIRFHRVIEGFM
IQTGDPLSRFEEKRAMHGTGGPSERIPAEIKHPNKRGTLAAARDNNPQRA
SSGSQFYINQIDNDYLDGQYTVFGVVESGIEVVDEIAGVATDMNDNPITP
VLIETIVVAEAK
>Cag_1986 protease, putative
MSLLAIECTHEALSVALEHHGTIREVQSSEWKKAAESIVPLVQQVVAESD
ATFQALSAIVISAGPGSFTALRIGMAAAKGMAYALDIPLLPVPTLPAMAA
SLQASEDAVVAVIQARRGEYYYALYHATDVAANRWHNDVQRGSADVVVDA
ALMAARVGSVVVVGRKLLDLQQPLAEANIALLEADCFSARSLLAAAERQR
TAKQMVPLEQIVPDYQQAFVPTMGKMG
>Cag_0113 conserved hypothetical protein
MLVPFMLLCMAWLAPQHLRAQSTNAPIISVTVTPDTLLTGDRATCTVTVR
HAANHVATLLLPPRRSGQALDAAELVSQQRFSTENAQGSHEERFVLEFAL
FRAGQQPLPPLGVTMRQRANNFVVGTYPLTVPSVFVRALTDSTMRELRPI
KPPQPPSFPIMLIVPVLLAVFMVTAVGWLLFFVIHRVLGKQASNVDLGQV
AQRKLRKLGSRLSSGMPPHECYDELSNIMRTFLEHHYRIRALEAVTQEIE
RDLKKLGVAGYETILSLLRQADLVKFADMRPNVEESRQSLNKAAEIIRAT
RTVRSPEEPKEVEGE
>Cag_1799 hypothetical protein
MKFGFEIVPVERGHGGALYSLRFEAEEKTELDKFLDNEEIQACKEYESLV
ARLYDMVDSLGFRDYFFKLKEGSINDSVAAFHYNHGTLRLYCLRWSSILL
IVGSGGPKTTRTYQDDPLLSDAVGKLQMVDRLFDERQKSREIIIDPNTGI
ITGNLVFTSD
>Cag_1598 hypothetical protein
MLLPASCMALPTASSLMPHVQQAELSSWIFYAPLVAPVMALAVNVALPFY
QRAKRRRSIAKQRAARRERFGLATENSAVRGRVKHPDNYYRQWLARWCQQ
QQGALLPGGVAFESMDFTDLFVLPDVLLQGSTPTEQGGAVSFDTALSFAA
ESSRLLLLIGEDGGGKSALMQSLVLGNKPRLSGFSDKSQIFFLPLTALAA
IPTPFPALPDVLREWSGLQEQLSTSFFHSHLTTHRTLVLLDGLDALHNPE
QREAVCRWIAEAVGEWQKAFFVLTTSPDTFGEREQAALACHTRIAVLKQF
NNQQQHLFFRQWFRALRLRELCREHGDVAECSLTTIERHELEAEVCADAL
EYSLQKVEPTLMREVAATPLLLGILARFWEEHHYVPTTREALYDAALTYL
LKDAVQQPDCLHLLPPIAELKHLLGSLALWMQDSGVIAAPVEQVHAWLQP
EVERLNARFGDAYRVDALCTHLAAGGSVLGQQGGAYLFTHNSFRDYLIGE
QLLKEVGVEPTRAAKMAVQLGDGWWDEAFLFFAIRAEGLWFTRFLKALME
SPQSELLDEAARGLLLRMVERADPPVTYLRDKLRDRRTNRVRQLLLLECL
QVVGSVQAYQAAQEFIRHHRRADSEVLRKAAEVVIRQTSHRKIH
>Cag_0443 pyrophosphate-energized vacuolar membrane proton pump
MYGLAICVFGMIFGAVKFMEIRKLPVHSAMLEISELIYETCKTYLLTQGK
FILALWLLITPIIIGYFGWLRQVEVAKVVYILVCSLVGIAGSYFVAWFGM
RINTFANSRTAFASLAGKPFPTYAIPLRAGMSIGMLLISIELFVMLCILL
FVRPDYAGPCFIGFAIGESLGAAVLRIAGGIFTKIADIGSDLMKIVFKIK
EDDARNPGVIADCTGDNAGDSVGPTADGFETYGVTAVALISFILLAIQDA
AIQVTLLVWIFSISLVMIISSALSYGINGAFAKMKYANADEMNFEKPLIT
LVWLTSLLSIVLIYATSYVLIGYLGDGTMWWKLASIISCGTLAGALIPEI
VDQFTSTECAHVRNVVQCSKEGGAPLNILAGLVAGNFSAYWLGLVILALM
AAAFGISELGLGSMMLAPSVFAFGLLAFGFLSMGPVTIAVDSYGPVTDNA
QSVYELSLIENIPNISQNIQNEFGFQPHFENAKYFLEENDGAGNTFKATA
KPVLIGTAVVGATTMIFSIIMVLTNGLSDTAAIARLSILSPPFLFGLLMG
GAVIYWFTGASINAVSTGAYYAVEFIKQHIHLDGSQERASTEDSKKVVEI
CTKFAQKGMFNLFLTIFFSTLAFACLDSFFFIGYLISIALFGLYQAIFMA
NAGGAWDNAKKVVETELHAKGTPLHDASVVGDTVGDPFKDTSSVALNPII
KFTTLFGLLAIELAIELPPHIAQTLAAVCFAISLIFVHRSFFSMRIKVDE
H
>Cag_1989 Ribonuclease E and G
MKKSSTKQLLMNKTGDEIQVALVEEGRLVELIIERPESRRSIGDIYLGRV
HKVVEGLKAAFVDIGQKSDGFLHFSDVGTTNEDYRALIEDDDDDDAIIGD
DIESDEATGQNDSEFDEASDGETTVSAPKTVARSNSKKPSSGEQSGEKPQ
TYTQMIAGKLKPNDSILVQVIKEPISSKGSRLTSDITIAGRFMVLLPFGG
GQVAVSRRVVSRKERSRLKKLVRSILPEGFGAIIRTVAEDQEEALLKQDL
EKLLTKWKQIEEKLQDATPPQLIFKEDTIISSVLRDSLTSDVSEIVANSP
AIYKETLNYIEWAAPEMVKNVALYQGKLPLFEGYAIAKDVESIFSRKVWL
KSGGYIIIEHTEAMVVVDVNSGRYAAKREQEENSLKTNLEAAREVVRQLR
LRDIGGIIVVDFIDMLDPKNAKKIYDAVKTELRNDRAKSNILPMSDFGLM
QITRERIRPSLMQRMGDQCPACGGTGIVQARFTTINQIERWLRKYALQHP
LRFQQLDLYVSPTVLEPLQNSDMKTEMKWFLQHMLFVTVKGDESLRSDDF
RFYNRKNNKDITAEYGEL
>Cag_1255 Outer membrane protein and related peptidoglycan-associated (lipo)proteins-like
MTINHFTMKKNLLTLSLLLSATVPATAQINLKSIFDSSAKKAERNAAQRI
EKKIDKKVNNTFDSVENNLDNVKNNSENNNYAFQEPVSNSIPVKQQSTLS
WNKYDFVPGTEIIFEDDFTGERNGEFPSRWDITKGTVEIAEWGGEKVVWF
KNTNTNVPDAILPYLKNRSTDYLPDEFTFEMDVYFHADYRLNKDYYIFFY
DAKNQTKIFSPSKPIRIDYNSVTYNHIGDLYQGQNKLKPKEGWRHVAISF
NKRALKGYLDDARLLNIPNVEFNPTGIMISSHNSGGKGQPFVKNIRIAKG
AVPLYDKFLVDGKFVTTGITFDINKATIKPESMGTINYVVKMMQEHPELK
FSVEGHTDSDGADANNQTLSEARAQAIVNKLIESGIAKERLTSKGWGESK
PITNNETAEGKAQNRRVEFIKI
>Cag_1842 30S ribosomal protein S17
MEQTVPVRGRKKSWTGKVVSDKMDKACVIAVERRVQHPVYKKYFKKTTRL
MVHDENNEAGVGDLVRVVECRPLSKRKSCRLAEIVEKAK
>Cag_1804 ParB-like partition protein
MGEMQKKGLGRGLKALIPDELMPDEQVVVQPPETSPSEPATVGSICSLPV
EKIHANPFQPRKEFDATALEELKNSIIENGVIQPITVWRNGDIYQLVSGE
RRLRAVTLAGFKFIPAYLIEAPEDSAQIEMALIENIQREDLNAIEVALAL
KSLTTRCNLSHEEVARKVGKNRSTISNLLRLLKLPLSIQESIRNHDISSG
HARALINLPTEQQQLKVWKQILSQQLSVRQTEALVNRLANEQEQSTQPPR
ERATRLAALEAYLRDNLATKVKIVEKKDGKGEIHIQYFSHDDLERLLEFM
RRD
>Cag_0798 aspartate aminotransferase
MSLHLSNRHASVLQSEIRSMSIACSRVNGINLAQGVCDTPVPNEVLQGAS
EALQQGVNTYTHYAGIISLREAIADKQERFYGIRYQPESEIIVSAGATGA
LYAAFQALLNPGDEVILFEPFYGYHITTLQAAEAVPLYLPLTLPEWSFSE
HDLEQLVTPRTRAIIVNTPANPSGKVFSLAEMERIAAFAERYDLFVFTDE
IYEHFLYEGHQHHSFAALPGMKERTITVSGASKTFSVTGWRIGYALCDAR
WAQAIGYFNDLVYVCAPAPLQAGVARGMRELDDRFYNHLSVDYQAKRDRF
CATLAKAGLVPHIPDGAYYVLADVSALPGNSAHERAMHILNRTGVASVPG
SAFYQHGRGDGLVRFCYAKEDAILEEAYQRLERLREG
>Cag_0185 Glutamate synthase, NADH/NADPH, small subunit 1
MGKIKGFMEYKRALPADRQPLERIKDWQEFHEKMAPEALCEQGARCMDCG
TPYCHSGIMLSGMTTGCPIHNLIPEWNDYVYRGFWHEAYDRLNKTNNFPE
FTGRVCPAPCEGSCVLGIINPPVTIKNIEYSIIEHAFAEGWVTPKTIANR
TGKRVAVVGSGPSGLACADQLNKAGHSVTVFERDDRCGGLLMYGIPNMKL
DKVQVVERRIALMKQEGITFMEKTEVGVDYPAGKLLEEFDAVVLCTGATK
PRDLAEEGRQLAGIHFAMEFLTASTKALLDGTEPKLSAKGKKVVVIGGGD
TGTDCVATSLRQKCASVVQLEIMPKPPMERQADNPWPEWPKVFRVDYGQE
EAEALQGSDPRRYAMMTKKFLTTGNGNVSGVEVCSIAWENIDGRFVPTPI
PGTEEIIEADMVLLALGFIGAEENLLQQLQVAQDERSNIKANTQNYRTNH
ERIFAAGDARRGQSLVVWAIQEGRAAARECDRFLMGGTNLP
>Cag_0117 Anion-transporting ATPase
MRILTFTGKGGVGKTSVSAATAVRLASLGYRTLILSTDPAHSLSDSFNLA
LGPEPTKIKENLDAIEVNPYVDLKENWHSVQKFYTRVFMAQGVSGVVADE
MTILPGMEELFSLLRIKRYKTSGQYDAMVLDTAPTGETLRLLSLPDTLSW
GMKAVKNVTKYIVRPLSKPLSKMSDKIANYIPPEDALDSVDQVFDELDGI
REILTDNNSSTVRLVMNAEKMSIKETMRALTYLNLYGFKVDMVLVNRLLD
TKEDSGYLEKWKNIQQKYLGEIEEGFSPLPVKKLRMYEQEIVGLAALERF
AADMYGDSDPAKMMYDEPPIRFVRNKDVYEVQLKLMFANPVDIDIWVTGD
ELFVHIGNQRKIITLPISLTGLEPGDAVFKDKWLHIPFDLNQHVNKASQE
A
>Cag_1304 conserved hypothetical protein
MAFIKQISNAMDAQLLQITAQVLFEKLQSLRSKMQDDVATPDDRVAAAML
EEVVALARNRYCVAGNEAEFQQGKSVWVERTAATHFPHIRLHGGALEDDP
IA
>Cag_0711 Prolyl-tRNA synthetase, archaeal and euk type
MADKITSRSEDYSQWYIDLVRSAKLADYADVKGCMVIRPNGYAIWEKMQA
ALDRMFKETGHVNAYFPLFIPESFIAKEAEHIEGFAPECAVVTHGGGQEL
AEKLYVRPTSETIIWSSYKKWIQSYRDLPLLINQWANVVRWEMRTRLFLR
TTEFLWQEGHTAHATHEEAQEEVLRMINVYKTFAEEYMALPVILGKKSDS
EKFAGALETFCIEAMMQDGKALQAGTSHDLGQNFAKAFDCKFQTHEKTLE
YVWATSWGVSTRLIGALIMAHSDDRGLVLPPRLATRQVVIIPILKGDKES
VLHHADNIAAALTKAGISAFVDSSEQNSPGWKFAEYELQGIPLRLELGPR
DIKNGMCVVARRDTLEKTEIALDDRLVMSINEILNDIQQDMFDAALRFRQ
ERTVQVNNYDDFKVAVEKGFVIAHWDGTVETEAKIKEETKATIRVLPQED
DYCDTYGINEPGTCIYSGKPSARKVVFAKAY
>Cag_1940 Rare lipoprotein A
MPKHIYKLWLILPLILSACTTTRAPFRAISPEEAYQQGKLKQNPYVINGT
TYLPLRYEEALAYEENGLASWYGKETLIQNNYQLTAYGEVFDPSKPSAAH
KYLPLPALVRVTNLDNNNSIVVRVNDRGPFIGDRVIDLSAEAAKRLGFYE
KGMARVKIEVLNK
>Cag_1677 Ric1 protein
MIDIRRWILVFIMPPAAVLNKEAGTIMLAGLLTVAGWIPGVVFALFLMVQ
EMLQAKKQVTA
>Cag_0782 heat shock protein HslV
MKHTSLPEIRSTTVIGVIRNGQAALGSDGQMTLGNTVVKHSTKKIRRLYQ
GKLLAGFAGATADALTLLDRFEAKLEAFGGKLERASVELARDWRTDKYLR
RLEAMIAIVSNDRALIISGTGDVIEPEDGIVAIGSGSMYALAAARSLMKH
TSLSAQEIVQESLAIAADICIYTNNHIVVEEL
>Cag_0103 cytochrome c-555, membrane-bound
MSMKPFLPFVAASFLFISACGLEKPPASLELPSKEKEATTEAAPAPAAPA
AAPGDPLAAGKTVYEGSCGGCHDAGMMGAPKPGDKAAWKDRLPQGVEAMA
KKSIEGFQGKAGMMPAKGGNASLTDEEVTNAVAYMADLSK
>Cag_1462 Initiation factor 2
MTLGESEKRYRISDIARELQLSPQEVLQFVKQQGVKVASTSSMVNEEVHG
LIINQFSAEKKMVDETLKIRAEKEKRLTRLEEQSRKTLEKEQHLMEAISP
TVRASKSSAKGSESAPKSEPKKSKQAVPAAAMVDDVPAAVVQQVVAEPEV
VEPTPVVEVEAPALEPAIISEHVVDAEPIENAVTVAPVEVVVNEPIETVE
SVEPEFVVAEVTPLEPIAQSTIEIVAEGESVEVAEALHVAEPVVAIPPIT
ETAELSDSTVEPIEPIASVPSTAPAPPARREPTVNENLVSFAAPQMMGGL
TVVGTLDMHTGRGRKNRKKNFREQADALKGEFEVKAAAPVASENKTEAGV
AKKSKPAAEVKPKPATTTAADDAKKAKKGKKKKKPDVDEKVISANIQKTI
SGIDDRSGTGSRQKFRKMRRSEREREQEEGAAQRELEQSIVRVTEFASPH
ELAELMGITAKDIIQKCFGLGKFVTINQRLDRESIELIALEFGFEAEFIS
EVEATAVETEADAEEDLQTRPPVVTIMGHVDHGKTSLLDYIRKSRVVAGE
SGGITQHIGAYEVTVDGDRKITFLDTPGHEAFTAMRARGAQVTDIVILVV
AADDNVMPQTIEAINHAKAAGVPIVVALNKIDKSEANPDKIKTQLSEAGV
LIEEWGGVYQCQEISAKKGIGIVELMEKVLTEAELRELKGNYSREVLASG
VIVESELDKGKGVVSTVLVQRGVLKVGDPFVAGNSLGKVRALMDERGKRI
LLAFPSQPVRVLGFEDLPQSGDVLTVMASERDARDLAQKRQIIRREHDFR
RSTRVKLDSIARQIREGVMKELNVIIKADTDGSIQALADGLMKIQNDEVK
VQLIHQGVGQITETDVLLAAASDAIIIGFRVRPNVNAKKLAEKEDLDIRF
YSVIYHVLEDVETALEGMLSPELHEESLGSIEIRQIFRVPKVGNVGGCYV
LEGKVPRDAKVRLLRDGVQIYEGQLAALKRFKDDVKEVDSGYECGLSLKN
YDDIKVGDVVEAYRIVEKKRKL
>Cag_1917 hypothetical protein
MDFGWLSIHTSVAEAWSLILRNARSKTPESIYRNLNFQDARISRIFCVGA
KTYVFTCLLRILGNRKGLYLHCKRNHGSDFSGNMRFNTK
>Cag_1993 ribonuclease HII
MHTHYEEPLWQHYEFICGIDEVGRGPLAGPVVAAAVVFPRWFQPTEALLT
LLNDSKKLSAKERESLVPAIKAQALHWALAEVQHNVIDEVNILQATMLAM
NNAVKALPIIPSLLLVDGNRFTTDLAIPYKTIVKGDSHVFSIAAASVLAK
VHRDALMCVYATHYPHYGFERHAGYPTSAHIEAIRQHGRCPIHRQSFKLR
QLGEKV
>Cag_2019 exopolyphosphatase, putative
MTHRVACIDIGTNTALLLIADLNPTNGTIHPLCHRQTIVRLGKNVDAHKV
IDSAAQTRLMACMQQYRQLADEHGCSSIVAAGTSALRDAVNRDEVVQAVE
NASGITITPIAGESEAALTFMGAVAGVQEVPERFVVIDIGGGSTEISIGS
MQGIEHGISLNIGSVRLTERFFSSLPPSEEEFEAAKLYITMHLTTKLFPF
FAGREAVYGVAGTVTTIAQVTQGMRHFDAERVQHYPLQYQEVRNFLEKLK
SSSLEEIIAHGIPDGRADVITMGTLILHQFMRLLGAAEVRTSIQGLRFGL
AQQELVRLQMKA
>Cag_0587 conserved hypothetical protein
MVKSIIYLEGGGDSKELRSRCREGFRKLLERNGFKDKMPRFVACGGRNTA
FSDFKVAHEQKIYTFVALWVDSEEPLEDIHKTWEHVQKRDGWEKPHNSID
EQLLFMTTCMETLIAADRETLQQVFKPLQESALPSLYNLEKQPRHELYQK
LKKATQGCAAPYEKGKISFEVLGKLNAETLQQHLPSVARTWHILQQKL
>Cag_0834 hypothetical protein
MLPQSKVIIHPSFKKKIPMKSLLLAFAICVTITALCFVTLEAVGMPQDIS
KTISVMVLGAFPKLREMLEKMEGERSGGAVAVAKVQSFGDFNVSTSRALL
YVTIVGFVALEFASGITGVVLALLGAQLSNIGVALQLLTMIIAYPIIFLA
GRWIGRRCSQQPYMVAALAGLTIRLSTTIFDIAMVPMEQLIQIYQGQMQL
SMIIVSQVGGSVLFALLLMAGAFVGSRQRLVVYVQYLLSRISPQSRLALV
DLAHEEAVKMQKESGGK
>Cag_1356 cytochrome-cbb3 oxidase, subunit I
MNNSVYDDRVVRGFAFSALFWLVIGLIIGLWMAVEMFNPALNLTPWLTFG
RLRVVHTNGLGLGFGLAGIFATAYYMLQRLTRTPLLFPRLAQAHLYIFNI
AIALGALSLFAGMNTTREYGELEWPLDIVVVIFWVMFAVNVFGTLIKRQE
QQMYISLWYILGMTVAIAILYIVNNLAVPVTLFKSYTIYSGANDANVEWW
YGHNIVGFIFTFPILAMFYYFLPKSTGLPIYSHRLSIISFWALNFAYLWT
GAHHLMLTPLPEWIQTVAMAFSIFLIAPSWGSVVNGYYTLQGNWDQMRSN
YLVKFFIAGITFYGLQTLQGPLQGLRTLNAFFHYTDWVVGHVHMGTMGWV
TMVISASFYYMIPRITNRELYSIKLANLHFWLILIGQFTWTITMWIAGIQ
QGAMWKATNPDGSLMYNFLDSVASLYPYYQLRFGAGIVYFIGILVFIWNL
IMTVRQPQKAEA
>Cag_1341 Excinuclease ABC, C subunit
MEPLDALEKHGDIKKVLTEKLATLPTSPGIYQFKNSAGRIIYVGKAKNLR
NRVRSYFRNSHQLFGKTLVLVSHIDDLEVIITSSEVEALILENNLIKELK
PRYNVNLKDDKTYPYLVITNEPYPRILFTRHRRNDGSIAFGPYTEARQLR
SILDLIGSIFPVRSCKLRLTPDAIASGKYKVCLDYHIHKCKGACEGLQPE
DEYRQMIDEIIKLLKGKTSALIRSLTENMHLAATELRFEQAAEIKAQIES
LKRYAERQKVVAADMVDRDVFAIAAGEDDACGVIFKIREGKLLGSQRIYI
NNTNGESEASMQLRMLEKFYVESIEPVPDEILLQEALSEEEEETLRAFLL
VKAKNEGQEKKGIRLVVPQIGDKAHLVGMCRQNARHHLEEYLIQKQKRGE
AAREHFGLTALKELLHLPTLPQRIECFDNSHFQGTDYVSSMVCFEKGKTK
KSDYRKFKIKTFEGSDDYAAMDEVLRRRYSGSLTESLALPDLIVVDGGKG
QVNTAYKTLQELGVTIPVIGLAKRIEEIFTPHSSDPFNLPKTSPALKLLQ
QLRDEAHRFAITYHRKLRSDRTLQTELTTIAGIGEKTAFKLLEHFGSVES
VAQASREELQAVIGAKAGETVYTFYRPEG
>Cag_1385 hypothetical protein
MVIFPISLSFPRKRESNSLILLGFRRGNETTIIILLSGFQKKSQKTPQQE
IDKAERLKKEYFDGKNKQ
>Cag_1924 sulfur oxidation protein SoxZ
MRVKATLQNNVVSVKMLLQHVMETGRRKDEAGALVPAHYITEVTATHKGE
TVFHAELGAGVSQNPYLSFQFTGASAGESLTISWVDSKGMSETADSVISA
V
>Cag_0572 conserved hypothetical protein
MSLRCGIVGLPNVGKSTLFNAITAKQAEAANYPFCTIEPNVGTVLVPDER
MQQIANVVKTPTLVPAVLEIVDIAGLVKGASKGEGLGNQFLSHIREVDAI
VHVVRCFEDENIIHVQGKIDPVDDIATIETELMLADLDSMERRMDRLRKN
VRKEKELQQQVDVAEKIIAALSEGTPVRKILETPEEQQIAKQFFLISSKP
ILFAANVAEGDLPNGNAHTAKVAEIAAEHGAKMIIISAKTEAEIAELPEE
ERPEFLESLGLHSSGLDRLIQSAHDLLGLQTYFTAGEKEVHAWTIRKGAA
APEAAGAIHSDFEKGFIRAEVMYYKDLLELGSEHKVKEAGKLRSEGKEYI
VRDGDIIVFRFNV
>Cag_1277 hypothetical protein
MSLTLQKEDAHKLIDQLPMDATWDDLIHEIYVREAIEHGLRDSQTGATKD
VHEIRAKYGLPL
>Cag_0418 hydrolase, haloacid dehalogenase-like family
MLKKLVLFDIDGTLLSMTSANRRILADALLAVYGTAGSSYTHNFAGKMDS
AIIYEVLAADGLTRKEVASRFDLAKEMYITFMQKVARVDDVTLMPGIVEL
LDALAERDDVLLGLLTGNFEGSGRHKLHLASINHYFPFGAFADDAEHRNQ
LPAIAVTKAHQRTGITFSEHNIIIIGDTEHDIACARAVQAKSIAVATGTY
APHELEAHEPHLLYHNFSDTQAVLNDILSH
>Cag_0913 conserved hypothetical protein
MQKFQTRFLEEADKFISELDSKAAKKIFYNIDLAEQTNDPKLFKKLQNDI
WEFRTVFAGLQFRLLAFWDKSDNTDTLVFATHGFIKKVDKVPKNEIDRAV
RIKEQYFENKLKK
>Cag_0905 D-alanine--D-alanine ligase
MSRTTVALFFGGQSAEHEISIISAQSIAAHLDTERFTLLPIYITHSGEWL
CDGFARTLLTTNLASKLRGSSREETAAALQQMVRNAAQAPCNRNLAALGV
DVAFLALHGSFGEDGRMQGFLETCGIPYTGCGVLASALTMDKALTKLCVA
DAGIAVAQGTNILSADYLANPNAVEASVEAQVSYPLFVKPASLGSSIGIS
KVHNREELHPALQAACALDWKVVVESTVKGREIEVAVLGNADPIASVCGE
IEPGKEFYDFQDKYMGNSAKLFIPARIPESLQEEVRRSALTAYRALGCSG
MARVDFFVDESTNSVVLNEVNTIPGFTDISMYPQMMEASGISYRNLITRL
LELALEPLRR
>Cag_0912 hypothetical protein
MKQRIYIDTSVFGGNFDEEFKEHTIPLFDRIKEGEFIILYSTVTQDELEN
APEKVKELVKSLKADLTEFIETTAEAVDLATEYITEKVVGQTSFADCLHI
ALATINRADFLVSWNFKHIVNIERIRGYNSINIKNGYKQLEIRSPREFEK
YEDD
>Cag_1586 Adenylylsulfate reductase, beta subunit
MPSFVIKEKCDGCKGQERTACMYICPNDLMKLDVDRMVAWNQEPDQCWEC
YNCVKICPQQAIEVRGYADFVPLGGNVIPLRGTDAIMWTIKFRNGILKRY
KFPIRTTAEGSIEPYAGKPEPDYANLKKPGFFNMTEYPTI
>Cag_1854 Translation elongation factor G
MARLVALDRVRNIGIMAHIDAGKTTTTERILYYTGRLHRMGEVHDGGATM
DWMEQEKERGITITSAATTCFWTPKFGNYQGINHRINIIDTPGHVDFTVE
VERSLRVLDGAVALFCAVGGVEPQSETVWRQANKYGVPRIAYINKMDRTG
ANFFDAVKSIRERLGANPVPLQIPIGEGEMFAGFVDLIRMKGIIYDKEDG
STYQEVEIPHDLQNEAKAWRINMLEAVSEHDDTLLEKYLNGEDITDEEVR
KVLRQATLQVTIIPILCGSSFKNKGVQFMLDAVIEYLASPVDVPAVEGHH
PRTEEPISRKPTDEEPFAALAFKIATDPFVGKLTFFRVYSGVLKAGSYVL
NTITGKKERVGRVLQMHSNKREDIEAVYCGDIAAAVGLKDVKTGDTLCDE
NNPVVLEKMVFPEPVIQIAIEPKTKSDSDKLGMSLAKLAEEDPTFKVKTD
EETGQTLIAGMGELHLEILVDRLRREFKVDANVGKPQVAYRETIRKSVEF
EGKFVRQSGGKGQFGLVNLKVEPLEEGKGYEFVDAIKGGVIPREYIPAVN
AGIQEAMKGGVVAGFPMQDVRVTLFDGKYHEVDSSEMAFKIAGSIGFKGA
AKKADPVLLEPIMKVEVVTPEEYLGDVMGDLSSRRGHIEGMGQRAGAQFV
AAKVPLSSMFGYSTDLRSMSQGRANYSMEFECYREVPRNIAESLQEKRTS
KDAE
>Cag_0097 Ribosomal protein L9
MKVILRKDVATLGDAGDVVIVKNGYANNYLIPQSIAIRATEGTLKALETE
RKQQARKVEMQRKHAREQAQKIEQLALKVFARAGESGKLFGTVTSADIAE
ALKAQGFEIDRRKITIEAPIKALGKFEAAVKLFSDVTVAVQFEVEAEGME
A
>Cag_1324 conserved hypothetical protein
MTISTVFINNHSQAVRLPTCVRLPDTIKKVSVRVNGNERIIAPVGQMWNS
FFLGSSKVTDDFMEERNEQAQPEREEL
>Cag_2010 transporter, putative
MTQPPDTPPFMDTESSSPQGKSAITRSLLKAFPAFANPDFRRYFPGQVIS
MIGTWMQMVAQGWLVYELTGSAFDVGMAAAATTFPTLFLSLFGGLLVDRY
PRRTILFWTQSSAMLLAFILGIVTMTGTVTMGIILLLSFLLGCVNAINVP
ALQAFLSEIVRRDHLPSAIAMNSAIYNSSRVIGPALAGWLIAYSGAGIAF
IVNGFSFFAVLLSLFTMKTKRRAPTVIESNPLLAIREGVLYAWNHKLIRL
CIYYIAIVSVFSWAYVSMLPVIAKQRFGMDASGMGSLFGISGIGSVMGTI
MVSMLANKIQPLRFIAIGSLIFAVALLGFTLTENLPLAMVGLFFAGFGLV
AAVSTLSATIQGAVEDRFRGRVMSLYMMIFMGFMPLGNVTIGYLSDLFGT
GFAIQLNCIVTIIAALLLLVHSKQFLRIG
>Cag_1256 TPR repeat
MNYSRHTEEVVHSGYQSVEASQAAEKTLELLHQALQLHQDGRLEEANALY
LQTLELQPKNLDAVELLAAAALQKSHFDDSDAYRLLTALSQQQSSFLDIV
ALFESSATGESAPEQSLNELGAALHQLKFYKEALVRFERALQHNPVYLEA
HFNRGNTLFVLERYDEALQSYSKAIELKKDYAEAYYNAGTLLFLLKRYEE
ALAHFDAALAIRPDYAEARTNRDYVQKELDALDAKTHHVHTAKLKAIAGT
HPTSRLIDVLTFKNIFSLPGLVLLLGTFFCLSPWASPPVALLLGIVCAQL
FDHPFMHLGHKVSSMLLKASVVGLGFGVNFNNAVKAGSDGFVATVVSIAF
TLLLGYVFGKIFSVEKKTSYLISTGTAICGGSAIAAIAPVIDAEENEVSV
AVGTIFILNAVALLIFPEIGQWLHMTQHQFGLWSAIAIHDTSSVVGAASK
YGHEALLVATTVKLARSLWILPLVIASSFLFKSKIKKLKAPWFIAMFIGA
SLFHTYFPEFHPYTEFMVPLAKSALTVTLFMIGTGLSWTILKGVGYRPLV
QGLLTWIAVSIVTLIMVMSMASF
>Cag_1493 Protein of unknown function UPF0074
MKILTKKADYAIRALLMLAAKEGGFMSAKAIAEAQDIPYQFLRSLLQELM
HHHLVASKEGAHGGFRIETNPDNISVKQLIEIFQGEVQVSECMFRKQVCR
NRSSCVLRHEIMRIEEVVSHEFGQLTIGSLLRKLQMQQGTGAA
>Cag_0537 Glutamate synthase (NADPH), homotetrameric
MTMDAKTITTKQRLAIPRQAMPAQDPAVRTHNFLEVNLGYTPELAQQEAL
RCIQCKDPVCIKGCPVNIKIDQFIKLIAEGDFLGAARKIKEDNVLPAICG
RVCPQEDQCEKVCVLTKKYTPIAIGNLERFAADYEREHGDIELPSVKAPT
GKRVAVIGSGPAGLSCANDLIQLGHDVTVFEALHELGGVLMYGIPEFRLP
KEIVRTEIDGLKKLGVKFVTNTVVGRSVTVDELMEEENFDAIFIGVGAGL
PWFMGIPGENLLGVYSANEFLTRVNLMKSYNFPDNDTPVFNCEGKNVAVF
GGGNTAMDAVRTAKRLGAKNAYIVYRRSEAEMPARIEEVHHARSEGIEFL
MLMNPLEFIGNDEQWLTGAKCLRMELGEPDDSGRRRPVPIKDSEFILPID
MAVISIGNGSNPLIKQTTPDISVSKRDTIVVDLNTMATSKENVYAGGDIV
TGGATVILAMGAGRTAAAAIHEKLMGGSAS
>Cag_1048 hypothetical protein
MVTWGSNHGSNIYGQIFKNDGSKEGNQFQINTYTSQQPDWNPFEKYAPSM
AALANGGFVVTWNNYWQDGDGSGIYARIYDNSPSTGLLSLDLLDTNWSND
SDNIYRATGRATLGLLSGNSSMLYIQDAQYTLTGNALIIEGEFSAIIEGT
TRSLFIGKAVFDVTTGFAELTVGSNLHNPAGLGFEFTALDLSNNYIDIHF
SPLELPAGISNTNIILGSDSFVIKENYPQLGFYGSVNFPDKTFELFDMLT
VHAYDWSVSYDSPDNELHILGGFELQTGWDNVPGINAELTGDGLVIRNGE
FADVALTITLDDFSVKGWGFNNVSVTLDTEKNSIVGSAGIKLPMFASSLE
TTIGFIVNPFELDTALFHIPFINPGIALGTTGWFLTAIEGGVSNLASSNK
EPLLFKGGVELQLPEVMDLSVNGELDSKHIAGFVEGTIIDKDAIDFQGQV
TLNWNKDYVRVNGSASFAQGMIVGDFGFTSNLNLDFTAKGSATVKFDTID
QILSGHYYLNYSNDHKDSNDYIAAWAETVLHIPVFGDKTVTIGTKYSFDG
TWRAFGAAEVPLYSSWIVDETISDLMVTVNWDNPVNDVETRVVVYDDLEK
TKIRQIISEAEYGEHDIAIISEWSGSTAKVIYINTPEAGLWDVEVINSDG
LGEVVYSATTSLKPLVLTVGELNLNNDQLSLSYIANTPETDGVISFYLDD
NDNGFDGQMIESMADPDGNGQWVWNTTGFHGGTYWLYATLADGKSAPVMS
YAAQSIFINNAPVALDDSIITNEDTAVVIDVLVNDSDFDDNPLRISSITT
PSNGTVSVTDDNKILFAPYADTYGESIFTYTITDGYGGESTAVVNITINS
MPDAPKGSVIIEGHLKQGEILRADVSTLSDSDGMGTIAYQWKVDGTNIEG
ATNETYTLTAAEVGNIVTVEVSYTDGNGKLESVASLATNAVTPINAPNHH
DLDGTITFWQTGDALADVATTLTQPNNGAATATTDVHGYYQIPDVQSGTY
QLSAIKATDTATTNAVTTDDVHAVLKIAAGINPNPDGSAVSPYQFLAADI
NHDGKVRAADALLLLKMIVDYEGAPEPQWYFAPYNIGNEATMDRSHVDWS
VTNPQTVTINDNTTVNLIGILTGDVV
>Cag_1487 putative plasmid maintenance system antidote protein, XRE family
MATLRNIHPGEILMEEFLLPFGISLRKLSYDIVISQQQLEAIVEGRERIT
ADIALRLSQYFGNSAKFWLGLQDDYDIEEGMEQNKVDILSILRFKQPVST
SVE
>Cag_2026 glycerol-3-phosphate dehydrogenase, NAD-dependent
MTIAVLGAGSWGTTLAVLLARKEYEVRLWAHRSEFATALEQERENRRYLN
GVHFPDSLHIVATLPELIAWAEVIVTAVPAQALRETVRAFRDIPLEGKIV
VNVAKGIELGTGMRMSEVLLDELPQLQLSQVVALYGPSHAEEVSKEQPTT
VVASSPSRNAAETVQELFHTNRFRVYINTDIIGVEVAGSVKNIIAIAAGI
SDGLGFGDNAKAAIITRGLAEISRLCTKLGGDAMTLSGLSGIGDLVVTCL
SHHSRNRHVGEEIGKGRTLEEIIHSMSMIAEGVHSSKAVYELSRKVGVDM
PITRAVYEMLFEAKPAAQAILDLMNRDPRSERD
>Cag_0682 transposase
MMHPSPDHMVHYGVEGNCECGLALSESAISIGECRQQWDIPAPRIEVTEH
RQLIATCRCSKVHKGEFPSSLPPYISYGARLKAYTVGLVQGHFISLARVT
EIVSDQYGVKPSDGSVQRWISQASKNLTTTYTAIGETISNSAVAHFDESG
IRAQGKTQWLHVAATTEAVYYTAHAKRGQEAMSAAGILPLFNGVAVHDHW
KPYFRFDHVLHSLCGAHLLRELNAFDETLQHRWPVQLKQVLIDAKNAVAQ
AKKAKQTSLPPEQIADLKQRYEQWLNYGLLIFSERPKINKQQGKGKQHPA
RNLLCRLRDFKDSVLRFIERFDVPFDNNTAERAVRPVKVKLKVAGGFRAM
GGAEAFCVIRSVWQTDKLQQQNPFETLRLVFR
>Cag_1670 glycosyl transferase
MKIALYAGTYVKDKDGAVRSIYQLVNSFKKAGVEVVVWSPDVDPTYNHGS
LVVHQMPAMPIPLYPDYKLGFFSRATRQQLDAFAPDIIHISTPDIIGRTF
LLYAKERAIPVASAFHTDFPSYLEYYHLGFAVKPTWRYLRWFYNKCDVTL
APNESVQQKLESHGITNVASWSRGIDKELFDPSRRSEAQRATWKVDGKTV
FIYAGRFVPYKDTEVVMQVYERFMQSDYANRVAFVMIGSGPDEEEMCRRM
PDAIFTGYLTGADLPTAYACGDLFFFPSTTEAFCNVTLEALACGLPSIVS
DVGGCRDVVERSSAGLVARSGNSDDFYAKCLELLNNPERYQVMRERGLAY
AEQQSWAAVNGALIERYRRMVNQAQR
>Cag_0102 transcriptional regulator, XRE family
MTDKAFKYWHSMSDQALAAHIGNFVKHHRLEQNKTQNALSHEAGISRSTL
SLLERGETVTLATLIQVLRVLNQLQVMDAFEVQQRLSPLVLAKAEQEKRK
RARSTKTQDSEKSDW
>Cag_0565 Adenine phosphoribosyl transferase
MSRIVQLAPCFFTNHKIHTSMPIKSRIRSIPDYPKKGIMFRDITTLIKDP
VGFRLVIDHLTQHYLEAGMDFDVIVGIEARGFIIGGALSYTLGKGFVPVR
KPGKLPADVVSQEYELEYGSDKIEIHTDALVEGQRVLLVDDLLATGGTAL
AAAALVEKVGGIVAEMAFIVNLPDVGGERKLLEKGYNVYSLTDFEGD
>Cag_0810 DEAD/DEAH box helicase-like
MNETETRAELIDPALKVAGWGVVEGSRIRMEFPINKGRLIGYGQRSKPDK
ADYVLQYKNRNLAIIEAKARDKYYTEGVGQAKDYAGKLQVRFTYSTNGIK
IYRIDMQKGEEGDVASFPSPDELWAMTFADQNKLQPLTAIWRDKLLSVPF
EERGGTWQPRYYQENAITKVLDAIAEGKQRILLTLATGTGKTAIAFQITW
KLFHAKWNINHDGKRSPRILFLADRNILADQAFNAFSAFDEDALIRINPH
DIKKKGKVPKNGSIFFTIFQTFMSGPNDTPYFGEYPQDFFDFIVIDECHR
GGANDESSWRAIMEYFSAAVQLGLTATPKRTTNNITNSDTYKYFGEPVYI
YSLKDGINDGFLTPFKVKQIDTTLDEYIYTSDDTVLEGEIEEGKRYTEAE
INRIIEIKEREEYRVKIFMNLINQKEKSLVFCATQLHALAIRDVINQYAE
SKNPNYCHRVTADDGKLGEQHLRAFQDNEKSIPTILTTSQKLSTGVDAPE
IRNIVLLRPINSMIEFKQIIGRGTRLFDGKDYFTIYDFVKAHHHFSDPEW
DGEVELPPPTPPKEKKPCAICGQIPCICKIKEDPPEPCPICGYSYCRCNT
QQKQIVKVQLADGKVRQLQHMVNTTFWSPDGKPISAEEFLHSLFGTLPEL
FKNEGELRTIWSKPDTRKKLLEELSEKGFAKQQMIELQKILCAENSDLYD
VLAYIAFHSNIIERKERAYKAKIYLYNYDLKQQEFLNFVLNQYVQQGVDE
LDEAKISDLLVLKYHAIADAKKELGDISTIRNIFIEFQEYLYWQKVS
>Cag_1947 DsrK protein
MSKYAPKVSELNEEFEKKKPNILKENYSGKEWWDLPVEFRDGNWSFPAKP
EVLEELHFPNPRKWMATDADWQLPAGWEKTIKEGLRDRLKRFRSFKLFMD
ACVRCGACADKCHFFLGTGDPKNMPVVRAELLRSVYRNDFPLAEKIFKGF
AGSRKLTPEVIKEWHMYFNQCTECRRCSVFCPMGIDTAEITIMGRELLNL
IGVNNNWILAPVANCNRTGNHLGIEPHTFVQNIESLADDIEDLTGVTVHP
TFNRKGAEVLFVTPSGDVFGDPGVYTMMGYLLLFEHIGLDYTISTYASEG
GNFGFFTNNEMMKKLNAKMYHEAKRLGVKWILGGECGHMWRVVHQYMNTM
NGPADFLEVPISPITKTKFEQAAGTKMVHIAEFTADLIKHNKLKLDPKRN
DHLRTTFHDSCNVARGMGMFDEPRYVLNSVCNTFHEMPENTIREQTFCCG
SGSGLNPEEFMDMRMRGGFPRANAVRHVKDKHKVNSLVTICAIDRASLPS
LMRYWNPGITVYGLHELVGNALVMKGEKKRTEDLRENPMAGFEDGDDDE
>Cag_1698 Outer membrane protein and related peptidoglycan-associated (lipo)proteins-like
MFFCVNYKPTSLMKKSVSPFVRSVMVPGMLLMGACTCQPVALEPALQPAP
APAPVVVPPPPPPAPPAPKPAPAPAPVVVAPPPPVVVAPPPPPPAPIVVT
KAMILGDILFDFDKSFIRKDAVPQLQDVAAWMKEHPTKNVTIEAHCDSKG
SEAYNIALGKRRADAAKAYLVNKGIDSNRLKTISYGKDKPLMNGIDESAR
ARNRRVHFVVE
>Cag_1333 4-diphosphocytidyl-2C-methyl-D-erythritol kinase
MQAKAFAKINLGLFITGKRPDGYHNLETIFAPINWYDTITFEAADTISMS
CTNLDLPVDENNLCIKAARALQQAAGVQHGIAMTLEKKVPFGAGLGGGSS
DAATVLRVLNHLWQLNLPHATLHNIAVKLGADVPYFLFSKGIAYAGGIGD
ELEDLQTSLPFAILTVFPNEHIATVWAYKNFYRRFELQRPNLNTLVKNLC
TTGNTSALPSFENDFEAAVFDHFPTVRDVKTMLMENGALYASLSGSGSAL
FGLFANEAEAYAAIESLPATYRTNITPARFVMDDGTGL
>Cag_0245 hydrogenase/sulfur reductase, alpha subunit
MSIERNLDINIHHLTRVEGHANIHIHVKNGELVEAQWAIIETPRFFEVML
KGMSAEQAPFLTSRICGICSISHSLANIRALERAMQITPPPLAEVVRRLA
MHGETLQSHALHLFFLAVPDFAGLPSVLPLMESHPQVVRAGLQLKELGNS
ISIAAAGRATHPVSLVLGGITKAPELQQLKELLTMTQERKLALAQAVTFF
EELHIPNFTRETEFISLCNGVTYPAIGGQLVSSDGITRDENDYLLMTNEY
TRDFSTSKFTRLSRNSSAAGALARFNNNYHLLHPNAQAVAARFGLQPVCH
NPFMNNIAQLVECVHILEDAEELIGMVMDMKGEETKCHYTPQAGAATGAV
EAPRGILYHYMETNDEGMVVRADCIIPTTQNNGNIYDDLQALAAELLREG
KCDGEIQLQCEMLVRSYDPCISCSVH
>Cag_1846 Ribosomal protein L22
MEAKAVLRNTPTSPRKMRLVAGLVRGKQVNQAKAILLNSTKAASKNVMLT
LKSAVSNYTLNNPDERVSDQELFVKAIFVDGGMTLKRTLPAPMGRAYRIR
KRSNHLTIVVDKVKNPVIN
>Cag_0080 conserved hypothetical protein
MEAVWLKQSGADELLLFFNGWGMDRRSAEYLYHIVIRDGWRGDCLSFFNY
KDFAIEPSLIDAISNYKRCNLLAWSFGVWAARHVALPPIECAIALNGTIF
PLDAERGIAPELVAATCNGWSESNRQRFERRMCYSRQLHEQFADITSQRT
VADQQAELATLQPLMLVSQAAALIPSPKSSPLPQASSQLSRSASAAIAPL
STWHYQHAIIGGRDLIFPPQAQQTAWQGTPTTFIADMPHLPFFHLPALTE
LLAWHNV
>Cag_0452 fic family protein
MWEELYILKKRYLEMGLSEAIDYEKFSMISIVYNSTKIEGCSLTESDTQL
LLENGITAKGKPLADHMMVKDHYAAFLFLKEIAKQKQKISIELIKKVAAL
VMKHTGGLVNTISGTFDTSWGDFRKAQVYVDRKYFPDFSKVENLLLKLID
NVNQRLDTVFGNEILKLSADIHYNLVNIHPFGDGNGRTSRLMMNYIQMYH
QEPLIKIFTEDRAEYIDALNKTEELEDISIFRDFICSQQIKFYKAEIKKF
EQKDKGFSLMF
>Cag_0033 hypothetical protein
MLGAWRNITWIFMNSRTNILMPLTNARGLGKVTFFTGLDFPIVMKPSLMG
EPVLTAQAFYFLERLDSVVTASFPSNKVPLPHEIDYIIDNYLFEYSKRHP
DKKITSKITEFVFWQEDPDNAYFSYDWKLTECLVLDTLNDTSIIDCNDPK
RTMGEIFDWCLYKPYFEDALEEYKNKLEEAAKYVANVKTQNHSSLGTGEY
QLPIIRVNSKPLTLAQVNMLEVVSADRKIDLTSDTYEVNRGMKSSTNYFL
PQEVITVNNRHNPQLLAYYFSAVRDYSPISQFKNYYNVLEYFFEEAPNHL
GITAKTEAEQIIAVLKLFIDPVELNKKFNEIDKATLALIEKPQITSSGEN
IAGIDFSVTDILAEYGRHIYQIRNACIHSKKTRKGKSTPRFIPSYDEEKI
LEYEMPILQWIAIQCIEKESII
>Cag_0552 TonB-like
MNKPVADESPFLAYGITVALALLATLWLSAILLQNNAPLFVTDEGAHSST
RSGKMVIRTISLMSNSPDTPATTNTSTNSQTDLTTSTQNNTSSIAPTTPS
TNSAAPTEVAAQQPTVSNQQNGGESNQITRSTISNASSEGGESGTTTTAT
TSSTSDNAIQATACDVMPRFVVAKKPIYPEQARRAGMSGKVYVNVLLSEE
GRPIKAMVVKRQPTECTLFDAVALKAVMESRYSPGIQNGKAVKVWLTVPM
RFELK
>Cag_0364 putative signal transduction protein with Nacht domain
MVFTIRRHVVSLFIRIKRMSCHSGFLVLATLLFGYAGIAVAQEPSAIDSV
GVVTKQLKEFGWSVEDITLIGGPFGAVILGVWAFFKFFYPDRKKKTEERN
TVNQYAERYKETLKAQIEKQTLSTTAFESVSVSLADTFVPLRLSEKWRNE
TRFMPESLFSAKNSDAILTPEEVLQLALKKTYKRLLVIGEPGSGKTTLLY
YYALLCLEPNRAKELGVRIPELVFYLPLRELSKTKDRYNLLPENLWAFSE
KHSHSIPIEVISGWLRSTTTLVLLDGLDEISELEERIQVCDWIDNAVTSF
PKAYFIVTSRTTGYRKVDNVELKHFVRVDIMDFTLKEQKDFLHRWFEAAF
LEAETPFADTTEAWKKSQQEKANAKAEAIIAFLTADENKSLQVLAGIPLL
LQMMAMLWKNCDDLPRSRVELYREALNYILDYQNKPKKKKPLLPAEQALA
VLTPVSLWMQEELHKDEVKKEAMQQQMQIALCELRDHPNPPTPLTFCENL
IDRAGLLVEYSDKEYLFRHKSFREYLVAVQLVNNIKQRNKSLDGGVDYLS
ILVAHFGEAWWKEPICFFAAQADADMFNTFLQKLFDSPVSKDFSQEQQDL
LERVIQEGSKEELVALHTKLLDSEMNANQQRYLLQCLEINKHPSTGDVVR
QFVEKKLAKDDDIRRRAGDFTVTGRVDKQGAQYLLIQGGQFIYSVTKKQE
TVPDIYVARYTVTNKLYRRFIAYLDGKEESYARLLPLERYRKNLEKMALG
IKGFRRYLRGNENLATLFRSRYDDDRRFNGDEQPVVGVSWYAARAYCLWL
SLLDGGNTSLYRLPTEIEWEYAAGGKEQRRYPWGNEEPTSTRANYGDNEG
ATTPVGRYPEGKTPEGLYDMAGNVWEWMENSYVFGLVSFIVFIIATLIAS
ISDSRIYSVFSTRSLRGGSWENVSDNLRCSSRSNNGFPRFRLYFYDGFRV
IHSSHSSLKI
>Cag_0991 hypothetical protein
MRLMTSDQLPTTQQPDDYDSPWKEAIEHYFPEFMAFYFPNAYTAIDWSTP
YHFLDQELRTIVPQSAQGKRVVDKLVKVQLLDGKERWLYIHIEVQGRREV
NFPKRVFICNYRIFDQYGVPVASFVILTDTDYNWRPTSYSYEFAGCKHTL
EFPIVKLLDYEPRMEELLASDNAFGLITAAHLLTQKTSDNAFHRLDAKKQ
LILLLYEREWERDRVKELFRVLDWFLELPKELNQQLQTEIQQIEEGQKMK
YISTFERYAMEEGIEKGKELGVLEGMERGKVEGKLEGLEEGLMKGRLEVA
QRLVAGGMSKAEAASFAGVSVDLL
>Cag_1473 conserved hypothetical protein
MASKPSILLFSEDFPPNYGGIGQWAIGVAQSIHRMGYPLHLLTRYMNPEA
EMLQNREPYPVIQVHGKRWSQFRSFYTYSAIKNIYKKGIKPDIVIATTWN
IARGITRILKKNKTKLVIVVHGLEVTRTMPWLKTRWLQQTLNAADAVIAV
SNFTRDRVIERCNINPSKVHFLPNGVDPQRFFPRSNTTHLQEKYNLHNKK
VILTLARLQERKGHDKVIEALPTVLKEIPNAHYLISGALKGTYYKTLQQQ
VSNLRLNEHVTFTGFVDSADLNAFYNVCDVYIMPSRELEKKGDTEGFGIT
FLEANACEKAVIGGRSGGVADAIDDGKTGYLVNPLDSNEIAEKLIYLLSN
PELATQFGKQGRQRILTSYTWDAVTKKLLATIA
>Cag_1289 Histidinol dehydrogenase
MLPLYRFPQEASALQERLVRHVSFDEAAHKAVDEILAKVRQQGDRAVLNY
TEQFQGVRLTSMQVDEEAIEMAYRHADPSLIATLHEAYANIVRFHEHEVE
RSFFYEAEGGVLLGQRVRPMERAMLYVPGGKAAYPSSLLMNAAPAKVAGV
CEIAVTTPCDATGVVNPTILAAAKVAGISSIYKIGGAQAVAAFAYGTESI
PKVDIITGPGNKYVALAKKQVFGHVAIDSIAGPSEVVIIADESAHAEFVA
LDMFAQAEHDPDASAVLITTSESFAQAVQQAVASLLPTMLRHETIASSLL
HNGAMVLVPSLDDACAVSDMLAPEHLELHVVQPWDILPKLKHAGAIFMGS
YSCETIGDYFAGPNHTLPTSGTARFFSPLSVRDFVKHTSIISYSPEQLRS
KGAQIAAFADAEGLQAHAEAVRVRLKTL
>Cag_1449 conserved hypothetical protein
MCINMWGCSNPETERRTSNSMLSGADHPSQEGWNIYMVLSESGREKAIIQ
AGHGAEFEEAQHLDNGVTLQLFDSNGSNRTTITANKAVIFNNQDIEAEGD
VTIRSTSATGQTTRITTEYMKRTANDQMIRSDRLVTITRGEEVLRGNGFE
SDQYLKRFRIFRGSGEAVKQ
>Cag_0562 putative PAS/PAC sensor protein
MIKNTPLSNDLALLRAQAEELLTKGQQQLSDPSISPGDMQRILHELAVHQ
IELEMQQEQLLQARVELEESIECYTELYDFAPLGYVTLSRNGTIQQVNLT
ATKLLGIERSRLVGDYFGRLIVAEDIKAFQAMLEQVFVQQKHTSCEVKLF
NTRQQSTSPLLPSPAHNETLPHTIRIDAVLSTDGEECSVVVSDISMQKQI
ERENQALKEQLNQTLLPELTPNEITNSEIDDNNAADFNHALLDKVIHSRI
RFAILSYLSTAGKASFVTIKRQINTTDGNLSVHLRMLESAAYVSSDREIN
ENKMQTTLYTITEAGNHALQAYKRQLRAFLGL
>Cag_0513 putative DNA-binding protein
MCMDGTKNEIVLYQSNELTSHIEVKVEDDTVWLNRQQIATLFGRDVKTIG
KHINNVFLENELNKSSTVANFATVQNEGGRVVERQVEYYNLDVIISVGYR
VKSKQGTQFRIWANQVLKDYLLKGYVLNQRMNCIENSVENLACKVKEIEL
QITSNAIPNQGVFFDGQVFDAYELASRIIRSAKQSIVLIDNYIDESTLTH
LTKKEKGVRVLLLTKNITKQLALDVQKANEQYGNFELKSFAKSHDRFLII
DTNEVYHIGASLKDLGKKWFAFSQMDKSSVSTILTSIDTML
>Cag_0357 DNA-directed RNA polymerase, beta subunit
MADATTSRIDFSKIKSIINSPDLLKVQLDSFHNFIQDSVPLAKRKDQGLE
RVLRGAFPITDTRGLYLLEYISYAFDKPKYTVEECIERGLTYDVSLKVKL
KLSYKDEADEPDWKETIQQEVYLGRIPYMTERGTFIVNGAERVVVAQLHR
SPGVVFSEAVHPNGKKMFSAKIVPTRGSWIEFQTDINNQIFVYIDQKKNF
LVTALLRAIGFARDEDILALFDLVEEVSLKASKREQLVGQYLASDIVDMQ
TGEVISARTAITEEIFEQILLAGYKSIKVMKSFSNNEKGMDKSVIINTIL
NDSSATEEEALEIVYEELRANEAPDIEAARSFLERTFFNQKKYDLGDVGR
YRIKKKLRREFEELYSFIGEKPELKALSDTIEEKILQTIQTYSDEPIGED
ILVLTHYDIIAVIYYLIKLVNGQAEVDDVDHLANRRVRSVGEQLAAQFVV
GLARMGKNVREKLNSRDSDKIAPSDLINARTVSSVVSSFFATSQLSQFMD
QTNPLAEMTNKRRVSALGPGGLTRERAGFEVRDVHYTHYGRLCPIETPEG
PNIGLISSLSVYAEINDKGFIQTPYRLVDNGQVTDTVVMLSAEDEENKIT
VPVSIPLDENNRIAVETVQARTKGDYPVVAATDVHYMDVSPVQIVSAAAA
LIPFLEHDDGNRALMGANMQRQAVPLLTSDAPVVGTGMEGKVARDSRAVI
VAEGPGEVVDVTADYIQVRYQLDADNNLRLSMLDPDEGLKTYKLIKFKRS
NQDTCISQKPLVRIGDKVEKNTVLADSSSTEYGELALGKNVLVAFMPWRG
YNFEDAIILSERLVYDDVFTSIHIHEFEANVRDTKRGEEQFTRDIYNVSE
EALRNLDENGIVRCGAEVKERDILVGKITPKGESDPTPEEKLLRAIFGDK
SSDVKDASMHVPAGMKGIIIKTKLFSRKKKIGLDIKEKIELLDKQFAAKE
YDLRKRFAKWLKHFLDGKTSTGVYNDKGKVVVPEGTVFEESVLAKFATAQ
FLESVDLSRGVVSSDKTNKNVTRLIKEFRFLLKDIADERENEKYKVNVGD
ELPPGIEELAKVYIAQKRKIQVGDKMAGRHGNKGVVGKILPVEDMPFMAD
GTPVDIVLNPLGVPSRMNIGQLYETSLGWAAKKLGVKFKTPIFNGATYEE
VQRELERAGLPTHGKVSLFDGRTGERFDDEVTVGYIYMLKLSHLVDDKIH
ARSTGPYSLITQQPLGGKAQFGGQRFGEMEVWALEAYGASNILREMLTVK
SDDVVGRNKTYEAIVKGQNLPEPGIPESFNVLVRELQGLGLEIRIDDKVP
>Cag_1820 hypothetical protein
MGEAYLQEIYINKLSEGLSRLIACELFLDRYYSSDSIFEAEAAILQIRKA
MECVAYAAVAPNKSKYAEFRSQADKAIDFTKDFHAGTILKMLSKINPDFY
PKPVSAPLNVSLGKWHFDRRNDKSLSQKQFESFYDRLGKLLHADNPWGNE
KGLRNLLADIPSTIESIRLLLSLHFTVIRTSEFNGVWIVESPNNGQQPRV
IVGQAIGEFAVEE
>Cag_0929 4-diphosphocytidyl-2C-methyl-D-erythritol synthase
MNATAIIAASGIGKRMKLANGGCKQMLEIGGFSVIYHTLKAFEQAPSIHN
IYLATRAESIPLIQELAASANIGKLTAVVEGGKERQDSINNCIKAIEVKR
HQSGVTPDVILVHDGARPFIQPEEIEEIARLSLLFGACVPATRPKDTIKF
IGHDPEFFGETLDRNRLVQVQTPQGFRSELLMQAHQQAEAEGWYSTDDAA
LVERFFPQQLIKIFEMGYHNIKITTPEDILVAEAIYQQLQAHGNATSEAA
SAS
>Cag_0321 Exonuclease VII, large subunit
MEAINAMDALSVTELTAHIKSELESLFPFVRVRGEISNCKQHSSGHIYLT
LKDSGAQLPAVIWKSTASLLSIRPKDGMEVVAEGRLELYPPSGRYQLICR
HVAQAGVGALQQAFAELVQKLAALGYFDENRKKTLPTIPTTIGIITSPTG
AVIEDMSKVLARRFPAARIALYPVKVQGAGAAEEIAQALDFFNHTKKKQW
KPQVIIVARGGGSLEDLQPFNEEIMAHAIYRSAIPVISAVGHETDITIAD
MVADVRAGTPSIAAELVVPDSAQVLRDVEQMVAYAQQILNNKIEGAEREL
HSLCNSYAFNRPILKMQQCYENLDRFEASMMRSVETTYRQQIQRCTASIQ
QLNLLDYHKTLERGYALIKKNGRFVTSAKALQPNDTIELLLHDGVRKASV
KPPDAFA
>Cag_0398 conserved hypothetical protein
MLLMKRYQWLGQLVATTLLWLLLYNNLELGADQLLRLIGLTRAVPFGEAL
HFFVYEVPKVLLLLTGVVFVMGVIHTFISPERTRALLSGRRTGVGNVMAA
TLGIVTPFCSCSAVPLFIGFLQAGVPLGVTFSFLISAPMINEVALALLFG
MFGWQVALLYMGMGLAIAIVAGLIIGKLGMERYLEEWVQQLQNSGMADEN
NEDNAMEVPERLAYGWKHVQEIVGKVWFYIVLGVGLGAGIHGYVPENFMA
SLMGNNVWWSVPIAVLLGVPMYSNAAGILPVIQALLGKGAALGTVMAFMM
SVIALSAPEMIILRKVLKPQLIAVFASVVALGIIIVGYVFNAVL
>Cag_1246 nitrogen regulatory protein P-II (GlnB, GlnK)
MKEIIAVIRMNKVNATKQALIDAGIPAFTATGRVMGRGKGLVHSDILKGA
EAGHPEAIAQLGSSPRLVSKRIVTVVVPDELSKTAIDTIIKTNQTGNHGD
GKIFVTPILETIRVRTGEEGDVALN
>Cag_0819 hypothetical protein
MTAIDIKKTLAIEVEKLSVDALQEVLDFVQFLKIKQWRNREQVSFSQQRI
ADDLHAFDINSVVHLEEEFADYKKEFPYE
>Cag_1355 Initiation factor 2B alpha/beta/delta
MFFSQCNNRFVTGRLFFPIKSTSRMIDAISFNNGTLHYLDQRFLPLQEQY
VDTKDYLQAIEAIKTLAVRGAPLIGVAAAYTIILGINSFKGSKEEFPAYF
KEMVAEVEASRPTAVNLFFAAAQMKKIFAEHYEADTLEVLMQKMREGARK
IHDDEINNCDLMARHGVDQIKIDLAEVLKTRKLNVLTHCNTGTLACCGTG
SALGVIRLAWQEGLIERVITAESRPLLQGLRLTAWELQQDGIPFVSISDS
SSAFLMQRGMIDFGIVGADRIAANGDSANKIGTCAHAISLYHHKLPFYIA
APVSTIDITIPDGTHIPIEERNADELRTIYGTQVAMPNTPVLNYAFDVTP
SYLIRAIITDKKAVVGDYVNGLAALM
>Cag_1790 conserved hypothetical protein
MAILTRSMVLQRQASPYQFRFALIRCIAVLMLCAPISLYAAEEVQQAESA
AWKRDVALRLEKYCITVFRSPGSGKTRENNLRVARFYATRSYQPLWSSTT
MTQELATSLNAAFEHGLTPAEYDVAGELPRWMALTNRSAAAQARYDVLAT
RAFLTLATHLRYGKLDPVRFEPTWNFSSPPNLFHFDELLARTLQRTSPSE
VLNGLLPRDPGYDVLKKELARYREIAKNGGWSAIPAGTLLQEGSRDARVP
LLRQRLAASGDISSSAVADTTTLYNPDVTKAVKRFQQQHGLWSDGVVGAT
TLRAINVSADERIGQLRVNLERCRWLLHDISPTSVIVNIPAYTLHYFEQG
DRRWSTRVIVGQPKRPTPVFRADMQLLILNPRWVVPSTVLAKDVLPAVIK
DPAYLRKKKLRVVDENGTIIDPATIKWSSYSASTLPYRLQQKSGDDGALG
RIKFLMPNRYTIYLHDTPDKALFQKTQRAFSSGCIRVQHPEELARLVLRH
SNRESRPSLESRIKSGATSTIRLPQQIPVYLIYLTALPCNNKAEFREDIY
HRDPQILKALDAK
>Cag_0671 UDP-glucose 4-epimerase
MNYKNKQVLITGGLGFIGSSLARSLVKQGAHVTIVDSLIPQYGGNLFNIS
DIRGKLTINVCDVRDPFAMDYLLQGQDYLFNLAGQTSHMDSMSDPKTDLD
INATAQLSILEACHKTNSDIKIVFASTRQLYGKPDYLPVDEKHPIRPVDV
NGINKLAGEWYHLLYNNVYGIRACALRLTNTYGPGMRVKDARQTFLGIWV
RLLIEGKPIKVFGDGMQLRDFNYVDDCVDALLLAGVNDSANGKVYNLGST
EVVGLKTLAEMMVNFYDGATYELVPFPPERKAIDIGDYYSDFSLITKELG
WEPKVGLQDGLKKTVAYYQVNHAHYWA
>Cag_0419 Chromosome segregation protein SMC
MYLAKIELLGFKSFAQKVRIRFDKGLTAIVGPNGCGKTNVVDAMRWVLGE
QKSSLLRSAKMENIIFNGTRTLKPLSMAEVSLTIENSRNVLPLEYTEVTI
TRRLYRNGESEFFLNQVACRLKDILDLFTDTGMSSDAYSVIELKMIEEII
SNKSEERLKLFEEAAGITRYKQRRKQTFKLLESTSRDLLRVDDVLAEVEK
KVRSLKTQVRKAERLREIKERIRALELALAWRSMEELHDKLEPMQRSISK
EELINHEQSATIARMESASQEMELTLLKQEGALAELQKVVRESNQKIHEL
EKQQLYVEEQQKTLTADVGRMQAARHRKEEKSRELQQQDLELRQRLEPAK
STKERLLSEYQERSREHEEHNATLIEQRNRLKEQRQASSALLHKANQFIL
QKQALESEQQQLRAALLRLEERLTATRHKLQPTAQAAEETTQALQQISTQ
LAHSQAEEERLAEQQQALQQQIEQQKEALLRHRSTRDALNNRIALANALL
DTFEGMPEGIAFLEKERQKNGGLEGHEGHGGNNSTSRCLADMIEVDERYR
AALGSALGDHLNCYLCPTLSDAHHSIALLQKAQKGKLLFLVQELAASASL
PTLPTIEGAEPLANLVSVPPAWQHALRLLLGTTFVVANAEQAEALVKRYP
AYRFVTLAGELSLGGGLLRGGSNKSNEGLRLGKQAERAALLEEVAKEERS
MQHLEQTMQQLRNELAMLPIPFMRRQIEQFQREKSAIEKKAARLEAEEQS
MQQELMRGEKEHATLLQEQQQSAEKLTALMPFLIDATAQSEAAVTAIEQA
QQALQAKEATYHQVGRDAQSCQSRYRDAELALEKLTISLNACTEQQRTTQ
HELRRLHDDLATAERKQRNLQEQQAALQQQLESARHQNAEEQQRFDSAES
AYREAQAQHHTTLATLRDMRRKQEVSRQLLDTLLREKRELEQSLDHLLTE
TRIKYECDLALMEEPPIARTLQPNEAQHELASLQEQRSQFGAVNELALEE
YEEEKKRLDFLLTQKNDLLSAEAQLRSTIEEINKTALEKFETTFSNVRNN
FIAIFRELFDEDDEVDLLLHSSTDPLEGRIEIIARPKGKKPLSIEQLSGG
EKALTALSLLFAIYLVKPSPFCILDEVDAPLDDANVSRFIRLLKKFENNT
QFIIVTHNKKTMASCRLLYGITMEEEGVSKMIPVRLEKAP
>Cag_1321 hypothetical protein
MLIYRHLIWTEIKNSRLMRISLINVRRKMKTTLSLFFLLLAFSGTCRADT
ETLFTIRDLKGEMQVYSGAIGSVLPQIKKNDDAQSDAIYIQREKEGEKEK
FYIAHNNGRHPVILIQGEKSISFLESYGDNNFIWTVCLDKRNPDGSSLAI
VANIKAAGVAGYTTSSIMSGGAYTLLQPRTNK
>Cag_0142 Aspartate kinase region
MKIYKFGGTSLGSAQLMKNAAALIAAALPSEPIIVVVSAIHPVTDLLLEA
ARQASRGDAAYCEKLQTVEALHATLASLLLEGENFSTTQQTIELELAELG
KLMQGVYLLRELSEKSEALLVSFGERLSAKLLSAYLSAQQVAASFVDARA
LIVTDASYSDARVDMAASTERIQHFFKSIAQAVPVVTGFIAAAPDGSVTT
LGRGGTDYTASILGAALGASEIVLWGDVDGFFSADPLRVRDAQVLPAISY
QEAMELSHAGACVLHPLAVQPAMKAGIPLLIKNVTNPTAHGTCITAKGAL
PHRPTLSVTALTSLNNIVLLTMSGSGMAGMPGIASRLFSVLARHRINIIF
ISQASSEQCITLAINPQQAEKAGALLEAEFAREREARQIEPLGIRRNLSM
IAVIGNNMSGHPGVSAQLFETLGKNGINVVAVAQGANEMNISLVVESHDE
EKALNCIHESFFLSQSKVHLFIAGSGTIAKSLLGQIHGHRQNLHQTLNLD
VVVSGIANTRMMAFSESGINLDAWETALEPRQTQQGIEGYFELIREKNLH
NAIFVDCTASAAVAAYYPTFLQSNISVVTANKLGMAGSRELYSAIREAEK
SSNARFLYETNVGAGLPIINTLNDLKNSGDQILSIQGVLSGTLSFIFNEL
RKGGRFSEIVRRAREAGYTEPDPRDDLSGADFARKFLILGRELGYQLNYA
DVVCESLVPPEYGGNMSVDEFLERLSSVDEWYEREAEQAAADGKTLAYAG
EIVDGKASISVKRVPLSSPLATLNGSENMVVFTTSRYLTTPLVVKGPGAG
GEVTAGGVFADILRVASYLV
>Cag_0450 hypothetical protein
MNITIDTSSLIAVIGNEESKEKIIKITEGSSLCSPLSVHWEIGNALSNMF
KKGRILLEQAQLALFAYNEIPIKFIDVSLVKAINLSHSLNIYAYDAYVIQ
CAKQTGTPLLTLDNGLKVAAQKSGINLLELQS
>Cag_1112 conserved hypothetical protein
MTTVSTLYGTLSGVEFRSLFSNGKVDGCLVTEPNTLSTPYGALMPQYEAE
DMGRRSVKPLYFYKDGALKSIALQTQTMLTTPIGTIPAELVSFYKNGTIK
RIFPLDGKLSGFWGWKNEFALAIDITFSSPLGLLTAKVIGFQFYESGALK
SITLWPGETLKLPTPVGTISVRKGVAFYESGALRSCEPARKIEVTTPIGT
ITAYDNEPNGIHGDINSLQFYENGSLEALSTIDQSVEVTCSNNCQELFEP
GVKRNVCGDERRISVPMPIRFSKTWVMFNNSPTASFNLQECSFLVQKAEL
KTEAPSYSCAG
>Cag_1171 carbonic anhydrase family protein
MKKRASALLLALTLAASPLVAAEHGNATHSSHAATDSVEPAKALEMLLEG
NRNFATKGKVHHLGTMATNARRNAIATKQKPFAVVVACSDSRVAPEILFD
KGLGEIFVIRVAGNIVGSHELGSIEYAVEHLGAPLVMVLGHERCGAVTAT
YDAHVAGTKVEGNIGSLVQAIDPAVTTTLTRNASGKKAEVVEQCTLENVR
NVATQIATTSPIIKEAIANGHVQVVKAYYDLDDGKVTVVK
>Cag_0411 transcriptional regulator, AraC family
MQASHKAPIHQNSAFSKKVDEAALLSPIHSLVPSEAGEVVCNTVQLSAGM
SMQWCTLHCSNAVLHEGTLLPIATLTDHWEEPMLCIYHSANGRFSFADAV
EGLDGEEMMFGGKECLITYTQERHALLCMEATKGKVELSTLILSKSLLLE
LLDSNYAMLLLDGLQVSSAPSAHLHALPPHVTKLLEPPQQSPSPNLPNLP
LDSLACLTLQVRVLDYLAALLQMVLHRAEHEKNAATLSERIEHLHSTLMQ
YEGKIPPLQELAKRYNLSVRTLQTAFVARYGSTIGAYITRYRLEQAHRKL
LESDLPMKIISRNLGYSHVNHFITAFSRHFGYSPGSVRKTASPKKGNS
>Cag_0104 Elongator protein 3/MiaB/NifB
MAEHVKKVALVFLPSESGVDGARSLYANKAAQHPLKEWANNAFRGIIKRS
QFAIPPLSLMILSSLEVAGVQQVICDLRFEDFDFEMKWDLVGISVQSGMA
RKAFELADALRAKGIKVALGGAHVTLFPESCQPHADVLVPGEADEVWEEA
LRDLVANRLQPLYRAESFPNLQHARPVSKQALQPERYFTTNLIQTGRGCP
YNCDFCNVHVLNGHTLRQRRITDVVQEVARFQQDDQRIFFFVDDSINADP
AYALELFQSLTPLKIRWFGQATTTLGQQHELLSAFADSGCQALLVGIESI
ENASRTAHAKQQNRANELVSAITTIRQAGISLYGSFIYGLDGDTLETPAA
ILDFVAQTKLDVPGINILRPTPGTRVFERLRNEGRLLFDPNDVTAYRYSF
GQEMLYRPKNIPLDDFIESYSQLTRTLFSWQNAVKRGLNAPRAKSAVLLF
NLFYSHLYTLSRNDLQAQKLS
>Cag_1797 hypothetical protein
MDNTTQIGIQYSGFRFKSFFFQGLFDEQENEAFEFQTSLDIRTGSDRVII
GVMVLVNRKSDAQTYAKAETESLFLVEGVERTKDESCSLIIPQVLLITLV
SLAISTSRGALLVKGAGSFLEKIPMPIVDPKVFVSEIQFLGQS
>Cag_1867 glycosyl transferase
MKGDIEGTLAIVVLNWNGAADTIACLHSIIPTLDASVHLLVVDNGSTDCS
VERIRAAFPHIEVLELPHNLGFAAGNNAGFRRVQALGAEYLLFLNNDTVV
APYFYRPLLNLLQQHPDVGIAVPKIFYHHQPQRLWYAGGEVNLATALIRH
VGLRQFDAPQFNVATSTDYATGCSLAIRVADFEQLGGFDERFTMYAEDVD
LSLRVRAQGKRIAYEPSSMVWHKVSASLGNNSLQKLWMKSKAMVRLCIKH
RAWSGLLLYFVLLPFRLVRSVGGSLLFKIGKMR
>Cag_1606 hypothetical protein
MSTSGAIASLDAFLHRWKAKAGNYTGDYITTPEGLVRNNMDDEQGRGGYY
QEYACTSESQVMMARGYLRAYQATGESRYLQNARTAMQALIRYFFFGKVP
STATAWRSHWIVNAGAPFKSKENGRTTDTIAFGEAYECWPTWRKLRPNEF
ATAGDSMHWFIENFHLFSQLETEDQKGQWLAARDAMFREFKLLLSPKWQA
KYKGAIPFEYTNKGDNLTVRSTSIFRGPYYTGYQNPLPWLYMQDYTAAAN
MLQLLVESQVAYTKSTGVKGPFAPVYHYDASLLGSAKKNVFTWNGPDPNT
FWGGFQYRPFADVAHFWYHCKRSNIQNAAVSNASKVCMSFLSWLDGWLTA
HPNNEYVPTEFREATQPSAPPANGDNDPHMIALALKGALFCKNAGADAAM
VGRVIARLYAMVMKRQSKAGDMAGAFMHDPYSHIFKGFWAGDIMEALALY
IMHHEKG
>Cag_0466 Succinyl-CoA ligase, alpha subunit
MSVLVNKDTRLLVQGITGGEGTFHTSQILEYGTNVVAGVTPGKGGTLYHG
NDKDKFCRPVPVFNTVRDAVEKTEANVSVIFVPAPFAADAIMEAADAGLK
VIICITEGIPVNDMMKAYAFVQQKGAILVGPNCPGVITPGEAKVGIMPGF
IHKKGTIGVVSRSGTLTYEAVHQLTQVGLGQSTCIGIGGDPIIGTRFIDA
VKLFAADDETEGLVMIGEIGGSAEEEAAAYIKENFKKPVVGFIAGRTAPP
GRRMGHAGAIVSGGKGTAVEKIRAMEEAGIHVVESPADIGEAMIKALGR
>Cag_1576 hypothetical protein
MLVSTSIRINQELYEQAKQDAKLEHRSIAGQIEFWARVGRAALDNPDLPV
SFIAESLASLAEPREHATPFIPRSSKQ
>Cag_0086 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_1814 Light-independent protochlorophyllide reductase, B subunit
MRLAFWLYEGTALHGISRVTNSMKGVHTVYHAPQGDDYITATYTMLERTP
EFPGLSISVVRGRDLAQGVSRLPNTLQQVEQHYHPELTVIAPSCSTALLQ
EDLHQLAAHSGVPPEKLMVYALNPFRVSENEAADGLFTELVKRYATAQDK
TAMPSVNILGFTSLGFHLRANLTSLIRILQTLGIAVNVVAPWGGSIGDLA
KLPAAWLNIAPYREIGANAAAYLEEQFAMPALYDIPIGVNPTLRWIELLL
EKINAMAVARGVAPIEMPPLKAFSLDGMSAPSGVPWFARTADMDSFSNKR
AFVFGDATHTVGIVKFLRDELGMQICGAGTYLAQHADWMRKELEGYLPGA
LMVTDRFQDVASVIEDEMPDLVCGTQMERHSCRKLDVPCMVICPPTHIEN
HLLGYYPFFGFAGADVIADRVYVSCKLGLEKHLIDFFGDAGLEYEEDAPA
SNVASGVEPSTPSVSSEVSASSSASPEASAPTPSPDGDMVWTDDAEAMLK
KVPFFVRKKVRKNTENFARGIGEPTITLEVFRKAKESLGG
>Cag_0554 conserved hypothetical protein
MSIWMLHHKHHRHSIRLPEYDYSTCGAYFITICTQNRACWFGEIIDGEMI
LNNVGKMVKDEWLKTEQLRTNVQCGTFVVMPNHLHGIIVINETVGAIHEL
PLKMSQKQRRNMILPKIIGRFKMQSSKQFNQLHNTPGQQFWQRNYYEHII
RNEQDYHRIHDYIVNNPLKWECDSLHP
>Cag_0158 CrtK protein
MSVNIPKLALAQAICFSAAFGGSLFTPQQNSEWYYQILQKPAWNPPDWLF
GPAWALFFLLMGFALYQVLEKGANKPALRPALVAFGIQLVLNFCWSALFF
GLHSPLFALVDIVLLWLAILFTIIKFKPISPLASNLLIPYILWVSFASVL
TFTIWQMN
>Cag_1044 conserved hypothetical protein
MKKIQRMVIKGVAALSLLGCTTLSAFALDKETARAKGLAGEVDNGLMAIP
PGASEEAKELIITINNGRRAEYAKIAATNNLPFDTVGTMMAEKIYERLPA
GTWVQQKGVWVQKKP
>Cag_1197 conserved hypothetical protein
MIIPEKYKVWVEARKKYKLSDAQIQMARELNLNPKKFGKIANHKHESWKV
PLPEFIEELYIEHFGKNKPDVVKSIEQIIESKKS
>Cag_0180 prephenate dehydratase
MTNLLTAYQGEPGAYSEIAALRLGTPVPCASFEEVFAAVESERVDYAVIP
IENSLGGSIHQNYDLLLQHPVIIEAETFVKVEHCLLGLPNASLETAGRVL
SHPQALAQCRNFFATHPHLKAEVAYDTAGSAKMIAEEKDPTKFALASKRA
GELYGLHFFGFNMADEEWNITRFFCITHAAKPKPLRLKEGTATLDNSHYK
TSIAFTLPNEQGSLFKALATFALRNIDLTKIESRPFRQKAFDYLFYVDFL
GHQDEEHVCNALKHLQEFATMLHVLGSYGVVAE
>Cag_1303 hypothetical protein
MKRVLILDTSILCVWLEVPNMKQCGADNDRWDKPRIDAKINAELQNQQTT
LVLPLASIIETGNHIAKAPHSNYERAGALAELIRKSADAQTPWAAFSEQS
TLWSQEQLKALANSWPILAAQKLSLGDVTIKDVAEFYANSGYSVEILTGD
NGLKAYEPIVPIEKPRRR
>Cag_0148 conserved hypothetical protein
MLLLHKLLPLFVLPLGVVLIAIIVGTLVRKQVLLWYAALVLWAFSIPTVA
DGLMHFVEGNRTVALPQTLHQADAIVVLGGMIRRVEGAAEGEWNDAADRF
EAGVTLYRAGKAPLLLFTRGRMPWSPDAVPEGELLVERALQRGVPEAALG
LTAPVANTEDEAADVARLLRERGAERIILVTSAYHMRRSQLLFEHTGLSV
EPYPVDFRVDSYPEPAVLRFAPSAEALYRSETALRELIGWLYYRVKLFFV
K
>Cag_0926 conserved hypothetical protein
MHNSILRERSLTTCNSLITLQDIRTLKALYQLKEQTRILRLPVVNNIIKQ
RVVGQGCIESLKNALYSLQTIYIDDDTGQRRLQLDEAKDIAVDLTYERQE
LQKDIFYLEYGEDKFIEYLSKFSPNFIDYVNKGIEMFKGKHFNAFITDRD
GTTNNYCGRYRSSVQPIYNSVFLSRFAKNRCDYPIIITSAPLKDFGILNV
SINPAHTFVYAGSKGREFIDLDENFHSYPIDEKKQQLIERLNGRLETLLG
KEDFEKFNFIGSALQMKFGQTTVARQDITRSIHEDESVAFLEKVASMVRE
IDPKGENFRIEDTGLDIEIILTIDADAGHQEAKDFDKGDGLAFIAQTINI
KPNGNPVLVCGDTRSDIPMLTKAMEMYDDVWSVFVTRDERLIEDVMNICP
NTLIVPYPDILLTILGLLSL
>Cag_0605 glycosyl transferase
MELSVVIPLMNEADNIEPLFSALNKALRNIEHEIVLVDDGSTDNTVETIQ
RYATATTKLVVLNKNYGQTAAMAAGIEQASGELIATMDGDLQNDPDDIPM
MIRYLYDNNLDVVAGRRAARQDGMLLRKIPSKIANAMIRNLTNVHMHDYG
CTLKVFKRNVAKNLGLYGELHRFIPVLVQLYGAKMAEVNVRHHPRKFGTS
KYGIGRTCRVLSDLLFMLFFQKYSQKPMHLFGSLGFISLAIGMSINAYLL
AIKILGEDIGGRPLLSLGIILTFIGIQLITTGFIAEFIMRTYYESQNKKT
YIIKDVVSKN
>Cag_0436 multi-sensor signal transduction histidine kinase
MRAKASLHIGLVFSALLFALLCGGWWLLHHQFKSALITTTRTELQHNMQL
CRQGLMAQPLTFWQSPQAVSQWLGESARLLNVRITLIESNGTVVADTMMP
SHKLHQAENYHMRPEVKAALKHGFGEHIRFSYATQEQQLYTTLPMVFPDG
RRMVICFSKPLYDVGWYKEHVQGNVPLLFLGMFVMSLGVGMGSGFLLTRP
LRQLAAVARQRLQGDFSAALSIKPKHEFGELAHALNSMSDSVITMRRHEE
WYLAVFSAIREAIIVTDAAGDIIFANPSAARTFRMGQTIFTSRPVKHLPD
PTLQELFNRVHTTRVMVRKEEVALSTARGERIMQINSMPLATMGKTYEGC
VFVLNDITTVRNLEKIRRDFVASVSHELRTPLTVISGYTETLLEGALHDP
AHAVPFLKTILQASQQLTALVNDVLDLSRIESGAIDYQFTSVDIGGVVRK
AVEFLKPSLEKKQIRLDVRITAGLPTIYADARYLDIVIRNLVDNAINAVD
ERNGRIRISAFAMNKEVVRLEVEDNGVGIAKADLDRIFERFYRVDKGRSR
QYGGTGLGLSIVKHIVLAHQGDIVVNSKLNHGSVFSVLLKVAHSK
>Cag_1706 Protein of unknown function UPF0011
MASPSLQPATLYVVATPLGNLEDITLRAIKILQQVEIIACEDTRRASILL
KHLAISGKRLISYHTQNEPRAIAQIVALLEEGNNVALITDAGTPAVSDPG
FALLRAVHERGIVALPIPGASAVTAALSVCPLPLNTFLFGGFLPHKKGRK
TKLAQLSAIGQPFVLYESPYRIHKLLDELEAILPNAQLFIGREMTKLHEE
YLTGSIEEMRQHLTSSKTKGEFVVIVHPTAEKTINPESDTDADY
>Cag_1730 Pyruvate:ferredoxin (flavodoxin) oxidoreductase
MSRTYKTMEGNEALAHISYRTSEVICIYPITPASPMGEYSDAWAAEGVKN
IWGTVPKIDELQSEGGAAAAVHGALQTGALTTTFTAAQGLLLMIPNMYKI
AGELSPCVIHVSARSLAAQALSIFGDHGDVMSVRGTGFALLASSSVQEVM
DMALISAAATLESRVPFLHFFDGFRTSHEISKIEVIDDATIQAMIDDELV
IAHRNRRMSPDAPIIRGTSQNPDVYFQGRETVNAYYNACADITEKVMRKF
GELTGRNYNLYDYYGAPDAERVIVLMGSGVETARETMEYLNSKGEKVGVI
HARMFRPFDVTRLVQAFPATVKSVAVLDRIKEPGSAGEPLYLDVVNAMHE
GAQQGLLASVPSVVGGRYGLSSKEFTPAMVKAVFDNLAAEKPKNHFTVGI
NDDVTNSSLDFDATFSIEPDDVFRALFYGLGSDGTVGANKNSIKIIGENT
NNYAQGYFVYDSKKAGSITTSHLRFGPNQIRSTYLISEAQFIGCHHWVFL
EMVDLVRFLKKGGTLLLNSPYSADDLWDKLPKVVQEHLIRKEAKLYTIDA
YKVAHANGMGQRINTIMQACFFAISGVLPREEAVEKIKESIRQTYGKKGD
DVVNLNIQAVNNTLTNLHQVTIGSVADSTKKLRQPIVGEASEFVCNVLAK
IIAGDGDTIPVSDVPADGTYPTGTSKFEKRNLADAIPVWEPDLCIQCSKC
SLVCPHAAIRVKVYDAEYLASAPCTFKHTDAKGGHWEGMKYTIQIAPEDC
TGCEVCAHVCPGRDKNAPERKALNMHAQAPLREAEVDNWNFFLHIPEFDR
TKLNIKVVKEQQLQQPLFEFSGACAGCGETPYIKMMTQLFGDRLVIGNAT
GCSSIYGGNLPTTPYTTNAQGLGPTWSNSLFEDTAEFALGFRLSIDKQRE
FAGELLKRMASQIGDTLVGEILNAKETSEPEIFEQRQRVSILKEKLRQID
SAEARNLLSVADMLVKKSVWGVGGDGWAYDIGYGGVDHVTAGDKNVNLLV
LDTEVYSNTGGQASKATPKAAVAKFAAAGRAVTKKDLGLISMSYGNAYVA
SVAFGARDEQTLKAFLEAEAYDGPSIIIAYSHCIAHGITMANGLEHQKAA
VDSGHWILYRYNPQRLLEGKNPLILDSKKPKIPVAQFLNMENRFRMLKKS
HPDIADAYFAAIQQEVDHRWSHYEYLAARSFDEIKQESK
>Cag_0645 hypothetical protein
MKLLKQTSIAALLGMMALPSLTASAAPYSSTLYMPNSHGKSVTTPTAWGA
SGNVGFVGLGGTYQSPYTDDADGAAVFGVGLGDSKENLGVQIALISLDIS
EWEEYSSAFHVFKELGDADAIGIGVENVMLTDGGDSEKSFYVVYSRGVQN
DWALNKNSNQTKFHYSIGAGTSRFGDKSPADIADGKGKHGTYVFGNVAYE
IAEEFNVIADWNGVNFNIGASKSFIINNKIPVGVSVGLADLTTNSGDGVR
LVAGAGFGFKL
>Cag_1325 nucleic acid-binding protein, contains PIN domain
MITYLLDTNIVIYTIKRRPIEVLETFNQHATRMAISAITLSELFYGAEKS
SNVSANLSVIEDFCSRLQVLPYGAKASQHYGAIRAILAKSGQQIGMNDLH
IAAHARSEGLILVTNNVKEFVRVPALQVENWVE
>Cag_1456 Guanylate kinase/L-type calcium channel region
MAVEPSGKLIVFSAPSGAGKTTIATMVLQRIANLSFSVSATTRKQREGEQ
DGVNYYFLDKATFEKKIEQGGFIEHEFFFGNYYGTLLDATESVLASGKNL
LLDVDVKGALNVRKLFGERSLLIFIQPPSMEVLIERLQGRGSEDDAALQE
RLERARFEMSFADQFDTIIVNNNLTAAVDDVEAAIVNFIG
>Cag_1454 glutathione S-transferase, fosfomycin resistance protein, putative
MMKLTGINQITLRVNDLRASEAFYCDILGIRLDHRVGVNIAFLRLNSDML
VLVKAETAGSAEARDIRVDHFGFRLSSDAEVDEAARHLENKGVHLITRPA
NRREGRAFFVMDPDGNLVEFYSMHDGGILPATTDVDTTDPTIPAAQERKQ
KAADLAAAKRARRARK
>Cag_1306 Chaperonin Cpn60/TCP-1
MTAKDIFFDTDARAKLKVGVDKLANAVKVTLGPAGRNVLIDKKFGAPTST
KDGVTVAKEIELADAVENMGAQMVREVASKTSDVAGDGTTTATVLAQAIY
REGLKNVTAGARPIDLKRGIDRAVKEVVAELKAISRSISSKKEIAQVGTI
SANNDPEIGELIAEAMEKVGKDGVITVEEAKGMETELKVVEGMQFDRGYL
SPYFVTNSDTMEAELDNPLILIYDKKISNMKELLPILEKSAQSGRPLLII
AEDIEGEALATLVVNKLRGTLKVCAVKAPGFGDRRKAMLEDIAILTGGTV
ISEEKGYKLENATLSYLGQAGSVSLDKDNTTLVEGKGASDAIKARINEIK
GQIEKSTSDYDTEKLQERLAKLSGGVAVINIGASTEVEMKEKKARVEDAL
HATRAAVQEGIVVGGGVALIRAIKGLNNAQADNEDQKIGIEIVRRALEEP
LRQIVANTGTQDGAVVLEKVKEGEGDFGFNARTETYENLVEAGVVDPTKV
TRSALENAASVAGILLTTEAAITDIKDDKMDMPAMPPGGMGGMGGMY
>Cag_1979 bacterioferritin
MGTRGREIVGSHLERVLELLNKAFADEWLAYYQYWVGAKIVNGPMKDAVI
AELLQHAADELRHADMLSTRIIQLGGKPLISPQDWFTWSNCGYDAPENPF
VQRILEQNISGEQCAISTYSAIIEEIGLKDPITYNIAVQIMQDEVEHEED
LQSLFEDLSLFVKK
>Cag_0091 conserved hypothetical protein
MFTQHNFSLNLAVRDYECDLQGIVNNSVYLNYLEHVRHEYLKHVGIDFAT
LTREGIHLVVIRAELDYKASLTSGDSFCVGLTFLRESPLRFSFLQDIYRL
PDNKLILKAKIIGTALNEHGRPFLPLQLEQLFQST
>Cag_1729 conserved hypothetical protein
MQNNTPSYSGKVLVAGATGKTGQWVVKRLQHYGIAVRVFSRDPQKAETIF
GKDVEIIVGKIQDTNDVARAVTGCSAVISALGSNAFSGESSPAEVDRDGI
MRLVDAAVAAGVTHFGLVSSLAVTKWFHPLNLFAGVLTKKWEAEEHLRKH
FSAPNRSYTIVRPGGLKDGEPLQHKLHVDTGDNLWNGFVNRADVAELLVI
SLFTPKAKNKTFEVISEKEELQTSLAHYYDTL
>Cag_1637 hypothetical protein
MDMQKDINYRLALAQGFLQEAEDSYTTQHWRACVSSAILVIENAGLAVLM
LFGVSPMTHKPGMHLKHLVSEGTLHADLAELIAQLLPYLEQYDSHEKMLA
KYGDETTYELPWQLYDAAKATTALDAARNAVRISTTMAERGV
>Cag_1957 sulfite reductase, dissimilatory-type, gamma subunit
MAIEIGGVRYETDENGYLVNLEDWSEDVAKILAEGEEIEMDEVHWDIVNF
LRRYYAEYQIAPAVKVLTKAVAAEKGMDKKEASEFLYGLFPKGPGLQACK
IAGLPKPTGCV
>Cag_1419 CDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase
MKEPHETILTIPNQLTMLRIVLVPVFVLLLLQPDAWLKLLGVVVFTVASL
TDLYDGYHARKYGVTTRLGAFLDPLADKLLITAAFLLYVWMGFLLLWMVL
LVVLRDVLITALRIYAEYKNRPVITSVEAKYKTLAQNLFVYLLMAMVLLR
ESSFFGHELASSINSFLTSDYLDGIMLAVTLFTVYTGISYLVSNWGIYFQ
KPAEGN
>Cag_1671 conserved hypothetical protein
MKPLPVGIQTFSEIIKQDYLYIDKTSLANELIKRYKYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKDLLIYKQWNWEVTYPVIKISFSGGIHSKADLEE
DLIHILNANEKRLELKCENRSKAKYFFAELIQQAFQKYQQSVVILIDEYD
KPILDNIENIAEALIIRDGMRDFYTKIKESDEYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPDFGNICGYTQNDVDTAFAPYFEGVDMEQVKRWYNGY
NFLGDKVYNPFDILLFIKNHKMFKNYWFETGTPKFLIDLIKKNQYFVPEF
NGLKADESLINSFDIEKLALETLLFQTGYLTIKQLLLSDVGVSYELGFPN
KEIQISFNNYILQSITQNSQKESIRHELLAIVKAGDVANLEPIIKRLFAS
IAYNNFTNNYIESYEGFYASVLYAYFASLGFDMIAEDITNKGRIDLTLKT
IDKTYIFEFKVIKQEPLEQVKKMEYYEKYDGERYIIGIVFDPKDRNVSKF
EWERV
>Cag_0651 GDP-L-fucose synthetase
MHSSKIYVAGHRGMVGSAIIRILKEQGYSNIVVRSREELDLTDQAAVRAF
FASELPNEVYLAAAKVGGIHANNTYPAEFIYQNLMMEANVIDAAFRCGVK
KLLFLGSSCIYPRMVPQPMQENALLTGLLEPTNEPYAIAKIAGIKLCESY
NRQYGVSHGVDYRSVMPTNLYGVGDNYHPDNSHVIPALIRRFHEAKVNNS
QAVTIWGTGTPRREFLYVDDMALACVYVMNLDNEVYSKHTEPMLSHINVG
CGYDVTIHELALLIGKLVGFAGNIVFDSSKPNGTPRKLMDSSRLNALGWK
ATVDLEQGLGLAYDDFLRQKLSH
>Cag_0034 conserved hypothetical protein
MLLSLPNWIIHISSSLEWGIGAALLFHYGQLTERRDIRTFALAMLPHWIG
SFCVLAYHISGDTIPLLLDMSELINLVGSTALLWATYKLFQSTGGWKAAH
GIVPSIAPIGYLSAIVIAGKPQSWLGEDIFDTILQLSSIVYLAFLLLLIV
IYRRDKTIFSGLTVAGFWFVLVFISITIFCMYLATQMRGYPTLSHDDLLH
GMAESLLSISNLMIVLGAHRQIKAFKGQRG
>Cag_0494 Peptidase S1C, Do
MKKSEKISSRIKKVLLVLSGVAVGALVFSNMEYSVSFNGTTFSNTPSFAT
ATSNIADAPISSLRNFNEAFVQIAESATPSVVTIFTEKTVNQRVVSPFNF
FGSPFDDFFGRPDGNSAERKNVRRGIGSGVIVTADGYILTNNHVIDGADV
VYVRTADKRRLDAKVIGTDPKTDIAVIKVNQQGLKPIVIGDSDKLRVGEW
VIAIGSPLGENLARTVTQGIVSAKGRANVGLADYEDFIQTDAAINPGNSG
GALVNINGELVGINTAIASRTGGFEGIGFAVPSNMAKSVLTALITTGKVT
RSYLGVSIQDIDDNIAKAMNVKAGEGALVGTVMENSPAARAGMQTGDVIL
EFNGAKVTSSAALRNAIATQTPGSMVYIRVLRDGALKSFAARLEEQTPKT
ASSTTPAKKADINSALGFRAEELTPELAQRLKLKGSSGKVVITAIQQQST
AYRAGLRPGDVILSVNKQAVSSVASYNALVKNLAKGELLLLLIERGGNKS
YIAFTL
>Cag_1742 Bacteriochlorophyll/chlorophyll synthetase
MKPVTWFPPMWAFACGVVSTGESITDNWSVLLRGILLAGPLMCAMSQTMN
DYFDREVDAINEPDRPIPSGKISKSASWLITFGLIVTGFLVAFSIHPYVM
AIAFVGVLMSHAYSGPPIRAKRNGWFGNLIVGLAYEGVAWLTGSFAITQG
VPSGNTIALAIIFSLGAHGIMTLNDFKSIVGDNIRKVASIPVQLGEKKAA
LLASVIMDVAQLAAIAILVAKGAMVTTVIALLLLLAQLPMQKILIDHPRE
KAVWYNAFGTLLYVISMMVCAVGIRP
>Cag_1308 conserved hypothetical protein
MPLSRSYKETIQDRAQHDPEFRVALFDEAINALLEGETNVGKALLRDLVH
TTVGFEGLASELAKSSKSLHRMLAPSGNPSMENLFQIINAVKKHAGISVQ
VASSCIQQNTQQIVA
>Cag_1376 Phospho-2-dehydro-3-deoxyheptonate aldolase, subtype 1
MEQLHDLRVSRIIRLPSPRAIKEQLSMNDAAAATVAAGRHEVERILTLHD
SRLLVIVGPCSIHDIDAAREYADKLARLRHELQNELCIVMRVYFEKPRTT
VGWKGFINDPHLDDTYDIEHGLLHARQLLLDINSMGLPAATEFLDPITPQ
YVADLVSWAAIGARTIESQTHRQMASGLSMPVGFKNSTDGRLGVAIDAIR
SAMHPHSFLGIDQDGCSSVITTTGNQFGHIVLRGGSAPNYDAESIATTER
MLEKAGLVQAVLVDCSHANSGKKHEQQASVWENILQQKEEGNRSIVGVMV
ESNLFCGNQPFPEEREKLRYGVSITDECISWDETERLLRQGAEFLRQNAR
>Cag_1121 hypothetical protein
MANELSHQHIGLFEKIRQTDENGNEFWSARDLSKVLEYSEFRHFLPVIER
GKEACINSGQQIADHFEDILEMITTDKTEHREIEGIKLSRYACYLIVQNA
DHGKEVVALGQTYFANLSNIQLLNKSRISKFIYTIEGQQIILDRDLAMLY
QTDTRTLKQAVKRNIERFPSDFMFELSEQQIETMVSQLVIPSKSYFGGAK
PFAFTEQGVAMLSAVLRTSVAVEISLQIIRAFVEMRKMINNNALILQRID
RIEIKQIETEQKFEQLFQALEQKNSKPQQGVFYKDSIFDAHSFVCDLIRQ
AQTSVILIDNYVDDTILTILSKRKNGVRATIYTSKKDKQLELDIKKYNSQ
YPEIMVIEFKEAHDRFLIIYEKELYHFGASLKDLGKKWFAFSRMDSFVNE
VLAKLKNNGNNE
>Cag_1835 Ribosomal protein L18
MSQVDKAARRQKIKDRSRVSVQGTASKPRLCIYRSLAEMYAQLIDDVNGK
TLVTASTMTKNNKAFEGTKSDASRIVGQQIAEKALAAGITNVVFDRNGFR
YHGRVKALADGAREAGLIF
>Cag_1132 acylneuraminate cytidylyltransferase
MQTVAIIPARGGSKGLKYKNIYPVAGKPLLAWTIEQARASQFVDKVFVST
DSEDIADIAKEYGAEVIERPADIAGDKATSESAILHALNVIQAEHHITVS
AVVFLQATSPLRKQGDIDGAIELFRRENADSLISVTKADDLTIWEQRKSG
EWASVNFDYRNRGMRQDRPAQFIENGSIYMFTPETLHRFNNRIGEKLVAY
EMEFWQTWEIDTLNEIELVEFYMKRKGLM
>Cag_1904 hypothetical protein
MLNVTEIHPEYVTDMNGVKKSVILSLSDFYALLENLDDLAAIAERKDEPT
MSHQQVVEELVLDSSLRSE
>Cag_1238 hypothetical protein
MNRIENIHTLTKRLIVMTAISVLAFSALFGISYFYGERFMLTWAGFLCGI
VGGFVSIQQRVKNVSDEELQLLTSSWFQILLIPIFGGIFALVLYCLFLSE
IISGSLFPLFYIPKPTGVIPDTHFLIDVFTKTYPLTGKDLAKFLFWSFVA
GFSERFVPQIINNVANKASE
>Cag_0792 basic membrane protein A
MAAQRVHRFSPFTILLLLLTQLLVVGCSKQEQTASLPSSASAPMRIGLVF
DVGGRGDKSFNDSAYNGLELAKQQHGVDFVYVEPQGEGADREAALREMAA
NPDINLVVGVGLLFSEDITRIAADFPDKKFICIDYIHQPNVTIPANLQGI
AFEELKGSYLAGALAGLTTKSNTVGFIGGMESGIIKKFETGFIKGVKAVN
PNAQVISGYIGMTGAAFANPAKGKELALGQYGRGADIIYQAAGASGLGVI
EAARETKKLVICTDRDQEPDAPGFVLSSMVKAVDRALLKSVESVLDGTFK
GGEVKVYGLADRYTDYVYNEKNAPLIGEATHKKVEELRNNIISGKIELSE
ALHQ
>Cag_0354 Ribosomal protein L1
MAGKKYREAATKIERFRDYELAEAIEKVKEVTTTKFDATVDVAVKLGVDP
RHADQVVRGTVMLPHGTGKTVSVLVICKETKAEEAKEAGADFVGFEEYIT
KIQEGWTGVDVIIATPDVMGQLGKVAKILGPRGLMPNPKSGTVTMDVAKA
VKEVKAGKIEFRVDKAGNIHAPVGKVSFDSEHLNTNIVAFLKEVVRLKPS
AAKGQYVQGIALSSTMSPSVKVKMDKFIS
>Cag_0535 conserved hypothetical protein
MKPLPVGIQTFSEIINQDYLYIDKTGLASNLINKYKYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKNLLIYNQWNWDVTYPVIKISFSGGIRDKESLRK
NLFYILKDNQKRLNIICEEKEDPNQCFAELIQQAFEKYQKKVVILIDEYD
KPILDNIEKIPEALIIRDGMRDFYTKIKESDEYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPDFGNVCGYTQDDVDTIFAPYLEGVDMAQVKRWYNGY
NFLGDKVYNPFDILLFIKNQRMFKNYWFETGTPRFLIELIKKNNYFIPKL
GKIQVNEFLVNSFNLENLNLETILFQTGYLTIKQLLLSDVGVSYELGFPN
KEVQMSFNDYLLHDITTVSEKEPIRHELLAIIKAGDIANLEPIIKRLFAS
IAYNNFTNNYIESYEGFYASVLYAYFASLGFDIIAEDITNKGRIDLTLKT
FDKTYIFEFKVIAEEPLEQIKRMKYYEKYDGERYIIGIVFDPKERNVSRF
AWERV
>Cag_0071 Beta-phosphoglucomutase hydrolase
MFIAMQRSAFIFDMDGVLTDNMRLHANSWIELFRDFGMEGMDADRYLKET
AGMKGVDVLRYFLGQSISAEEAERLTEFKDFLYRVTSRNKITPLTGLQPF
LEQAQQQAIPMGIGTGASPKNIDYVLELLELEQTFQALVDPSQVSNGKPH
PDIFLRVASLLGAEPQHCIVFEDALPGIEAARRAGMQCVAITTTNNADEF
RHFDNVLAIVNHFQELTPQGLLMLLTEKQNTLVA
>Cag_0001 chromosomal replication initiator protein, DnaA
MHDHSPILVTDPHSLKGQKQSSMEQQVWDTCLAVIKESINPLAFKTWFLP
IRPLGFVGGELTIEVPSQFFYEWIEENYSLLLKQTLRDVIGSEARLMYSI
VMDKSQGQPVTIELPQQTTSPFTYEQAPLKVDRIEEQRHESYERNVSRFE
SHLNTKYIFDTLIRGDCNSLAFAAAKAVSQNPGQNAFNPLVIYGGVGLGK
THMMQAVGNSVRENRLTDRVLYVSSEKFAIDFVNAIQNGKIQEFSSFYRS
IDVLIIDDIQFFSGKEKTQEEIFHIFNTLHQSNKQIILSADRPIKDIKGI
EDRLISRFNWGLSADIQPPDYETRKAIILSKLQHNGVTLDDAVIEFIATN
VTENVRELEGCIVKLLAAQSLDNRDIDLAFTKSTLKDIIRHTTKQLTLDT
IEKGVSSYFSITSNDLKGKSKKKEIAVGRQIAMYLAKMLTDSSLKTIGLH
FGGRDHSTVIHAVSTISKRVEQISEERKRIEEIKKRIEILSM
>Cag_0453 ATPase
MRGMGQMGGGGGGGFRTQQDDTLSKGKKKSLDGYIISRLLQYVKPYRNLV
YIAVALTIAGSFLGPLRPYLTKIAIDDHISNGDLHGLGVISALLAGTIVL
DGVKQYITTWMTQIIGQKAVLDIRMDIFSHLQKLPTRFYDRNPVGRLITR
TTGDVEALNEMLSSGIITILGDLLQIFFIVGLMVWIDWKLTLCVLAILPL
MIYATIFFKQRVRQAFQDVQTHIARLNTFFQEHLTGMNIVQLFNREEREA
ARYAAINADHRDANIRTVFYFSIYFPLIEMLSSLAAGIVLWYSAVRIMQA
DISLGVVVSFVQYIWLFFRPLQHLSDRFNVIQSAIASSDRIIRLLDEKEA
TDPTAIEKDLDAFRNRIEFQNVWFAYDDENWILKDLSFTINHGEKIAIVG
ATGSGKTTMINILSRLYPYAKGSVTIDGTPLQDISERSIRNLVGVVMQDV
FLFSGTIRENLAFGNPNVTDEELRKAARIVGADRFIEQLPGGYNYKLLEN
GTGLSSGQKQLIAFVRALLYNPSILVLDEATSSVDTETEQLIDAATNRLM
QERTSIIIAHRLSTVQKADRIIVLHKGVIREIGSHQELLAQRGLYYKLYL
LQHPEQTLKTEAKG
>Cag_2002 putative ferric uptake regulator, FUR family
MRRKFSSPILERYVAFCLEHGLKITPQRSAIYRVMMESTDHPCTEVVFRR
VLAYYPAMSFDTVNRSLLTFAELGLIDIVESVSGVRRYDPNVERHHHLHC
IKCGTIVDFCHPDFDAIPAPPAITENFTILGKRVIFSGICSLCAANAATQ
ATEEIAMES
>Cag_1735 SpoU rRNA methylase family protein
MNLPLPPLSKAALRRYAALHHKKHRDSDKLFLAEGLRTVRELCQQLPHED
FLVALLLRPHAEAEALTFAQPYAHKLFSITAKECAQLADTTTPSGIFAIF
RQPTFQPTIVNPQRRSLVVALDDVQDPGNVGTIMRTAAWFGVDALLCSQG
TADLYNPKVVRSCAGSLFALPHYRCNLMVHELSRWQQQGYSIVCSSLEGR
DICGGEAFGDKTVIVIGNEANGVSQAVQASADMLVRIPHAGSKPAVESLN
ASIAAAILIERCVLR
>Cag_1776 hypothetical protein
MTEVEEIVHRVQKLSKDDFAHFKQLVQDIDNDYWDQQIATDFRQGKFEQL
IKKARQEFAEGKARAL
>Cag_1806 Putative oxygen-independent coproporphyrinogen III oxidase
MSLSLYLHIPFCRERCPYCDFYLITGTGQVEPFFAALARETAFRADELQG
ATISAIHIGGGTPSLVPVAMLSRWLEQLACYATFAPTMELALEANPEDIT
PALLDELQSLGVNRLSIGVQSFSIQKLTVLGRKHSATDALRVTEMALERF
ASVSLDLMCGLPHETLAVWEGDLSTALALQPHHLSVYMLSIEPKTRFHWL
VARGELPAPMEAEQALFYETAIHTIKLQGYQHYEVSNFCLPNFHSRYNLA
SWERKPYLGFGAAAHSFIVQQNREIRQANIESLSRYLAHPENAVAFREEL
GCNERFTEELFLTLRLNRGLSRSFFSRTASRDVVETLFATFQEQGWMYED
NERFYLTERGFLFADYIAEELLAK
>Cag_1066 hypothetical protein
MNYIAVSEATYLDGYRISLTFNTGESGEVDLGDLIHRYAIAEPLRNPQNF
ARFYLDSWPTLAWECGFDVAPESLYARATGKLFPLPQPSNSPL
>Cag_0082 Dethiobiotin synthase
MKGKVFAVTGIDTGIGKSVVTGLLARALQEQGRSVITQKIVQTGCTNEIA
DDIVEHRRLMGIPLQEVDREGITAPYIFPFPASPHLAASLAGATIDPMQL
RRATFFLQKRYEIVLLEGVGGVLVPLTPDLLFADYIAQAGYAVLLVTSPR
LGSINHTLLSLEACQWRGITIRALFYNHFGYEDILIANDTRQLLTSYLHT
CGIDAPIIDVENGALSREGVELLHQLF
>Cag_0549 MotA/TolQ/ExbB proton channel family protein
MKQSLITALLIALTYAVSLGFYAWMGTMPHGTLWYAVWKGGWMVSVLLTL
ILLVIAYSVERLLAFNKAIGNGNLPNLVQSVQQDVQAGAIDRALERCNQH
QSLHATVLGAVVERYKWLNTQQISEHEKRRQELEKAATDATTIAMPTLER
NLVVLSTIASISTMVGLLGTTLGMIRAFAAMATNGAPDAAQLSLGISEAL
FNTALGICGGIFGIVAFNLLSNRVDRIGYEMDEAALKLIQTFATPSR
>Cag_0979 ATPase
MIITMLDAQQLSKSYTLAGKRTINILQGVTLRVHEGEMVTIVGASGSGKT
TLLNLLGTLDTPDSGTIVFNNQPLFQNNRYTLSRKALAAFRNRQIGFVFQ
FHHLLSDFTALENVAMAEFIATGNLQAAKVRAAELLTNFGLSERLDHLPS
ELSGGEQQRVAIARALMNNPKLVLADEPSGNLDQRNSQLLYELMARISKE
QRTAFIIVTHNYDYAAMADRCFCMENGLLGEYKSYEKSTAV
>Cag_0867 hypothetical protein
MIESNLDFYRPIVEQIVERWAVGKPPLPTTGKPSGYYRLTNYLLNYLVEH
DAFPTGIHQMPEGLDAQQQVEPSFPVDFNVVIGETRLPKISVNKGEKL
>Cag_0218 conserved hypothetical protein
MATVSVYVSGQTEQNDVIEFFQKGMIGADEHPIAFFEGVFYESHQERVGN
IAFQDYLVYTNKAIYLWARGASKDYLDRFNLGAVSINSRNKDRDFATLNL
KVRREDKEPIYVIFDMVELREAELITRLHTLVETIIEDRLGLNYRQQIPD
EIAVYILHSAKSLCPPQSITFSAGEPNAPQQDSQIGYGQDLLEQYKASLG
YPSPEPSPTQAQSRATAAAPEGFSPADALKGLEHLLPTDPAAIKKIAESL
KEVIGDAPFKLRDQLKNDLQHVPGMLSAVTELLTSIADNPQAERFVLNLV
KTAVKNDGVLGSVSKLMKLSSTFGGDNNSKRRSSSSQQASGRSEQGSASS
KRRNESFDDDMPKRKSIHIKQEDDEVILPDCFSGLDLPFEESAPPTPAKK
REAEEISGTKISPRKPIVIKADEDAIPSIVKTMSASDTPLPNANNSNDKL
>Cag_1840 Ribosomal protein L24
MCNILIPSNMKTGIKKVKLHVKKNDTVTVISGNDKGKMGKVLKVFPVASR
VIVEGVNIRKRHMRPLQGQTQGRIIEREFPIHSSNVKKS
>Cag_1435 SAICAR synthetase
MNKKEQLYEGKAKKVFATDNPDLVIQEFKDDATAFNNKKKGTIADKGVVN
NAISCKLFTLLEQHGIRTHLVEKLSEREMLCRHLDIIKAEVVVRNIAAGS
LVRRYGFAEGTVLECPIVELYLKDDDLDDPLMIEAHAVALGVGTFEELAH
LKQQAEVINTVLRSFFAERKLKLVDFKLEFGRHNGEILLGDEISPDTCRF
WDLATNEKLDKDRFRFDLGSVEEAYSEVERRVLEL
>Cag_0502 conserved hypothetical protein
MAPKQAKSEESAMPLSTINYIMIACGVVVIAATYWGMALERSVDGFFSLV
VSPILLIGSYLWIIVGILYRGKSSSNAKKR
>Cag_0623 Riboflavin biosynthesis protein RibD
MPALEHTFYMQRALELALRGAGRVSPNPMVGALLVQEGEIIGEGWHERYG
EAHAEVNAIAAVTNEAWLREATLYVTLEPCSHFGKTPPCSDLIIAKQIPR
VVVGCRDPFPAVAGRGIAKLRAAGIEVIEGVLEAECLQSNEAFIKSHTVG
LPFVTLKLAQTLDGKLATVTGASRWITGEEARAEVHRLRSVYDAVLVGGA
TALADNSQLTVRQANGRNPLRVVLDRSLQLPLESLIFNHEAPTLLFTSLS
QQHSPKVEALQKLGVSVHAVSESAEGLQLREVLEELHHRHILSVLVESGS
RLGAALLQAGFVDKLLIFIAPKLFGGDGLSAFAPLGVTVPDEAIALRFEL
PRFFGKDLLLEAYINS
>Cag_0794 conserved hypothetical protein
MDAFWLSLVMIFLAELGDKTQLVALTLATCYNTSVVLWGIFWATLAVHVF
SAAIGWFIGDQLPTEWILFVAGVAFIAFGFWTLRGDSLDEEEESCKRGIN
PFWLVFTTFFMAELGDKTMLSTITIASTHPFLPVWLGSTVGMVLSDGLAI
VLGKMVGKQLPETLIKRGAAAIFFLFGAYSMYDGGATFSPLIWAIAGMVV
LLFGYFFLRKPKA
>Cag_0658 transposase
MKDTVLFQQALCLPAPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HNLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEYDSRIWRIIHHYLDEVLEQQ
DLSSVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPEAHITFDKFHIVQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGGKF
EFALPALTHSK
>Cag_1529 conserved hypothetical protein
MEKFKGLYRIESARMQGWNYGWAGLYFITICTKDRVCWFGEMVNHKLSLS
DIGTIVEMEWRNTFEMRPDMNLYMGEFVIMPNHFHAIIGIGTNRYNIQYD
DHRRDAMHCVSTHHCVSNTPPKTTISSQSNNLASIVRGFKASVTKQARML
HVDFAWQSRYYDHIIRDEKSFHAISTYIINNPAQWAKDELYL
>Cag_1059 conserved hypothetical protein
MDNKKRVFVVGSTGYIGKFVVRELVARGYHVVSFARERSGVGAATTAEQL
RQDLKGSEVRFGDVGNMQSLRANGIRGEHFDVVVSCLTSRNGGIQDSWNI
DYQATRNALDAAKAAGATQFVLLSAICVQKPMLEFQRAKLKFERELQESG
LTWSIVRPTAFFKSIAGQVEAVKNGKPFVMFGNGRLTACKPISEADLARY
IVNCIDDSSMQNRILPIGGPGPAITPLDQGMMLFELLGREPKFKKMPIQM
FDVIIPVLALLGKIFPQFKEKAEFARIGKYYCSESMLVLDPKTGNYNAAI
TPSFGSDTLREFYGRVLKDGLKGQELGEHAMF
>Cag_1118 Tyrosine recombinase XerD
MSTLSSSYQTTLNSFLNYLIVERNFSANTRSSYHNDLHRYLLFVQEQATP
IAEITSKVIDRFLAELVALGLETTSMARNISTIRSFHKFLHNERLSSNNP
AERLHLPKKAHYLPAVLNLSETLALLEAPSIMQPAPTYALRDRAMLELLY
ATGVRATELISIQQEHLYSDAGFIRIFGKGSKERLVPIGASATLWVQRYQ
KELRVQLVKAHSNDFLFLNSRGGKLSRMSLFEMVKTYSVVAGITKSISPH
TLRHTFATHLIEGGADLRAVQEMLGHSSIVTTQIYTHLDRSFIKEVHKTF
HPRG
>Cag_0674 hypothetical protein
MNYMVNTRSFYYYSLLYALIAHFGFLGYGIDVYDAYSIAYGWGIGTFEPI
GWYLSTFRLYSANDIYLGVFFVSLIVSSGLIYASFYFLGNENKFSKIEII
FIIFFMHFVHVTVFSSVNALRQGLAMSFFMFGIVKLLSGSFRKTLFLFLL
SILCHNAVLFIIVPLLTLINVNKKLLQLGIGFIFIVLTPFALNLGVAEKT
QVSTTLNYSIIYFILFFAYSFFYWQSFSNSTKFKFSEVRRRYQFGFFVVM
LMMLFLLNRESHLQRMVMFIMIPLVYEIFVFLPAINPLRNVIVFSFLLLW
VVITLTSSAFSSFREYSTMPL
>Cag_0570 RNA binding S1
MTTTVTPAAEKKVIAKPGRKKFKFFANYEPAELARMEQLYTSTLNEITEE
EIVRGRVVSISNKDVTIDVGFKSEGIVSLLEFRDDDEVQVGDDVEVYLES
IEDKMGQLILSKKKADVLRIWDKIYDSIENDTIINGKIINRVKGGMTVSL
SGVEAFLPGSQIDVKPVRDFDALVGQTMEFRVVKINPVTQNIVVSHKVIL
EEAYAARREEMLANIKVGMVLEGTVKNITDFGIFVDLGGLDGLVHITDIT
WGRINHPSEVVELDQPIKVVVVGFDENSKRVSLGMKQLESHPWENIEIKY
PVGSKAQGRVVSITDYGAFVEIEKGIEGLVHISEMSWTQHIKHPGQFVTL
GQEVECVILNIDKEHTKLSLSMKRVSEDPWIALSEKYIENSIHKGTISNI
TDFGVFVELESGVDGLVHISDLSWTKKIRHPSELVKKNQELEVKVLKFDV
HARRIALGHKQINPDPWGEFEQKYAVGAETPGQISQIIEKGVIVILPGDV
DGFVPVSHLLQGGVKDIHSSFKIGDELPLRVIEFDKENKRIILSALEYFK
DKSKEEIEAYLAAHPNEKQEIEAATAELEPPVKSHDKKGGDK
>Cag_1557 conserved hypothetical protein
MQYSEAKSGRVFVLRLEDGDVVHECLEQFAHKHGIERASFIAVGGADKGS
VLVVGPEDGRTSPVVAMTHELYDVHEICGTGTIFPDDSGRPMVHAHFACG
REENTVTGCIRSGVKVWHVMEIVLTELLDNHASRKTDAATGFKLLAME
>Cag_0187 Phosphoglycerate mutase 1
MIKLVLLRHGESQWNRENRFTGWHDIDLTDQGRIEASNAGKLLRAEGFTF
DIAYTSVLKRAIRTLWHVLDEMDLMWLPVTKSWRLNERHYGALQGLNKAE
TAQKYGEEQVLVWRRSYDTPPPALEKSDARYPGSQARYASLSEAEVPLTE
CLKDTVARFLPLWHETIAPEIRKGRNVIIAAHGNSIRALVKYLDNVSEDD
IVGINIPTGIPLVYELDDDLKPIRSYYLGDQDALKKAQEAVAKQGKA
>Cag_1857 conserved hypothetical protein
MKKQILLFSAIFSGYTFLLLLLYFPLVFQSQVLTAPDSLIPQASSMALDK
LQAESGSYPLWQPWIFSGMPTVEAFSYLSGLYYPNLLFNLFHTDGVLLQL
LHLAFAGAGTFLLLRDLRLSLLASIAGGLIFLCNPFFSAMLVHGHGSQLM
TTAYMPWMLWAAMRFMDRGGVAEAGIFALIAGLQLQRAHVQMAYYSWLMM
LLLVVVLFATRRWVVPQAVQRGGLFVIASVTAIAMAAAIYLPASHYAEAS
VRGAAVGGGGAAWEYATLWSLHPLEAITFLFPGFFGFGGVTYWGFMPFTD
FPHYAGLVVLLLALMGLIMRRREPMTWLFAGVGFLALLLAFGRFFSPIFD
LFYSFAPLFSRFRVPSMALIMLYFALAALAAIGLHELLERKPQRLLKVLR
LSSIVVALLLLIFLALEEVAEHAARSLFPLPQVDSFELVSAINSIRWEQL
SSSVIVTLTLLLLVAGVLWLLLSGKISSKYSASLLVLLAVGDLLWVTVQV
IYPSAHSLRTPLFADKQQVAPAFQHDDVTRFLASQPKPFRIYPAGNFFTE
NKFALFGIESVGGYHPAKLKSYDDLLQVSDNLASIALLRMLNVHYIVSPA
PIEHPTLTLATSGTLQRANGSAQAFVYRLQEPAPRAWFVSRVVPFSNKQE
LYSHLLDDTASLSVAYVEAQQWQGAQRFSEGTIQSVTTQPESIKLNVNAP
NSSFLVLSEIYYPNGWQVMLDGKATSMLRVNGVLRGVNVPAGNHAIHFSY
NRHLFEQSQWIALAGFIIALLMIAGGLLWKHLLLSGEKRVVRGFHTIR
>Cag_0845 transcriptional regulator, XRE family
MKCPSCGFAEMSDKVLEETLSYGGKAMTLHAMHGQFCSNCGEGIWDEESY
RRYTEAQEGLVRAVKGDPAADIRRIRKKLNITQANLALAFGVGKVAFSRY
ERGETRPPAPLVKLLKLIDRHPDLLDEMQKM
>Cag_1919 hypothetical protein
MADFITLSNPSYSYNASKGKTTVQFDICFASNEMAGSKITGAKIDLQYNT
SLVTAYQITNPTFSFEGDFGTETASVWVAAFDQTANLTGSGATGQIAMLA
TSNDANPIIVNGKVLTVTLSVTGEVNADTFAITLQKTADGGNNSISNATT
EYFLDGGTYVAPPTSLADVTVDLHVNSDTGTSNTDNLTNDDTPTVTVNLT
GKSLSEGQTLQIIDTSNSNAVVGTYTITSTDATNGITTKDVTLSTLTSGA
HALKAQLNAGSTAGTPSATATTVTIDTTAPTSLADVPVDLKSSSDSGLST
DNITNATTPVITVNLTGKTLVAGDIVQVIDTSNGNAVVGSYTVTTGGTGS
SLDITLTTPLSLGAHALKAQLVDVAGNVGTASTNALTVTVDTTAPTAPTL
ALATDSGSSNSDGITNVGTVNVTGIETNATWQYSTNGGTNWSNGTGTSFT
LAAATYAVDAIRVRQTDVAGNVSGEGKIATAVTIDSSAPVAPTLAFTDAG
TSTTDGHTSNNTITVTGIESNATWQYSTNGGTNWSNGTGTSFTIVDGTYN
ANTIKVKQTDVAGNVSGEGSLAPAITVDTVRPTVTVTPVTTALSAQGTTT
ITMTMSEAVTGFAADDIKTSSKYSISNFSATSSTVYTATYTANEATTDVA
KELKFETNWTDAAGNQPKFGPTVDITLNDAALKIGETATVTFTFSEVPTG
FDSSDISVTNANGQLSGLAVKSGSNGLVYEATFTPTANVTSATNKITVGK
DWTNAEGVAPTNDTTDSPNYAVDTFRPTATIVVADIALKAGETSLVTITF
SEAVSGFDNADLTIQNGTLTNVASSDGGITWSATLTPTADISDTTNVITL
ANTGVNDVAGNAGTGTTDSNNYAIDTARPTATIVVADTALKAGETSAVTI
TFSEAVTGFDNSDLTIANGTLTPVASSDGGITWTATLTPTTNLEDATNVI
TVNKAGVTDAAGNAGVSTTDSNNYTIDTQAPAAPTLSITDNGQSTSDNLT
NNGTVTVSGLETGATWQYSTNGGTNWTTGSGTSFTLGAGTYAANNIQVKQ
TDAAGNVGIVGQITSQVDVDKVAPTLKSVVVNGTSVVITYNEALDATNKP
ATTDFTVSNNTVNNVAVDSTAKTVTLTLGTTVVSGADVTVSYADPTTVDD
SNAIQDVAGNDAAKFTTTISGTKTTTKVEVPQSSTAVNNIPIGTNAAGNP
VIQLDIPANVDVIAKEVTDTNATTLTDKLNDSLDALTTASTGQIDSIAVQ
TGIDNYVATLSTPDQANVVVRTLELKSANATTGAELVVTGNSAIGSNEAL
VIDTRGLQPGSVLNLENVEFAIIIGDNVTIRGGDGANIVYAGAGRQDIKL
GDESDTLHGGTGDDIVASEGGDDWLYGDDGNDTVSGGADNDHLFGGTGDD
SLDGGTGNDTLDGGDGNDMLNGGTGNDVFTGGVGTDTIQFAGLFSNYTIT
YNPGAHQYVLTDTTGATKTVSSTDFELFSFTDGVKSDDDVYAVAANPYGQ
PEHIIANDPAFVGVAGLGLIAALLFL
>Cag_0696 conserved hypothetical protein
MKASELDKKFDDNQEEVLDYFDISKIKMLNEEPTRVYIDFPSWMVDSLDR
EAKHIGVSRQAVIKMWLAEKLQSLNSQAEVI
>Cag_1498 ATPase
MSVAIELKHITKKFGSFTANHNISLEIAEGSIHAFVGENGAGKSTLTKLL
YGMHQPTSGDIVLHGTSTHFSSPRDAIRAGIGMVHQHFMLIEELTVTENI
MLGYEQASLFGSLPLKKAKERIAADALASGLTINPDARISSLSVGEQQRV
EILKLLFRNASIMLFDEPTAVLTPAETEQLFTTLRALRAQSKTIVLITHK
LDEVLSVADTVSVMRQGEIVATKSVAGVTREELARLMVGRDVLLRVENSP
HATNAPVLEINNLTYRALNGNEKLRQLTLTLHAGEVYGIAGVEGNGQSEL
LQALWGLVPEGVKVSGNITMQGSSLLGKSAAAIAALGVSHAPENRLHHAI
IGDYSVSDNLIFGRHREATFHHGMGFNRATVERFSNAMIADFDIRCTNAL
RQPISALSGGNQQKVVIARELTRPNLKLLILAQPTRGVDIGAIETIHKKI
LAARTSGIAILLISSELEEIIALSTRIGCIYKGTIRHQFSAEEVERNRQH
GRAFSERIGTYIT
>Cag_1705 Oligopeptide/dipeptide ABC transporter, ATP-binding protein-like
MDAQQELFSVSNLQVRVAVKQKGTFGSRFVEVPIVDGVNFTIHKGETLGL
VGESGCGKTTLGRALVRIGRTVVSGTIRFEGRDISALSNKEFRPLRKEMQ
LMFQDPFASLNPRLMIEQMLLEVMKVHHIADGDEARKRVRELLEVVGLNR
EFLYRFPHELSGGQRQRIGIARALAVNPKFIVCDEPVSALDVSIQSQIIN
LLKDLQRSMGIAYLFIAHDLSVVEYISHRVAVMYLGRIVELAEATELYKN
PLHPYTRALLSAIPSTEVGAKKERILLSGDLPSITTVPQGCVFHPRCPQA
KAECRHHIPQLQEHSSGHQVACLLYE
>Cag_1162 SmpB protein
MSKQPQQRYAEAIVNRKARHEFEIIETFVAGIQLAGSEVKSVRLGQASLN
ESFAIILRNEVWLENMQITPYKHNRMEVLEAKRSRKLLLHKKEIAKLQAK
VSEKGLTLVPLKAFFTPHGLLKIELAIARGKKLYDKRETLKNRENQRHLD
QLRKQYS
>Cag_0377 conserved hypothetical protein
MMNKLFNLYCDESTHLQNDGMPYMMIAYIRSPYNEIEQHKEYLKFLKAKH
KFKGETKWTSVSAGQYLYYADLIDYFFSTDLCFRSIIVDKSQINENCPEF
SYDDFYFKMYYQLIHHKVDLGYHYNIYIDIKDTRSNKKLAKLHEILKLNT
SIKNCQFMRSHESSLMQLTDLIMGAINYKLRGYNRVIAKNKIIEKIEQHS
KVPITRSTPKHADKFNLFFIDLK
>Cag_1805 Dihydrodipicolinate reductase
MKFTLIGNGRMGQQVAHIINNSSAHSIAAILDVDATITPEVFHGSDAIID
FTVRHAFLENLPAMVASGVPIVVGTTGWDDVQADVRQQIAAANTSLLYSA
NFSLGVNIFMRIVREAARMIEPFSHFDIALTEQHHTGKADFPSGTALRTA
QMVLAANSRKTSIVRELADGNKLTPNELHVASLRLGTIFGKHSAFIDSET
DDIEISHTAKSRSGFAAGAVEAAAWLALRHTTAPGFYTMDDFLDEKLGK
>Cag_1555 Nitrogenase-associated protein
MATIIFYEKPSCRNNTRQKALLAEAGHVVEARNLLEQEWTTEELEKFFGS
KPIADCFNTSAPAITSGQLQPESLSRNEALALMVKEPILIKRPLMTIGEH
YLQGFSMDRLHELIGLTAESEKTDFSQCPHDATSVCR
>Cag_0151 conserved hypothetical protein
MRLLPAEREIIRTLATRIFGDGTRVLLFGSRVDDSVKGGDIDLYVQSPDA
EQALTKKREFVVALKLALGDQKIDVVISSNPSRFIEQEALKHGVAL
>Cag_2004 sulfide dehydrogenase, flavoprotein subunit
MSNGLSRRDFNKLLLSGVAGSTIGLFGNSGTLFGATSKRVVVIGGGFGGA
SAAKYLRKLDPTIQVTLVEPKSVYHTCPFSNWVLSGLKNMEDIAHFYDVL
RNRYKVNVIADTAVSIDADKSSVTLQTGKTLYFDRLIVAPGIDFKYDSVQ
GYSENVANSVMPHAWQAGPQTILLHKQLQAMPNGGKVFISAPANPFRCPP
GPYERASLIARYLKEQKPLSKVIIFDAKESFSKQGLFKQAWERLYPGMIE
WRASTMGGKVVSVDAATMTVTTEFGAEKGDVINIIPAQKAGKIAVDAGLT
DASGWCPINPISFESTLHPGIHVIGDAAIAGAMPKSGFAASSQGKVAAAA
IVRLFQGKVPAPPSLVNTCYSLIDKNYAISVAGVYKLAMTGIVEIKGSGG
LTPMNADADQLEQEAMFAQGWYDNISQDVWG
>Cag_1151 Thioredoxin reductase
MERTVRDLIIMGTGPAGYTAAIYTGRANIRPLVIEGIQPGGQLMITSEIE
NFPGFPEGIRGPELMGRMREQAAKFGAEFVAGSVTEVDLSKRPFSVTLED
GQEFLTHALIVATGANARWLGIESEDRYRGKGVSACATCDGFFFRNCHVM
VVGGGDTAMEEALYLTKFASKVTLVHRRGEFRSSKIMSLRVTKHPKIEML
LNQVVEEVLGDGFKVTGVRLKNVATGEVSDHSCDGLFLAIGHAPNTVLFQ
DQLELDDYGYIQTKKSSTETNVPGVFACGDVQDYTYRQAVTAVGTGCMAA
VDAERYLETIR
>Cag_0167 es1 family protein
MTTQLSPRIGVLLAGCGYLDGSEIHEAVLTLLAISKKGAQAICLAPDMVQ
HHVVNHLTGQEVIGESRNVLVEAARIARGAIHNLSDIASLHLDAFIVPGG
YGAAKNLSSFAFDGTPCTIHPDVATAIQLFYKAGKPMGFICISPVLAAKV
LGSEKIEVTIGNDASTAASIEAMGARHINCVVTKAHVSKPHNIVSTPAYM
LEASLADIATGIEQLVGNVVELVKK
>Cag_1330 Apolipoprotein N-acyltransferase
MLQLSNYNQVRRSHYVAALLSGLLLGVAFPSYPFIRLELLAWVALVPLLL
SLRGVERFGALFRRVYFSMLVYGAIALWWVSLATLPGGMLTVFTDALFRS
IPFFLFYLLKKRVGYHFALSAFPFLWIGWEWLYMQQELSLGWLTFGNSQA
LLTPMVQYAEITGVWGVSFWLVWFNVLTVFAVIGKQRNRLAIVASMALMV
ALPLLHSWWLFANAELNSAAKRELRVTLMQPNTDPHHDYDRAVMVPHYLR
ISSQAVRAEHPALVIWPETAILFPLLEPVYHPEFQMLEGTLKEWDAALLS
GVIDRVTRAEQWSSIYNASVLLQPAGQVPQMYRKMHLVPFAERVPFLDYI
PWLGYATMSLSGISGWDKGTEHVIMRLKTPQGTVRIANIICYESIFPEHV
ARFVGRRAELLTIVTNDGWYGTSYGPYQHLAIGRFRAIENRRAVARCANT
GVTAFIDRYGRIMAEVPWWQEATLTADVPLERDLTFYTQYPDLLPQGALA
ASCLFIAMALVRRKRFDEV
>Cag_0649 Uncharacterized protein involved in exopolysaccharide biosynthesis-like
MIEAQDQQQQFFQEQEIHLADYLNILLRRRKVFIATFLGIFIGVFLVTFL
MQPVYQASSTVYVKKDSGKMGINELVMPGGDNSIQAEMEVIKSRTIAEQV
VQKLNLDWSIKSSSSSSFCRILSMSVDPSLKELKLILKDNSHYEIQNTKG
KIIGKGRNRVPVALPFGVLTVAFSGEEGDVFELQRQPLYKAVAALKSAIK
VKEVGRLTNVVEISYEHTNAVLARDVVNSLVQAYLDQSLAFKTQEAGKSV
SFIEEQLQGIRTDLNKAETNLQEYKSSSGVVLLDAEAQELIKKFSSLEQA
RVGVTLQKKQLEFALAAQQESMRTGKPYSPAVMKDDPLVASMAQQLANLE
VQKRALMVDYTSNHPSVVNVQAQIDEIQNKIRSTYKTGLNNLARQEVDIA
QRLAMYEGELRGIPLEERDLARYTRIAKVTGDIYTFLLQKHEEARIAKAS
TISNINIIDTAIIPATPIRPEKLKYLLIGFALSFIAAIALALLVDYFDDT
VKDENQAKNLLGYPYLVTIPYIGKRENGVHKDVQKSDEKDELSFIAYSQQ
KSIAAEAFRSLRTAVHFSALKQKHKITVFTSTFSGEGKSTISTNLAATIA
QTGERTLLIDCDFHKSSLYKRIGMKQVPGVTEVLAGDKPLSEALQQTVVH
NLMFLATGTPPPNPSVILGSNEMKDLIQLLKNDFDQIVIDAPPTLPVSDS
VLLTSIADVVLVVMEAGAIPAKAVTRLGEILKSAKAPVAGFIFNNKSLKG
GNGYGYGYGYGYGYGYGYGYGYGYGHEDENKKKSSFLALLQPVISKMSNI
KFRGKM
>Cag_0239 peptidase, M16 family
MALTTSTTVHLATLPNGITVITDSVPYVESITLGIQINAGSRDDPAHAAG
LAHFMEHALFKGTRTRSYLDIARSVEQHGGYLDAYTTKEQTCVYLRCLAA
HLEPSFELLADLVSNPTFPPEEMEKEKEVVLEEISSINDTPEELIFEEFD
QRSFPNHPIGNPILGTEKSVEAFSQNDLHLFLQQHYIPQKMVVTATGNVS
HHAIMQLCERFLNHLANPAESTETRQPLSVATYKPFSLTLKKRIYQAQIV
MGTAIERNDRHFYSLMVLNTLLGSGMSSLLNLELREKRGLAYNVYSSLAF
FDDLTALNIYAGTDGNKVATTLTLIKELLQSDALHHPIHEELQAAKTKLL
GSHIMGMEKMTRRMSNTASDYVYFRRHISPDEKSAAIEAVTASDVTEAAE
LLLRQATYSTLVYKPSRQG
>Cag_0168 hydrolase, isochorismatase family
MLSPNETLLLIIDVQGRLAPAVFGSDKLERNIQKLIRACHVLEVPLLVTE
QYPKGLGNTIPSLRELMAEAVVVEKSSFSCCGSQEFMRQLRALNRNDILI
AGMETHVCVYQTAIELIDFGYNVHLLTDCVSSRSEENRELGIRCIEKAGA
SLTSTEMALFELLRVAEGERFKAISAIVKE
>Cag_0341 TPR repeat
MKVRNLVLFTLMAVSAEGYAAGTKAKSIVKTPSAATQAAETMQALPLDRQ
TALNLAQSYLANGSSRQAELILSKLVLLYPDDEEILRETISLYEKSNRAE
QTLPLYQHLLQLRPNDLELTLASARAYSWTGRKAESIALYEKVLKAGNAS
EKVVTEYADFLYADKQYQKAIDLYKSVGQKGKLSKQHMLNTVNGFIALKK
FDEAAKICNANVPLYPQDTDFLRLAADINFNAKQFEEAAGHYRQLLLKNP
DDPGAYSKLADIAMAKNDFTEVARLSHKILALIPDHKTAMLSLARVSSWQ
GDFTTSLTYYDKLIASPNPEPFYYREKARVLGWMGDFKQALSVYKSAVQK
WPDDKAISAEAEAKKNYYNHTYRPAVKAYNAWLLAEPQQPEALFDLAQLY
AQYGKWNNGLNTYNSLLSQIPAHRQAALAKQKIDFAASRMFVRSGVEYFS
AKTKFIDATNHRQADTKSTSIYSSLTYPINERVSAFVNLDSKSYDFRIAK
PNTPKNPVTYGLMAGAEYRNMPNIALSAGLGMRMNPGDVDNGLTGFINAN
SQPVDNLHVGVTLRNDDIVTNTSSFNNQLEATRLQGRVAYNGYRRWQAGM
DIAFDSYADNQSYDNSSLTVGADVVAHLLYEPQRLSVSYRLQEYGFDKNH
ANHPQYYNYFWTPKSYTTHTFGLEWQHYLNRERFHGSNNTYYDIAFRVGL
EQEGDISRQIHASINHDWNSRLATSLEGQYTWGTSAEIYQDSMVKAEFRW
FL
>Cag_1866 putative DNA-binding protein
MQMKNNQSSFILFTTEDAKIAVDVRFEEETVWLTQEQMAVLFGKARTTIT
EHIQNVFKEGELNEEVVCRNFRHTTQHGAIEGKTQETWVKHYNLDVIISV
GYRVKSLRGTQFRQWATKRLNEYIRKGFTMDDERLKNIGGGGYWKELLQR
IRDIRASEKVFYRQVLDIYATSIDYDPKDEVSLAFFKKVQNKIHYAVHGQ
TAAELIFNRADAEKDFMGLMTFSGSRPYLKDVVVVKNYLNEKELRALGQI
VSGYLDFAERQAEREQAMTMKDWAEHLDRILTMSGEKLLQEAGTISHEKA
VEKATTEYKKYQQKTLSEAEYNYFESLKILESKIH
>Cag_0407 protein tyrosine phosphatase
MPLSILFVCYENICRSPMAEGIFTNLLTEHDLHHTIHVSSAGTVSYQRGS
SPDQRAIALLSDYGIDISSLKAQSIDDLTLHTYDWIFAMDYETYEAVQQS
LATQQPPYLHLMMDFVEELAGEEVPDPYYDRMEAFDGVRSMLTTAADAIL
AMLKERYRELRG
>Cag_0518 Mannose-1-phosphate guanylyltransferase/mannose-6-phosphate isomerase
MIIPVILSGGSGTRLWPLSRALYPKQLLPLVNEATMLQDTVYRLAGLEPY
DALYCVCNEEHRFLVAEQLHEAGMSAGKIILEPEGRNTAPAATIAALLIA
ERHPDALMLLLPADHVLADPQAFAAAVAEGAQAAQANELVTFGITPTAPE
TGYGYIRTATSTDSHSARQVMEFVEKPSREVAEQYVASGNYFWNSGIFLF
KPSAWLAELEKYAPDILTACRTALAQAVEDFDFLRLNAPAFAASPSNSID
YAVMEKTNHASLVPLNAGWNDVGSWSALWEVQARDEAGNVSKGDVLLHDV
RNSYLHATSRLVAAVGLDELIVVETADAVLVASRDSVQDVKRLVDELKVL
ERDETQLHRCVYRPWGTYETVDFADRFKVKRITVKPGAALSLQKHHHRAE
HWIVVKGTAHVTLEDKVVVLQENQSTYIPVGAFHRLENHGNEPLELIEVQ
SGSYLGEDDIVRVDDRYGRR
>Cag_1430 UDP-N-acetylenolpyruvoylglucosamine reductase
MTPTKHQAATWYSNVSCQITTNVALAERTNYCIGGVARYVATPTSLAELS
ALLYAVQQERLPLALMGGSTNSLFSDEDFEGVVLSLEQMAQMVWLSDDEL
FCEAGVENSDIAQALFEVGKSGGEWLYRLPGQIGATVRMNARCFGGEISA
ITRAVLTMSLSGELQWQEPTEIFQGYKQTSLMGSSAIVVGVVLRFTESAP
PTAIRAEMEHYEGERLARHHFDYPSCGSTFKNNYSAGRSSGVIFDELGFR
GVREGGAMVSEHHANFIFNYENATAGDVLKLAAQMRHAALERADIALDLE
VECIGRFERGLLEACGVPSVTNAHNSSKAWAGMLWLPNEFATTTSVAPIA
EPSYPRQLMRGALMGYARFDREFPTGVMVEVEQLCARSEAATNPSAPFMR
WRTYAAPDSSLFTLKAPEPAAFTDGLWRYGVSELFISSVTDGGYLEFEMT
PNGHWVALRFSTPRQRAAGYETLSAALWQGNITVQQGAGWFGMELSWELL
APFINAEGVIALQCAGSSGRGEFGLFPSWATPSTPTDFHQPQQFYPIRL
>Cag_0633 conserved hypothetical protein
MSIFNDAEKRKRLMKGGLPLILAVAWMPIVGMVVLVIIGAPLATLVGWPF
TFVLVGAVTLSLLWLFLKLFRKSGHKIKQGNP
>Cag_1337 conserved hypothetical protein
MKKMLSLAALLAAISYATPASAELKIGGDASLRMRNEFNAVDPGNNATDD
VMWQSRVRLNASADLGDGYYFKTLIMSEGGAAGWLNTTGNENYALAASQV
YFGRNMENCNYKFGRIPLNSFSNPIYDLTLYPAQPTDTPVNNLNFDRLYG
ASYGTKMGGGMLNTTLVVLDNSSTTAGTAAYDGMLNDGYALSVAYTTTWG
NVTVEPQIFAVLTNANVTTLGNNVTPLTFGLNASGKVGDGKLSGAAFYTS
AGDEADYSGYLLRVKGETGPYMAWVDLTSTTNDNAAGATVKDYTNTFVWA
QYKYTAYKSAAGSLTLQPTLRYRASSTETAAGTEADTSVLRGEFSATVTF
>Cag_0586 conserved hypothetical protein
MNETISTILKRRSVRAYRPDAIAQAELELMLEAARYAPTAMNQQPWHFTA
IRNHELLQKLEANCKNAFLESNVEALREIAKQEDFNVFYQAPLMVIISGD
PAALAAQYDCTLAMENMMLAATSLGIGSCWAHAVIMFHSTEKGKAIFREL
GITFPANHQLYAAAVFGLPAEPYPEAPPKNADCITIMD
>Cag_1371 conserved hypothetical protein
MNSAPFLALFLSGLAGGFGHCISMCGPVVAAFSMGETRKGILHHLLYNLG
RITTYTILGAIVGLSGSFLVLATAIEKFQNVIVILAGLSIIIMGLATAEW
LPMPKQLNSCTSGVSLLQRLMSFFKPPRSLGSYYPMGIVLGFLPCGLTYT
ALLAAARAAMDTHHPFIGMMTGALMMLLFGIGTTPALLLVGKVINTISNK
TRKWFYRIASIIMIATGVWFVLSAL
>Cag_0061 Protein of unknown function DUF152
MQLAGYVTPQIFSAFPELVAIQSTRVGGVSSGAFASLNVGHNTTDNPTCV
LENRERLCAALGIEYQNLVTADQVHGTNIYVAYEAGHYSGYDAFITNQPN
RYLCIFTADCYPVLLYDPHHKAVAAIHAGWKGSAGKIVLKTLHAMQTHFG
TVPGNCLAYIGAGISGSAYEVGMDVAHHFANDVLCSGCAIEGTNEHKALL
DLRKENYRQLLEAGIPSMHIEQSPYCTFRESDLFFSYRRDNGVTGRMVAL
IGLRH
>Cag_0617 conserved hypothetical protein
MQQVIFQEKYPLFTLELQKNETTYTNVNDILAYFRQKIDEHPITVFIANF
DHYSHTMSLPEHAMNPAIKDAKNIIFCFGKDLNDPLMIGARPRSIGVIEF
ENSFLISFLEAPNPMATNTMEAWAKGLKKS
>Cag_0262 hypothetical protein
MMPTNEPLSSIQARVEELQSTIAEKEEHIKASARHFKDELKEEVSPLRLV
RRYPLQSVGVAALAGVALGRRLRRGNRASSVVAAAPVVQSNVLLSTTQTL
LSVVGMELMRSLKEVGLSYLKQYIEKKIR
>Cag_1278 hypothetical protein
MNQIDWDQLDRQMQQFSSLFITEVKIPKEKTNKIASIIADDINKIPAKGK
KEIVNSISNPIPIQDRLNELTAFQGWMDIAHDFKNPYISRAQVIVQNYIC
FVYLGEACFKTLKQHLKPESVAKKCCNFLTNNPVRAFRNAVAHSNWKYKD
DFSGIIFYARKGHQASDSIIEWQVEDKSLAFWQALSRCTAYTAFLCLK
>Cag_1212 Cell division transporter substrate-binding protein FtsY
MGFFDKFKLSRLKEGLEKTRDTLREKLSVITKGKTEIDDEFLEELETILV
GADVGVETTLAIVDAITERAKKETYHSETELNRMLIDEIQQMLQESSDEH
PVDFDAPLPAKPYVILVVGVNGAGKTTSIAKLAHNYDQAGKKVIIAAADT
FRAAAYEQLQIWADRAGVPMIGQGQGADPASVVFDAVSAAVSRNADVVLV
DTAGRLHNKSHLMEELAKIMRVAKKRIPEAPHEVLLVLDGTTGQNAVQQA
QEFTKFVQVTGLVVTKLDGTSKGGIVLSISRDLKLPVKYIGVGEKIDDLQ
LFDRRNFVGALLGKEEK
>Cag_0020 conserved hypothetical protein
MERPMSAPNSALMSSKISSPISPTTLHHLLDIPLTTPLTLEAMCRNLRPI
YYPAPHLSPRLERFSLALVEAMQGLGIQVHSPEELALHDGRFPAGTVIVA
PGIFDDDALPINRVSTLYNNIIVGIYDEAAPVSNSSLPQERLDAIVGRLA
RDMVHILIFVTDESWTICTMNGGIATFATPLPHVADVRSTLVPKLTAQVV
PPRNEAFTFVDGALDIASPTFSAIAEDFVQCSALWSQSSALLTHTSTEGL
HYRNSFYKRIVARYLDERSGMSYGFFARQLPIPTLQPAQKKKADGLMEVQ
LAGEQWFVAIPEVSIITTRSGCRKHCLNPLEDLVALGLKEEQGKRVASIT
TPSTSCNTVIKPSFDTLAILAHALGNAIVGSILLVLQPNAPFSRHLARNG
ATITHWHGYPQKSDLPDGYWLHGAENPPVACSTPQSAAYSLLGKLSALEQ
ALTQQGIYHGDVHTEPHHGTNIVGILSLTEVARHFAR
>Cag_1811 transcriptional regulator, XRE family
MKNKSKTYQSEAFAAIHETASGMYDVGVIDKQTMREFDEVCLTPIHHVTP
SDIKALREREEVSQSVFARYLNVSKDVVSQWERGAKKPAGTTLKLLALVE
RKGIRTIA
>Cag_1437 transketolase-like
MYPANKKGVLLELTTHDELREMARQVRRDVVRMLAKANSGHTGGSLGMAD
VFTALYFRVLRHKPHEFWQNADLDMLFLSNGHIAPVWYSVLARSGYFPLN
ELGTLRQVSSYLQGHPTSEARLPGIRVASGSLGQGLSVAVGAAMAMKMDG
KQSDVFCLMGDGECQEGQIWEAAMSAAHYKLGNLIGIVDDNNMQIDGEVS
EVLGVAPLPDKWRAFGWDVVECDGNNMDELVAVLEGLRAVPNRQRPTVLL
AKTVMGKGVPFFEGLMPDKSNWHGKPPSKEDEAKALEILGTTSFGDF
>Cag_0999 conserved hypothetical protein
MVWLLGSFNPRAREGRDSLICLSFRFRSRFNPRAREGRDTGTRAKMLSKI
CFNPRAREGRDLWATVKRTTTKFQSTRPRRARLAFYLYSAISKAFQSTRP
RRARRGIEEESSNGGMFVSIHAPAKGATINSHHLCEIVGVSIHAPAKGAT
VDAYPDFTELPFQSTRPRRARHKNQECEISVYQFQSTRPRRARPIWQKLG
LGCDMFQSTRPRRARPHIRTNVFSK
>Cag_0727 NUDIX/MutT family protein
MRRLSFFPLKKSSLMTKATVAAIIAPSETEPDTILLTRRNVTPFKDRWCL
PGGHIDAEETALTAVVREVAEETGLQFSNPTFLCYSNEIFPEHNFHAIAL
AFYGVGIGPAALMPDEVTEIAWFPLREALTLPLAFNHTQILQHYAEAIHS
>Cag_0128 Succinate dehydrogenase/fumarate reductase iron-sulfur protein
MMAPIDQHLEDAKKDVTFRVHRFNPQVDSKPFYDDYTIPVERGITVLRAL
NYIKEHLDESLSFRAFCQAGICGSCSMRINGISKLACTTQVWDVLATSKE
VGVIKIDPLRNLPLIKDLIVDMDPLVDKMKHYSNWVESTMPEASMGKKEF
LISEEEFLQYDKATDCILCASCVSECSILRANKAYVSPAVLLKSYRMNVD
SRDAIHDQRLAELVQDHGVWDCTHCYRCQETCVKSIPIMDAIHGIREDAI
ECRGMKDTSGAKHAEAFMTDIEKKGKLVEATLPIRTNGVSWTLQNLLPMA
AKMIAKRRTPPPPPLVKPSKGIKVFREIFREMSEHVKQDHQSHKK
>Cag_0373 6-pyruvoyltetrahydropterin synthase
MNDLLEKPRKIYVSRSIEFNAAHRLCNPTLTEEENCAIYGKCNNPHGHGH
NYLLTITLSGTINRQTGFLFDLKALKAILEEEIVERFDHKHLNYDVPELS
NCIPTTEILAVLIWDILAQRFTTINPAITLHEVHLYETGKNAVRYYGE
>Cag_1133 conserved hypothetical protein
MRKKILFICGSMNQTTQMHQISEWLGDYDHFFCPFFSDGLLGVATKLGLL
EFTIMGKKRSSKALEYLHSHHLQVDESGAAHPYDLVVTCTDLIVPKIFQQ
RKMVLVQEGMTEPETLFYHLARNVKWIPRWIAGTSTTGLSDAYQKFCVAS
EGYRQLFIRKGVNPNKIEVTSIPNFDNCAHYLQNDFPYKDYVLVCTSDNR
ETFIYENRRKNIEKYVAMAAGRQLIFKLHPNENVERATREIKQYAPGSLV
FSEGKTEEMIANCAMMIAQFSSTIFVGSALGKEVHCGLPTDELKALTPLQ
NNSAAKNIADVCREVVEQ
>Cag_0778 DNA repair protein RadA
MAKATTRYVCSNCGAVSLKYQGKCFECQSWGTLQEMRIEPEPKHSQRGIP
EKSATVQRLHDVVPSEFQRITTGMGELDRVLGGGLMQASAVLVGGEPGIG
KSTLMLHLAPRLAPAKVLYISGEESPNQIRERAERLAIRADNLWLVPEVN
LERILSLLEHEKPALVIIDSIQTIHSSDYQSAAGTITQIRECAAALIRTA
KAHNIILMIIGHITKEGALAGPKALEHMVDTVLQFEGENYQRYRIIRSVK
NRFGSTNEIGVFRMDETGLEEVSNPSEFFIAGRNSDVPGNAILAALEGSR
ALLVEVQALVAKTNYSMPQRISTGFDLKRISIILAVLEKRLLIETWGQDV
FVKIAGGLKLVEPAADLAIAAAIASGLMNRTIDAGTVCCGEIGLAGELRA
ISDSERRIREAAHLGFQRIVLPAANTRELKASLKKLPISIIGCATLHQAL
DEMV
>Cag_0346 Translation elongation factor G
MQPVPTELLRNIVLTGHAGSGKTILAESFALSMGITTRLGSIEEGSTLSD
YSADEIEKKHSLNTTIIHGMWNKYKLNILDTPGLLDFHGDVKAAMRVADT
VVLTVNASSGVEVGTDTIWNYTQEYYKPTIFVLTKLDADRAHFKETIDEL
KEHFGHLVTPLQFPADEGLGHHTLIDVLLMKQLEFNPDKPGEMNISDIPD
LYVKQAEELHQQLVEAVAETDEELMNKFFEVGNLTEEELRDGIKSALVTR
TFFPVFCTSPLHLIGTERLLQAITNLCPSPIERGPEEAICTLDTSRQLIN
PDPEGETVAFIFKTMAEPRVGEVSYLRVYSGHIESGHELIDAQTGQLEKL
GQVHAMVGQKKIPVEKLLAGDIGMVVKLKDSHTNDTLTDKGCSCRIKPIN
FPTPVLATAILPLAQGDEEKISAGLYHLHEEDPSFAIEHNVEFNQTIIKT
LGETHLESIINRLQSKFNVAVSLTPIKIPYRETIRLTASAQGKYKKQSGG
KGQYGDVWVRIEPQERDSGFTFASEVVGGVVPTRYLPAVEKGLRETIEGG
LLAGCPIVDVKAVAYDGSYHPVDSSEYAFKIAASMALKSAFEKAKPFLLE
PIYTLEVYAPEQFTGEIVGEISSKRGKILGMEAEVKMQIIHALIPQAALT
PFHHALTRLTQSRARYSYHFSHYEEMPPDLAQQIIAERNQP
>Cag_1152 OmpA family protein
MALSLYKIAKPVALLLIPATLSVTWGCQTTTPSSNAAQKAKIGAAAGGAI
GALIGSRTGSWAKGALIGAVVGGASGAVLGNYMDKQAAAIDQNVEGAQVQ
RVGESIRVVFDSGLLFTTGSSTISAASRSNIEQLARILNTYGDTNVVIEG
HTDSIGNEATNQVLSEKRAESVATLLKVYGVAPNRMSAVGYGETRPVATN
ETEAGRNLNRRVEVLIYATDALRQQAASGQLRL
>Cag_1228 transcriptional regulator, Fis family
MLITQQNDYGSISLLAEVSRTITHEEDINKVLRLVLFIMSENMHMQRGMI
TILNRSTGEIVINESFGLTEEERERGRYHIGEGVIGHVVKTGKAVIVPSI
QDEPLFLDRTGSRAQAKKEELCFICIPIKAGSEIIGTLSADRHVEPVTAD
EHRRKTRQTDDRDERIDKLQFYVDQLSIIAAMISQAVRLKQLAYEAGSKN
VQDLSNLPPFAASIPPRSEDRQVNAIPPPSPDRPANIIGNTKPMVSLYSM
IDKIAKTSATTLVLGESGVGKELVASAIHFKSRRAEKPFIKFNCAALPEN
IVESELFGHEKGAFTGALATRHGRFEMANNGTIFLDEVGELSLSVQAKLL
RILQEKEFERVGGSKTIKVDVRVIAATNRNLEELIRQGQFREDLFYRLNI
FPITVPPLRERKTDILLLADYFVEKYNKANQKGVRRISTTAIDMLMRYHW
PGNVRELQNCIERAVILSEDNVIHGYHLPPTLQTAESSGTPYTGSLQQKL
DAIEKEMIIEALKRTQGNMSRAAMQLGLSERIMGLRIKKFNIDYRKFRV
>Cag_1945 polysulfide reductase, subunit B, putative
MNEQRREFIRLAGMGFLASVGATCGLMSSAPLEATDLTVFADSGNGAVRW
GMLVDTRKCREGCNKCIESCHTVHNVPDFASAKERVQWIEKRPVASLFSE
RKALPAQSATLPALCNHCAKAPCVTACPTSAIFRRYDGIVGLDFHRCIGC
RACMTACPYSAISFNWKAPRPAIKSLSDGYPTREQGMVEKCNFCSERLAK
GLMPACAESCPEHALVFGNLNDPASEIRTLLASSRTLQRKAEQGTKPSVF
YII
>Cag_0234 Nicotinate-nucleotide pyrophosphorylase
MNHKAMNDFYEGCRARAIMLALEEDRYTGDITTLATVPPEQAGRAVIKAK
EQGIVAGVDVALQVFKACDPALQVQCHAEDGAVVQRGDVVLEVQGLLAPL
LVAERTALNFMQRMSGIATRARAYVDAIAHTNARILDTRKTVPGMRSFDK
EAVRIGGGYNHRTGLFDMMLIKDNHIDAAGGVVEALERAKRYRQEQQLEV
LIETEVRSLAELQEVLPVAPDRILLDNFTCAQLCEAVQLVRESGSKVELE
ASGNIGMHNVVQVAESGVDFISLGELTHSVRALDLSMTIQLA
>Cag_1037 conserved hypothetical protein
MVKNHKEVLAATHQKLIEFNELNKLHGPELAWEKMLEGLPEKQKKRMATF
LAEPTLFKAFTRAIPFFESAGMEMEIVDLSNKGSDAVLEIQKYCPYLQIC
KEYNIETPCHIICDIEIESTRQAFPEMKGEILARQAFGSCVCLFKYERPA
K
>Cag_0054 cell division protein, FtsW/RodA/SpoVE family
MIKEPLTDSLPESPFSFVAEGAIPPKKEVLAGKMLMLIVALLVCIGIVIV
YSSGAGWAESRYDNAEYFLWRQLSFAVLGMGVVFGVSFIDYHRLEKYSKI
FLFVSIGLLVLLLLLKFAGLISGAARWIGYGPLKFQVSDVAKYALIIHFA
HLISEKQPNIKDLHITYYPLLILLMTVVSLVALEPNFSTASLIAMIGFLM
MFIGGVDIRHLGATVAMVIPIGIAYAISAPYRVARLVSFASGKEEGLSYQ
VVQALIGLGNGGLFGLGIGASKQRELYLPLSYNDFVFVVIGEEYGFIGAL
FVISLFIGFFACGVIIAKHAPDDFGKYLASGITVAISLFAFINIAVASHV
LPTTGVALPFISYGGTALLFNSLGVGILLSIASHRKRSTKAIENVAPLQE
RSVV
>Cag_0428 transcriptional regulator/antitoxin, MazE
MLTQVKKWGNSFALKIPKAIVSDAKIERDSFIDINIVKGQITIENVTTQR
SILDELLAGITKDNLHNEIDSGSPIGNEIW
>Cag_1266 ThiS, thiamine-biosynthesis
MITIHLNGELKQLPDSTSINDLLASVEANGNSLAVVVNEHIIRPEERSTH
CLHDGDVVELLMFAGGG
>Cag_1268 Elongator protein 3/MiaB/NifB
MAEIPAWLHTTNDANALASLLAPNATRSLESLAAEASAITRRRFGRTITL
YAPLYLSNHCSNGCAYCGFASDRTTPRRRLEMEEIRREIAAMKALGISDI
LLLTGERTPAADFDYLRQSVALAAEEMQRVAVEAFPMSVAEYRALAESGC
TSVTIYQETYNRKQYEALHRWGAKKDFLYRLETPARALEAGIKHVGLGVL
LGLSDPIEDALCLYRHVRHLERRYWRAGFSISFPRLRPESGGYQPPFPVD
DRQLARLIMAFRIALPNIELVLSTRESARFRDGMATLGITRMSVESRTTV
GGYAENETIKSSAGQFEICDDRNVEEFCAALRTQQIEPIFKNWERAYNAP
SMSCFL
>Cag_1383 hypothetical protein
MRLSPIAIQQIKEQVNRFFGQQAVIWLFGSRLDDNKRGGDIDLYVQTEQF
NLLDELRCKVALQEQLDIPIDLIVRSKNDTSVITTHAIKNGVPL
>Cag_0358 RNA polymerase I subunit A-like
MILSQGTSPLKGDFSRIKFSIASPESILAHSRGEVLKPETINYRTFKPER
DGLMCEKIFGPTKDWECYCGKYKRVRYKGIICDRCGVEVTTKSVRRERMG
HISLAVPVVHTWFFRSVPSKIGALLDLSTKELERIIYYEVYVVINPGEPG
EKQGIKKLDRLTEEQYFQIITEYEDNQDLDDHDPNKFVAKMGGEAIHMLL
KNIDLDETAVHLRTVLRESNSEQKRTDALKRLKVVESFRKSYEPHKKSRK
KPNALFAEDDMPEPYVYEGNKPEYMVMEVIPVIPPELRPLVPLEGGRFAT
SDLNDLYRRVIIRNNRLKKLIDIRAPEVILRNEKRMLQEAVDALFDNSRK
ANAVKTGESNRPLKSLSDALKGKQGRFRQNLLGKRVDYSGRSVIVVGPEL
KLHECGLPKSMAIELFQPFVIRRLVDRGIAKSVKSAKKLIDRKDPVVWDV
LEKVIDGRPVMLNRAPTLHRLGIQAFQPHLVEGKAIQIHPLVCTAFNADF
DGDQMAVHIPLSQEAQLEASLLMLSAHNLILPQSGKPVTVPSQDMVLGMY
YLTKSRGGEMGEGRIFYSAEEVRIAYNEQLVGLHAQIFMRYDGQVDQKFD
AIRVMETFVEPTSEKAAWLRKQLDEKRMVLTTVGRVIFNQHVPESIGFIN
RVIDKKVAKELIGRLSSEVGNVETAKFLDNIKEVGFHYAMKGGLSVGLSD
AIVPETKAKHIKNAQRDSTKVVKEYNRGTLTDNERYNQIVDVWQKTSNIV
AEESYQKLKRDREGFNPLYMMLDSGARGSREQVRQLTGMRGLIARPQKSM
SGQPGEIIENPIISNLKEGLTVLEYFISTHGARKGLSDTSLKTADAGYLT
RRLHDVAQDVIVTIDDCGTTRGLYVHRNIEEETSGQIKFREKIKGRVAAR
DIVDTVTGNVLVQAGEIITEELADSIQETAGVEEAEIRSVLTCESKVGIC
SKCYGTNLSVHQLVEIGEAVGVIAAQSIGEPGTQLTLRTFHQGGAAQGGI
SETETKAFCDGQVQFEDVKTVAHTAMNEDGMEDVRTIVIQKNGKINIIDS
ESGKLLKRYMLPHGAHLACEDGVLVKKDQVLFRSEPNSTQIIAELPGTIK
FADIEKGVTYKEEVDPQTGFSQHTIINWRTKLRATETREPRLLILDENGE
VRKTYPVPIKSNLYVEDGQKVLPGDILAKVPRNLDRVGGDITAGLPKVTE
LFEARIPTDPAIVSEIDGYVSFGAQRRSSKEIKVKNDFGEEKVYYVQVGK
HVLANEGDEVKAGDPLTDGAVSPQDILRIQGPNAVQQYLVNEIQKVYQIN
AGVEINDKHLEVIVRQMLQKVRIEEPGDTDLLPGDLIDRSAFIEANRAIS
EKVRISERGDAPARIQENQLHKLRDITKLNRELRKNSKKMVAYEPALQAT
SHPVLLGITSAALQTESVISAASFQETTKVLTDAAVAGKVDYLVGLKENV
IVGKLIPAGTGLKKYKTIKLVGEESAPAPVVVEAVVSHEAEASPMVEEEV
AE
>Cag_1169 TPR repeat
MKPFFTFLRYAFIGILTLSLVTPEVVDAAKKSKKKSSSRKKSSKRNARAK
KGSNKKTSARQARLRVVDGVETERNSINLTASPSSASRQLNKRAMGFYEQ
GRYAEAEPLYRELLTLDEKQLGSRHPEVAVTLNNLASLLQQQGRYNEAEP
LYRRALSIREENFGADDASVAQSLNNLGSLLQDQGRYYEARQLYSRSLAI
DEKVLGTDHPDVAADLNNLASLLQAQGRYAEAEPLYRRSLAIREQRFGAE
HTLVAMSLNNLGVLLQAQGRYSEAEPLYRRSLAIREAQYPANNHSIVATS
LNNLASLLQARGKLTEAEPIYQRALSINEQTLGENHPSVATSLNNLAGLL
RAQGRYADAEPLYRRSLTIREEQLGENHPDVAMSLNNLGVLLQAQGRASE
AEPLYRRALLIDEKVLGATHPQTIRLRNNLNALLNPSAIPLTTQ
>Cag_1562 conserved hypothetical protein
MEQKEVVAHATFVQLVESIRNVHQELIAQANRAVNVSLTLRNWLIGYYIA
EYELQGKDRAEYGDRLFSELARALKSLSNCNRRQLYRYYRFYTFYPIIVE
LLPPQFKSLSLLSSIEIVGTVSPLSRPSSTASLNIAKKLSYSHFEELIAL
DDPTKRAFYEVECIRGNWSVRELKRQIGSLYYERTGLSFNKTKLAELTLQ
EREMQPLFNIRDPYIFEFLGLKPVEVMSESHVEQQLIEKLQDFLLELGHG
FCFEARQKRLLIGDEYFFIDLVFYHRLLKCHVLVELKLDHFKHEHLGQLN
TYVSWYRQHVMSKGDNPPIGMLLCTSKNNSLVEYALAGMDNQLFVSQYQL
ELPKKEEMQEFIATQLRELGE
>Cag_0488 Formylmethionine deformylase
MLFARRSAQHHMILPITIYSHEVLRQKAKPLKGVDKEIETLIAAMFESMH
NASGIGLAAPQVGRSLRLLVLDVSCLKNYKDEKPMVVINPHILSVRGACA
MEEGCLSVPGVQGDVVRPSAITMKYRNERFEELTAEFSGMVARVLQHEID
HLDGKLFVDRMDKRDKRKIQHELNELTAGHISADYPVELHPATTASHAE
>Cag_1603 NUDIX/MutT family protein
MKPYSYCPHCRTELASAMIDGRDRLHCPACAWVNYLNPLPVAVAYAVNER
NELLVIRRAYEPARNEWALPGGFLEIGEDPHHGCLRELHEETALSGTIQH
LIGVYQREVEMYGSLLVIAYKVLVSDDSALSINHEVTEAGFYPHEALPTI
RIPLHQHIIRDAVI
>Cag_0872 hypothetical protein
MQKASTLYEIDAAVDRYTPVSADSPFYVDFKNLRGFFQEQRIMRLLNVRK
NTNGNYEFQYDPHRAEKTFLFIAGMRGSGKTSELSKYTKLLNTPDCFFVV
VCNIDSTLDMDRVQYMDILIFQLEKLFQRAEAVNLKLDNSIVESMNKWFQ
DRVKEINSKYGGYGEAEIEIESDAALSPTSLIKRFLGLTARLKMGLSGSR
EYAETIRTTFQNNFIDFATKFNTFIEQTNEQLRKEKKAKEILFIIDGLEK
TMSADTRKRIILEESNRIKQIKVTTLFTLPIELMKEEAHIRQFSEVITFP
FVKVTERDGSPVAEAIVLFKEFIYKRVDASLFDSEETVKKIIRYSGGSPR
QLLRIIQRANLYASEEYGKITDENVEQAINELGNQYARFIEPDDFEQLKE
LKVQLDACAPIGFNSNIQSLLEHEILFEYNDGTYKSVNPLLERSTLYKFY
VLGKAS
>Cag_1235 hypothetical protein
MNIILADALIDDVALLLSSLPAATTCYLLHAQDDAVKVIQTAFQQPNTHL
HFLGHGEEGAITLGGKTFTADDFIALAPTHSSSGAIHFWSCKTGAGTKGV
AFVNSIAQAFNTVVSACSTLVGAAHKGGSWRLDVHSNERVAVACPFGKAA
AYQHTLDASSLLRVNAVALDNGLKVEIWIAPNTAFSTASLRLSFDPSIIA
PVWVNGKVVSTSGLTGWTWLSSPIGDTVLKMNGYTLTEVNRTTEVLLQSI
SFTFAADVQNCSISLGGTYLENEVTGKIALGTLPTLTYIAPALPVWDTFA
PPEALSYTAGTTAALDFAVQATDANGDTITYKAVVGQMVESLFTPTTSLS
TITLTSSNGHLTGSVVLPRTFSAGVYLFRLYADDKTTDANLGSVLDVPFS
LLAAPNALPTGTVTISGTPTQNQTLTAANTLADLDGIGTIAYQWNADGTA
ITGAIGNSLMLTEAHVGKKITVTATYTDNRGTLESVVSTATSAVVNINDA
PTGSVSISGTPTQGQTLTAANTLADLDGLGTIAYQWNADGTVITGAIGNS
FTLTETHVGKKITVTATYTDGHGTTESVVSAATVAVANVNDVPTGTVTIS
GSATQNQMLIAANTLADLDGLGTIAYQWNADGTAITGAIGNSFTLTETHV
GKKITVTATYTDNRGTLESVVSTATSDVVNINDAPTGSIYITGAATKGQI
LTVNTGTLSDADGLNGEFTYQWQANEIDITGATSSSYTLTNDDVGKNIRV
VASYTDNHGTKESVVSTATVAVANVNDVPTGAVTISGTPTQNQTLTAANT
LADLDGLGTIAYQWNADGTTITGAISNSLVLGETHVGKKITVTATYTDGH
STTESVVSAQTTSVANVNDAPTGTVTISGTPTQNQTLTAANTLADLDGIG
TIAYQWNADGTAITGVIGENLTLTEALVGKKITVTATYTDGHGTTESVVS
TATVAVANVNDAPTGTVTINGTPTQEQTLTAANTLADLDGLGTIAYQWNA
DGTAITGAIGNSFTLTEAHVSKKITVTAIYTDGHGTTESVVSAATTAVAN
VNDTPTGTVTITDSATQNQTLTAANTLADLDGLGTIDYQWNADGTAITGA
IGNSLMLTETHVGKKITVTATYTDGHGTTESVVSAQTTSVANVNDAPTGT
VTITGSATQGQTLTASNTLEDLDGMGSVAYQWQADGMAINGATGNSFTLT
EAHVSKKITVTAIYTDGHGTTENVVSAATSAVVNINDTPTGTVTITGSAT
QNQTLTAANTLADLDGIGTIAYQWNANGVAITGAVGDNLTLTEAQVGKKI
TVVASYIDGHSTTENVVSAQTTSVANVNDTPTGTVTITGSATQGQTLTAS
NTLEDLDGMGSVAYQWNADGTAITGAIGNSLMLTETHVGKQITVTATYTD
GHGTTENVVSAQTASVANVNDAPTGTVTISGTPTQGQTLTAANTLADADG
LGTIAYQWNADGTAITGAIGNSLVLGETHVGKKITVTATYTDGHGTTESV
VSAQTTSVANVNDTPTGTVTITGSATQGQTLTASNTLEDLDGMGSVAYQW
NADGTAITGAIGNSLVLSETHVGKKITVVASYIDGHSTTESVVSAATDII
SLDDTTPPYLISTTPINNSIGISTNSNITLTFSEAINKGNGTIALYLNSP
TKTLVENYNVADNVNLSIEGNKLTINPTADLEQGKNYLLAIENGAITDSA
NNLFEITDVYNFTTETLPIERYSLSGNISFWKDKATPLQDVSIVATSAMQ
TPDLLEFRNIQLHSDGSRTVELWTTSPSNTLHAIQVELELQEGSTATWQN
STSIPSNWTTVTHVNNNGHFVLGSMGIEPIAMEAPTVKLGELTLSAPIDQ
ERFEIAIVNGLTNEQAITPFALTSSEIISNKQGHYSFTHLMESLYYLEAN
KAADALADAVTIEDARAALMIAVGLNPNSDGTPLLPYQYLAADVNRDGKI
RASDALTILKMAIGVDSVPEHAWIFASESIESAEMGRSAVDWSHTVPTVL
LNQESNTIDFIGIVKGDVDGSALT
>Cag_0115 CrtF-related protein
MMNTNELLNYNHRANELVFKGLVEFGCIKASLELDLFTHLAGEAKDTETI
AANVGAIPQRLVILLETLAQIGVLAKNDGKWSLTPFAATMFLPNNELPNL
YMMPVTKAMAHLSENFYLKIADAVRGNHIFKAEVPYPPMTREDNWYFEEI
HRSNAHFSILLLLEEANLSNVKTLVDVGGGIGDISAALLQKYPQMDSTIL
NLPGAVELVNENATEKGLADRLRGSVVDIYKEEYPKADAVMFCRILYSAN
EQITSMMCKKALDALQEGGKVLILDMIIDEPEDPNFDYLSHYILGAGMPF
SVLGFKQQERYKELLEAIGFRDVRIVRKYGHLLCEAVK
>Cag_0679 Protein of unknown function DUF132
MSYRIVFDTNCIISALLFSRQKMARLRYSWQSDAVIPLVCKETVSELLRV
LTYPKFKLTRDERLLLLADFLPYAETITVLDVPSNLPVIRDSADQIFLTL
AVVGNADALVTGDNDLLTIKDSFKMLPIMSLNEFNQWLK
>Cag_0439 periplasmic sensor signal transduction histidine kinase
MNAPHAKRFRAFFAPLTAMSLRNRIALYYTSATALLIALLLTTVYFAVNR
VVYEHFDEELRHEVAETFGHLHLLPYNFESNNNAQFRNIDEEYRWRSKEN
NAKGYHPKPYKKRKRQVDAELMQLVTVDGEILAQSATLNNHRLTIEPKRQ
GMAFFTSTVGSSAVRQVQVPLYNRQQQLTGYLLIAMPLSNAMLVLRDLQE
VLLLSFPMMLLILFALTRLIAGRSIRPIEKVITTAEQMSQSTLDQRIPLP
RHRDELYRLSSTINALFDRMQDAFLREKQFTADASHELKTPLSVVKGTLE
VLVRKPREREHYETRIHFCLQELNRMALMIDQLLMLARYESSTMQPHIET
VPLHRHIEALIERNQAAATTRGITLCTTGTINATVAADAGMVDMMLENIV
SNALKYSPSGTTITLSVQKNENGTSCTIQDQGIGIPPEKVRAVFERFYRV
DEARSSATGGLGLGLSIVKKLADLQQVTIAVESEVGKGTTICLTFRDGGR
>Cag_1545 NUDIX/MutT family protein
MASRFRGVFKQSGVIPLFDDKVVLITARKSDRWIIPKGYIELGMSAADSA
AKEALEEAGLVGKVGEHPIGKYRYNKSGRHFVVLLYPFFVETMLDVWDEV
HERERCVVSPDVAATMVAHSDVGRLIRSYCASLDDDEAVLVPPHVASAIT
G
>Cag_0285 Oxygen-independent coproporphyrinogen III oxidase HemN
MATNLAEKYSNPGPRYTSYPTIPSWSTDGVTQQQWREAMVKGFNDSNETT
GVSMYIHIPYCENYCYFCGCNAHRTQDHSFEKPYLDALLKEWQMYLDVFP
GTLNVKELHIGGGTPTFLSPENLTRLVDGLFKHVNRMDNHMFSFETNPKS
TTKEHLEALYNVGFRRMSFGIQDFDPIVQKEINRLQSFELVKEKIDLARQ
IGFTSINFDLVYGLPKQTMATITDTIQKVMELRPDRLAFYAYGHNPHMYE
GQKKFNEADLPVGDAKQELYEKGSAMLESIGYHEIGMDHFAIEGDPLYIA
AQNGTLHRNFMGYTENTTQMMLALGSSSISDTWYAFAQNERTDGEYIKAV
NEDRFPLHRGHLLTDEDLVLRRHILNLMCKQTTSWEDPKLYTDELDVALV
RLEDMQNDGMVILGEKSVTVTEVGIPFLRNICMAFDARLWSSDSLSKAYN
VSRDIQKTYIEKARLAKLQREQAGA
>Cag_1931 Hit family protein
MSHNQEQDCLFCRIVRGEIPATIVYRNEHVVAFKDISPTAPHHVLIIPVQ
HVASLNALSPEHEAVAGQLLLAAAPVAEALGIKESGYRFVINTGADAMQT
VFHIHAHLIGGQAMGWPPFPV
>Cag_0075 DNA topoisomerase I
MASSVAALSAKNKTLIVVESPSKAKTINKYLGSNYTVFASVGHIKDLPKK
EIGLDFEHNYSPRYEIIPGKEKVVKQLKKLATEASNILIATDPDREGEAI
AWHIANEIEHAKAPVARVLFNEVTKKAILEAIEKPRHIDLRLVHSQQTRQ
GLDKIVGYKISPFLWKVVLRGLSAGRVQSVALRLICEREEEIERFVIQEY
WTIAADFLTANKESFRARLVRLDGDKPEITNVEQAEAIAAIAKKGNYSVR
EITPRIQQRKQPLPFTTSLLQQAASNQLGFGAQRTMRTAQQLYEGIELGA
EGAMGLITYMRTDSTRISPEAVGEARNYIERNFGKDYVGAGSSGKPGKNA
QDAHEAIRPTSLLKTPEQVKPYLSADQFKLYELIWKRFLAAMMAPAKIEQ
TKVDVEEQSGKFLFRANGSRVLFPGFMRVYDDQQELAYEAQTSTKEEVEN
EMVVKLPEKLAVNDPLGLGALEQKQSFTRPPARYSEASLVKDLDHFGIGR
PSTYASIFSTLQDRRYVALEKRKIMPTDLGRDVAKILVANFPELFNVGFT
AFMEDELDKVASGDDAYEKVLDSFYKPLTSALALRSATPLIPQNNEAETC
DKCGTGKMILKWTASGKFLGCSNYPKCKNIRTISSNREKPASTGVHCPSC
EDGEMVLRKGRLGPFLACSNYPKCNTLLNLNKQRHIEPPKTPPVVTDMAC
PKCGAPLYLRSGKRGLWLGCSKFPKCRGRLAWTALEPAAQERWERVMAAH
QKAHPPVTLKMVDGSTVSMTSSIDDIIMKADAAGLIAPAMDLVPEAEG
>Cag_1638 conserved hypothetical protein
MRIFAWLLVIVAVVLQGCYSFSTENRLTHLHTIAVPIFNDHSGAGIAQSR
SELTKALIDRLERESALRLIPSLSLADALLEGTLVAYSDVPAQLSATTGR
AATNRITLIVQVELEERSTHELLFSERFVGSAEYAIGNMVAQQEARRRAQ
HQIAESIADRIISGW
>Cag_0157 putative integral membrane protein
MTDNILFAFALTLFAGLSTGVGSLIGLLSKEFNPKVLTISLGFSAGVMIY
VAMIEIMVKARESLVVGIGAEMGKVVTVLSFFAGIFLIALIDKLIPSYEN
PHELNVAQKLEECSENQKKKLMRMGLFSAVAIGIHNFPEGLATFMSGLSN
PTLGVSIAVAIAIHNIPEGLAVSAPIFYATQSRKKAFILSFLSGLAEPVG
ALIGYFLLRSFFSPSLFGVVFGAVAGIMVYISLDELLPAAEEYGEHHFAI
GGVIAGMVVMAISLLLFT
>Cag_0820 conserved hypothetical protein
MATRKLITFDWAMKRLLRSKANFDILEGFLSELLGEDITILDILESESNQ
ENKEYKYNRLDLKVKNSKGELIIIEVQYEREYDYFQRMLYGASRVITEHQ
KLSEPYSTIPTVISINILYFTLGEGSDYIYYGTTKFVGMHNRDVLHLSKE
QREKYGKREVSDIYPKYYLLQINNFNNVAKTSLDEWIYFLKNAEIPENFN
AKGIKKAKESFDFISMTPEEQEAFLSYQDALRDQASYFETTYEIPFEQGL
KKGIRKGRKEGKEQGLREGKKLGLQEGVLKGKELGLQEGKELGLQEGVLK
GKLEIARKLMAKGMSAEEAAGIAGVNIGLLERND
>Cag_1191 hypothetical protein
MMDCRGNPCGCPVVVFSIMNYACLLIKIELLNQFYFYSPNHSQRLQIGAI
NMKKRWILLLLALGTNVPESKADYTIHGYTSPDGQSTFHVRENPLQNPLQ
KLADDLERNNNKSERPSDAFYRGYEQAQKMKMMIEQRRLMEEQRKLLEEQ
RKLLEQTRLEQQTRLEQQTRLEQQTKE
>Cag_1502 hypothetical protein
MDNDDEILDSAPISKSDEELEKGFVAAIVEQPKLMDELAKQLVALSLAIP
GIYATALKLLAGDDAVASSLPCIIGAFVFWALSLVFAFVSLTPREWHVER
TLLRRNGASKNGAPLSIEEFFKVSARYKRALLIAATACCFVGICLACVAV
FTVTPPNLSVPQTVQP
>Cag_0374 conserved hypothetical protein
MSNPNPSPLSHAMPEAIRQLPAEAQQEVLAWVEKLLATQNEGVDELYNAI
SSIVKFIPNFMVIPLMVEQIHPRIAAGVCVKMGVEKATGYANDLPVEYLS
SVTHHLPNPMVGEILTAMKRYAAEKLITYEIEHHRTDLQALMPSVSESHQ
ALITKHLST
>Cag_1074 xanthine/uracil permease family protein
MQKYFEFERLGTNYRQEIIAGITTFFTLAYIIIVNPAILEAAGIPKEASL
TATILTSIFGTLLMGLYAKRPFAVAPYMGENAFVAYTVVHTLGYSWQTAM
AAIFISGVLFTLITIGGLRQWLAEAIPATLKHSFSVGIGLFLAFIGLNDM
GVVALGVAGTPVKLADVTQLPVMLSLAGMLVTALLLIRRVTGALLIGMAF
ITAAFLLLGLTPLPTALFSFPPSIAPIFMQIDWHGALTWGFVGVIISVLV
MDFVDTMGTLFGLSSRADLLDENDNLPDIQKPMLVDALSTIAASIFGTTT
AGVFIESAAGIEQGGKSGFTAVVVALLFALALFFAPILTIVPPYAYGPVL
LLVGMFMMQSVTRFNFNDYSELFPAFLTIALMVFTFNIGVGITAGFIAYL
LLKLLSGQFRDIKSGMWILALLSLSFYLFYPYH
>Cag_0318 Glutamyl-tRNA(Gln) amidotransferase C subunit
MSVTIKDVTYIAELAKLRFSEPEMETMTNELNNILHYIAKLDEINTDGIQ
PLSGIYDPVNVLREDVELQPLTSSEVLHNAPDRQDRFFKVPKVIG
>Cag_1168 Quinolinate synthetase A
MKKENHTSSSTLSTPELIAHIQQLKQEMNAIILAHYYTVPEIQQVADVVG
DSLALARAAEQNSADVIVFAGVYFMGETAKMLNPNKLVLMPDLHAGCPLA
DSCQPEDFKKLRAEHPNAVAITYINSTAEIKALTDITCTSSNAEKIVRQI
PANQQIIFGPDRNLGAYISRKLQRSMILWQGYCYVHEAYSAAILQKFMAA
HAGAELIAHPECREEVLLLSAFVGSTQELLNYTTSSPATTFVVATEPGIL
YEMQRRSPHKTFLAAPRSADAPQSVCTQMKQNTLAKLYDCMVERQPEIIV
DETLRLAALKSIRRMLEMS
>Cag_0336 Secretion protein HlyD
MKQKSKLIVSAIALFVVALVAWFFLFKGNKGEEERYRDVQVVRGAISDVV
ATTGVVEPKNRLKIQSSIAGRIEEIVVSEGEMVRKGQVLALLSSTERAAL
LDAARLQGKSEEAYWKKVYKETAVLAPLDGQVIVRSIEPGQMVNGSDSLF
VLSDRLMVKAFVDETDIGRVKVGQQATIQLDAYPDISVRGKVEHIDFESR
LQNNVTMYYVDIIPEQIPSVFRSGMSATITIIVKQKPNALLVPLEAVQQR
NGQSVVLQRNNASSSSAKVRYCAVQTGLRNEQMVEIVAGLGEQDAVLLPD
TAFALPSKKGGTNPFRPQRSPNRP
>Cag_0241 RecA DNA recombination protein
MTMDNPKVEQAGHAVDSAKLKQLNLAVDALEKQFGKGTIMRMGDGSAGLT
VQAISTGSMALDFALGVGGLPRGRVTEIYGPESSGKTTLALHVIAEAQKE
GGITAIVDAEHAFDPSYARKLGVDINALLISQPESGEQALSIVETLVRSG
AVDVVVVDSVAALVPQAELEGEMGDSSMGLQARLMSQALRKLTGAISKSS
TVCIFINQLRDKIGVMYGSPETTTGGKALKFYSSVRLDIRKIAQLKDGDE
LTGSRTRVKVVKNKVAPPFKMAEFDILYGEGISALGELIDLGVEFGVIKK
AGSWFSYGTEKLGQGRESVKKILREDPVLYQKIHMQVKELMTGHTEIISS
PTE
>Cag_1958 Cobyrinic acid a,c-diamide synthase CbiA
MEMVQLPRLVVSAAQKSAGKTTLTLGLLAYFAAKGIAIRSFKKGPDYIDP
AWHSLATGHPCYNLDPWLMGSEGCLAALLQRGYCEAARGSLVLIEGNHGL
HDGLSLDGSDSTAGLAALTNSPVLLVVDSRHVSRGIAATVLGLQQMAPHA
TIRGVVLNRVRTARQAAKQRAAIEHYCGIPVLGELPADSRMIIAERHLGL
TTVAETTGARAFIDTVADIVGQSCNMEAIHALFLEAPPLLVDNVVSLPEG
CTTLSKPRIVRIGVFRDAAFCFYYPDNLEALERAGGELVFINSLEATSLP
EIDGLYLGGGFPESFFEQLSSNNQLLAEVRQAIDGGLPTYAECGGLIYLC
QAAIYQGKKYQLAGVLPVTIGFNKKPAGHGYVELESKVDSAWYCKGERLK
AHEFHYGYPLEQDGGSLFQFHVLRGHGVTGNSDGFLYKNLFASFAHLHAS
ATTEWAPRFVALARANQQPQRC
>Cag_1751 Prevent-host-death protein
MNAVSLNELRNNPQQVMDMVCDRHEPVIVIRKNGEKTVMLSYSDFSSMQE
TLYLLSSPTMAERLRESLESYSNGMGIEEALLQLDLQSRALLAEKLLNSL
DTPSMTENEQLWAEEALRRYKEMKESKTQGKLASQVMQEALAELQ
>Cag_1894 photosystem P840 reaction center protein PscD
MQSTLSRPYTGNEQVRANVAGPWSGNAAHKAEKYFITSAKRDDYGKLQLT
ISPASGRRKLLPTKEMIGKVASGEIELYVLTTQPDIGINLQQKVLDNENR
YVIDFDNRGVKWTMRDIPVFYDSLRQQLCIEIDRRTYTLNEFFK
>Cag_0258 AICAR transformylase/IMP cyclohydrolase PurH
MSDPVIKRALVSVSDKTGIVDFCRELSELGVEIFSTGGTLKTLQDAGIAA
ASISTITGFPEIMDGRVKTLHPKIHGGLLAVRENANHVKQAADNGISFID
LVVVNLYPFEATVAKPNVSFEDAIENIDIGGPSMLRSAAKNNESVTVVTD
SADYALVLQEMRANNGATTRATRLHLALKVFELTSRYDRAIATYLAGKVS
AAEAAASTMSVQLAKELDMRYGENPHQNAGLYRLTDSNGTRSFEEFFEKL
HGKELSYNNMLDIAAATSLIEEFRGEEPTVVIIKHTNPCGVAQASSLVDA
WHRAFSTDTQAPFGGIVAFNRPLDMAAAQAVNEIFTEILIAPAFEDGVLE
LLMKKKDRRLVVQKKALPQSGWEFKSTPFGMLVQERDSKIVAKEDLKVVT
KRQPTEAEIADLMFAWKICKHIKSNTILYVKNRQTYGVGAGQMSRVDSSK
IARWKASEVNLDLHGSVVASDAFFPFADGLLAAAEAGVTAVIQPGGSIRD
NEVIEAADANNLAMVFTGMRHFKH
>Cag_1876 ATPase
MRAQNLANTMQFDPNKFTLKAQEALQAAATLASSNQHQQIEPLHLLLAMF
QDKQSIAVQIAQKLEASPDMLLAALERELERLPRVTGASASGQYISQPLG
KVFDVALKEAENLKDDYISSEHLLIALSEAGVAISPILRDAGFNRDAMLK
VLATIRGTQRVSSQNAEETYNSLKKYSRNLNDQARRGGLDPVIGRDDEIR
RVLQILSRRTKNNPVLIGEPGVGKTAIVEGIAQRIVAGDVPENLKSKHIA
ALDIAQLVAGAKFRGEFEERLKAVVKEVQSSEGEIILFIDELHLLVGAGS
AEGSMDAANILKPALARGELRCIGATTLDEYRKHIEKDAALERRFQTVVV
DQPNVQDTVSILRGLKEKYEIHHGVRIKDAALVAAAELSNRYIADRFLPD
KAIDLIDEASSRLRLEIDSSPEELDRLNREIRRLEVEREALKRELEPGS
>Cag_1548 S-adenosyl-L-homocysteine hydrolase
MSTVAEALAYKVADISLAEWGRKEIEIAEKEMPGLMALRKKYAGQKPLKG
ARIAGSLHMTIQTAVLIETLVDLGANVRWASCNIFSTQDHAAAAIAQAGV
PVFAWKGETLDEYWWCTRQILEFDGGLGPNLIVDDGGDATLMIILGHKIE
NDPKMLDKTPGNAEEKALYQQLREVFAEDNQRWHKVAADMKGVSEETTTG
VHRLYQMMEKGELLFPAINVNDSVTKSKFDNLYGCRESLADGIKRATDVM
IAGKVTVVLGYGDVGKGCAHSMRSYGARVIVTEIDPICALQAAMEGFEVA
TMEDAVHEGNIFVTTTGNKDVITLEHMKQMRDEAIVCNIGHFDNEIQVDK
LNNDPTVKIINIKPQVDKYIFPNGNCMYLLAEGRLVNLGCATGHPSFVMS
NSFTNQTLAQIELWQKEYAVDVYRLPKALDEEVARLHLEQIGVRLTRLTP
EQAEYIGVSADGPYKPEHYRY
>Cag_0259 3-oxoacyl-(acyl-carrier-protein) reductase, putative
MQKNISQKTCFMTGATGVLGSAIAEAIAKQGYSLFFTWNGSEAKALLLLE
RLQAISPHSAMVRCDVAQPSAIAEAFIEFRERYQRLDLLVASASNFFRTT
LPEVTEAEWDALVNTNLKGTFFTMQEAARMMQQQSFVSRIITMTDISANL
AWRGFAPYTASKAAIQHITRLFAKTFAPTILVNSIAPGTITLNPEHATEA
ALDAVTNVPLRRTGEPADIVRTVLFLLEQEYMTGQILAVDGGRLLA
>Cag_1413 nitroreductase family protein
MTFRELVTINRSYRRFDSRVLVSAEQLRDLVELACYVPSAKNLQPLRYLA
VSTPSMVDAIFPTLSWAGYLPEWQGPIQEERPTAYLVMVADTTISRDVAC
DMGIAAQTILLGATAMELGGCMVASLERQRLRQLLSIDEMYEILMVIALG
KPIEKVVIDQISNGDSVRYYCDAYAIHHVPKRMVESVLLRSL
>Cag_1680 Delta 1-pyrroline-5-carboxylate reductase
MLLRHKILSLNGTLSSFENKRPMQQKLRIGFIGTGRIAQALIGGMSNSPK
FTLFGYDKEATALATVAANYPLQPCSSIAEVGSSADVIILAVKPYQIADV
LAELKSTLQPSHLLISVAAGISTTFMAEWTSPETRIVRAMPNTPMLVGLG
MTALCCGTQATSEDMACAVELFSAAGRTVVLDEVQMDGATAVSGSGPAYM
FYLLEALAKGGEACGLVYADALLLASQTMLGAATMVMQSSKSPAELQHDV
TTPGGTTEAGLRVMDVKQVSDAMQQCVAAAAARSRELMK
>Cag_0317 Citrate synthase I
MNTIEPSHTLTVTDNRTGKSYTLPIEHGCIASMDLRRVKTTDDDFGLLCY
DPGFLNTASCKSSVTFIDGDKGILRYRGYPIEQLAEKSSFLETAYLLIKG
ELPDSDRLAKWTYNIRHHTMTHANIVKFMDGFRYDAHPMGILVGTVGALS
TFYRDAKNINSEESRKLQTRRLIGKIPTMAAMSFRHSFGFPYVLPDNELS
YTGNFLSMMFRMTEPRYKPNPVFERALDVLFILHADHEQNCSTNAVRAVG
SSGVDPYSAIAAGCAALYGPLHGGANEAVIHMLLAIGSKEKVPEFIASVK
RGEGRLMGFGHRVYKNYDPRAKIIKEIALQVFEETGRNPLLDIALELERI
ALEDDYFVSRKLYPNVDFYSGLIYQAMGFPMEMFPVLFAIGRVPGWLAQW
TEQVKDSEQKIARPRQIYLGEEVREYLPIERRPRNHVDEPLAGLCRL
>Cag_0618 Formate-dependent nitrite reductase membrane component-like
MIDGTRELISTKLNPQVLPHLHIWEWHIPLYLFLGGIAGGLLLITSVLIV
MKRHFGIGPRDEGEGCPCTTVRIGAILSPALLGVGMLFLFLDLAHKLYVW
AFYTTIQPTSPMSAGSWILIAFFPLAVLQAMVVNRKYLEKLNIELLNKII
TWTDDHLLKIALMNAHIGVGIGIYTGILLSFFSARPLWSSSILGMLFLVS
GISSAAALMLLVAPKHEKHIYSRIDANAMWIELVIVGLFILGGITGTENT
HGAMMHLIAGEHTMAGIPYMYWFWGVFVFAGLALPLLIEFLENAGFHLHI
PPIAPMLVLIGGLVLRFLVVFAGQTYHTFM
>Cag_0460 electron transfer flavoprotein beta-subunit (beta-ETF)
MNIVVALCQVPDVAPLVVDGALDLSRVSMVMNPYDEYALEEALRCRERFP
NCTVTAVTVAALPPQELLRKALALGVDRAVFVESEELRDSYSIASRVSEA
IRMLFSEQLPELCFFGKQSTDYQSGAVPAMVAHLLGLPFVSAITSLTPTA
EQVEVTRDIEGGSESFMVAYPALFSTEKGLNELRHTTVKMVMEGRKKKVE
HLVLPLQSVAPRVALQHVQPLERSTTCTMFTNEAALASLLAGYRAAL
>Cag_1789 serine hydroxymethyltransferase
MDTDILQKQDAEVFASIANETKRQTETLELIASENFTSRAVMQACGSVMT
NKYAEGYPGKRYYGGCEFVDVAENLARDRAKKLFGCEYVNVQPHSGSSAN
MAVLFSVLKPGDKIMGLDLSHGGHLTHGSSVNFSGQMFEAHSYGVDRETG
CIDMNKVEEMAMQVRPKLIIGGASAYSQGFDFKAFRAIADKVGALLMADI
AHPAGLIAAGLLPNPLQHCHFVTTTTHKTLRGPRGGMIMMGSDFENPLGI
TIKTKTGSRVKMMSEVMDAEVMPGIQGGPLMHIIAGKAVAFGEALQPAFK
EYAAQVMKNASTMASRFMELGYTIVSGGTKNHLMLLDLRNKNVTGKEAEN
LLHEAGITVNKNMVPFDDKSPFVTSGIRIGTPAMTTRGMKEAESRRIAEL
IDQVITSASKPDISAICEAVREEIKTICHNNPIEGYSV
>Cag_0669 sugar transferase
MMLAPVALFVYARPDHTRKTVEALQKNELAKETDLIIFSDAARIPDKESV
VNEVRAYLATISGFRSVTIHHRPYNFGLAKSIIEGVTQVLSEHERIIVLE
DDMVTSPYFFSYMNDALKLFANDDRVISIHGYVYPVKQQLPEAFFLRGAD
CWGWATWRRGWVLFNRDGQVLLDELKQCKLTREFDFNGSYPYTKMLEAQI
KGQNDSWAIRWYASAFLANKLTLYPGRSLVHNIGNDSSGTHCGNDTTHDV
DLSSMPINITNIDVLPSIEVRQVFESFFAKSKGSFLNKLHISFKKAFV
>Cag_0506 glycosyl transferase
MPTPHCHLSVVIPLYNEQESLPELLQQLEQALHHPSLQALFAEPLEYEII
MVDDGSTDGSASSIRRLATKHCNVRLISFQRNFGKTAALSAGFAASSGEL
VCTLDADLQDDPSAIAALITKLHEGYDLVSGWKQQRRDPLSKTIPSKFFN
AVTRLFTGLTIHDANCGLKLYRHDVVKRLELHGDMHRYIPVLAAWMGFAI
TELPVPHHPRKYGTTKYGFSRFIAGLLDFLSVLFITRYLRRPLHFFGTAG
LLSALCGFGISLYVTLDKVLLHKPVSNRPILFLGILLMILGVQLFSTGLL
GELLSTTNNRHSGFIIRETFNVTDEQVQALRQ
>Cag_1687 hypothetical protein
MNTPIEFMKPRLVGDRFSGHAIPFEMLKNLSVLEELVIEAAKWKYLKAHP
DRQRVPRGFTEGVSLQLTEVRDGSAIPIIVLTFMTTTPLFPEVGTHVTYF
EQGRDAIFATVSAAEEQVQNPNSLPPHLLGYFDQLGRALRDDEALELDPT
NQLKPARLTKETRRRILLQSDKIQELTEEVTLRGTVPEMDQEKSSFEFQV
IAGSRIKAPLEPQYFDVILNAFTAYRDNRKIVIRGIGRYDRNEKLLGLTL
VEHVSLLDELDPGARLNEFKSLKNGWLDGKGIAPTHKQLEWLVDAFERHY
PDELRLPYLYPTADGGVQAEWSLGGWEISLEINLDTQQGEWQALQVCDEH
EEFYVLNLNQPDAWQWLSAEIMKKSGVEA
>Cag_1841 Ribosomal protein L14
MIQKETNLVVADNSGAKKVRCIHVFGGTGRRYASLGDQVIVSVKTAVPGG
IVKKKDVCKAVVVRAVKESRRKDGSYIRFDENAVVILNAQGEPRGTRIFG
PVARELRDKRYMKIVSLAPEVL
>Cag_0099 sulfide dehydrogenase, flavoprotein subunit, putative
MSKKIVVLGAGTGGTIISNNLRRHLPHDWEITVIDRDDHHIYQPGLLFVP
FGLQKVSTLVRSRKKYILSGINFVIDEITRIEPDKRVVTTKKHSFPYDFL
VISTGCRVVPEDNDGLMEAWGKNAFSFYTIASAELLHRRLQEFQGGKLVL
NIAEVPFKCPVAPIEFVFLMDWMCRKKGIRNKTEIELVTPLTGAFTKPKA
SAVFNESAKAKNIKITPGFSLNEVHGKEGYIQSVQGDKVNFDTLVIIPST
QGDEVISSSGLDDGIGFVPTHHHTLQALKHERIYVVGDATNVPTSKAGSV
AHYEADVVAFNIMAEIHGIKPEEIYDGHSTCFIVYSKGTSSLIDFNYKIE
PLPGQFPMPKFGPFSLLKETKMNWYGKLGFEWLYWNVLLAGHNLGAPPTL
VMAGKELG
>Cag_0355 50S ribosomal protein L10
MKRDTKQQIVQEVAEKISQAQGIYLTEFQGLTVEKMSELRGEFRKAGVEY
RVVKNTLIRKALQDMAGADKLAPALKSTTAIAFGIDDPVAPAKVIKKFSK
ANDQLKFKMAAIDGAVYGADQLTLLSEMLSKTENIGRTAGLINNVIGSVP
MVVNAVMRNMVCALDQIAKQKQ
>Cag_1813 Light-independent protochlorophyllide reductase, iron-sulfur ATP-binding protein
MSLVIAVYGKGGIGKSTTSANISAALALKGAKVLQIGCDPKHDSTFPITG
KLQKTVIEALEEVDFHHEELSPEDVIESGFAGIDGLEAGGPPAGSGCGGY
VVGESVTLLQEMGLYDKYDVILFDVLGDVVCGGFSAPLNYADYAIIIATN
DFDSIFAANRLCMAIQQKSVRYKVKLAGIVANRVDYTTGGGTNMLDQFAE
KVGTRLLAKVPYHEMIRKSRFAGKTLFAMEEAQTEFPECLAPYNEIADAL
MQEHPIASVPVPIGDRELFKLVNGWQ
>Cag_0310 conserved hypothetical protein
MSMIDKKFCLDANVLIQAWQKYYSPKICPSYWDMLNELGANNIILMPEMV
YEEIVRTDDDLSKWLKSSTIPIRKIDEQVTKCLKDIYSADPNHKYLVDNT
KARSLADPWVIAHAIREKATVVTKEEKVTAINTTKIKIPNVCEKMNVSWI
NDFQLIIELGIMFKCEMESK
>Cag_1292 conserved hypothetical protein
MGVLTMIVMDSHVWFWWINLEHQRLSGIILDTIATANRIGVSPVSCFELA
LAHKKGRLNLPLPLDKWFRFALDGSDVELLPFDEKIATRAVKLSNIHHDP
FDRIIIATALQLNGKLASVDNRFSFYEEFSEILLS
>Cag_0505 radical activating enzyme, putative
MSQNALNALRISEIFYSIQGEAFFAGFPCAFIRLAGCGHGCNYCDTSYAE
EKGELMAQAEIIKQALSYHAPIIEITGGEPLLQPAVYPLMEELCNRGEQV
LLETGGFLSVEKVDKRVHKIIDLKAPSSGVAEKNNPANIRLALEAAPEEQ
RRFEFKMVIANREDYEWAKTLLEEHHIAAASTVTMGTVFGALSPTQLAEW
ILHDRLPVRLQLQLHKYLWEPSRHGV
>Cag_0716 ATPase
MNIDRRLFQLMKDEPKPFLFSLGNGALATVLLLAQAYFLSTLLALAFLPS
ASLHVLWQPMAMFALTSSLRLAMNWFSRQEAERGTLAIRSQLFRRLTAAI
GSLGPLYVRSVQSGRLSNTLLKGVEALEAYFSHYIPQIFFALFTPLLIAL
TIMVGDPLSGGILLLSAPLIPIFMMVIGKSAKAMTDKQWSAMSRMSGYFL
DVLQGLPTLKLFAQSHRQHEAIEQTGENFRHATMRVLKVAFLSSLTLELV
GTIGTAIIAISIGMRLMAGECNIQHALFVLLLVPDFYLPLRQLGTKFHTG
MEGATASKEIFAILDRNAEATPSATSASQAEMLQQLAMGDICLENISYRY
PESDKMALQSITLTIPVGKTTALIGPSGSGKSTLLNLLLRFQTPSEGSIK
LGSSSIHALPLDLWHQHIAWVPQHPFLFNTTLRDNIMMARPSASLQELEQ
VLQQSGLRNVVESLPQGLDTMLGEEGARLSGGEAQRIALARAFLKNAPIV
LLDEPTSHTDPQLEAALRHSMQQLMEGRTTVLIAHRLETIQTAHQIVVLS
NGVLAACGTHEELLQSNSFYQTSLNAQAEVAA
>Cag_0486 TPR repeat
MLDNNSTHIQPAGGFAAISKYNPHLWSAEQLRAIFVARTNELADLVQTLR
MVQPDTVAQHVLLVGARGMGKSTLMRRLALAVEDDPSLSANWLPLRFPEE
QYTVATLGQFWANVLDSFADTLQHLGESVIALDAAAERIAALPVTQQPEA
YIDAINHFADERKQRLLLLVDNTDMLLHNIGKDAHWGLRATLQSNPRLFW
IGGSYQSLEAESNYHDAFLDFFRVINLRPLKVEEMRQALLALAETFGGAT
ARNAMVHQLDLQPERLPTLRQLSGGNPRTTVMLYEILANGQNGNVRSDLE
ALLDNMTPLYKARMDSLADLQRKLLAHILEHWAPISFGELAAVSQVAKGT
ISPQLQRLEIEGLIEKTSLHGTTRSGYQAAERFFNIWYLMRFSPRRQRNR
LVWLVEFMRLWFSGDELCSLAKQRMSVGSNDLRSTHDLEYDRALADALPQ
FAPERHALRWSLLKLLQENNSQLVELFDFDGEDKEFKGATDYLRRLAALP
SLLRQCPHAVTEQEKTHWVETVLGSISLTLEEKEIVAQKAEYLTLFQYDE
LLKVFSEEQKRWEKQFGVAALQTVRSAVLSQDFFSDMPDSHLAYEQVRVC
FANNKEALRFVSLLFYSKHKDEWSYKAQKLALNLLHDDSKSDWFLREKLV
RYEETEKAYCKAIEFNKEDAVTWNHVGNLLKDYLGRYEEAEAAYRQAIAI
DKKFAYPWNNLGQLLHYNLNRYEESEAAYRQAIALDEKYAYPWFNLGQLL
HYKLERYEESEAAYRQAIAIDENNAYPWNNLGQLLHEWLGRYEEAETAYR
QAIALDEKYVYPVTNLARLLAQRNRKAEAETYYREAVLKDTQDTQQLFLQ
AHLFLGNRQLAMDALQALAEKAQNGNQYAFYRLKEQVWECYELGLGERLA
DWMAESNVAEFLTPFIQALYTLAGVNEKLRDLPMESQHMVDEIVRKARLR
QEKREACNMRAKSIH
>Cag_1704 Peptidase S49, protease IV
MNNSSIPQKRRGCFRPGCLWFLVVPLFIVVALFWAFRSSHDMPDRFVLVV
PLSGKLAEVNNERSSLPFMPSQGDLSLQEVLFVLHEAAKDEQVSEVLLQL
DGVEAAPAKIAEVRAAVADVRRKGKKVSAFLYRAEDSDYLLATAADTIIM
QRGASLLLDGLKAESLFYTGTLNKLGITVQAAQWKEYKSGIEPFTRTSAS
KEYREQINMLLDDVYNNYLSAVSERRKISRSAFEAIINNEALLSAERAKA
LGLVDRIATFWDVERSMTKQLTGEELSSENNALVHAADYRNAMDYPQHSS
TSDAIAVITMSGPIMRSVDNLDDGIDVATMQHSLEAALENKSVKAIVLRI
DSPGGEAIASADILQMINAAATKKTLVVSMSGVAASGGYMVALGGKTIVA
HPLTITGSIGVYALKPTIQGLAEKVGLQREVITRGRFADATSPFTPLEGE
AYNKFVASAGDVYNDFISKVATSRRMKVTAVDSVAGGRVWTGSRAKQVGL
VDRMGGLFDALALAKERAGISKDKEPTILLYPLQQGWLQSLLGGATLNSV
TKAIATALLGNVLPINVEQQPLSAMQPFYDMLIRSGKPHMVALMPAEVVV
K
>Cag_0681 conserved hypothetical protein
MSGLKYILDTNIIIGLLKANPTAIALAESVRLDLGECAISQITRMELLGC
KGISEAEESSIHQFLACCVVLMIDETVECEAIRFRKHSSLKLPDAIIAAT
AQVHNLNLLSLDERLVSQYERAVKDNR
>Cag_0603 RecJ exonuclease
MKRYRWKCFMPHEETVAALSESINVSQPIARALCNRGISTYNEAKEFFRP
VLSTLHSPWLFNDMERAVERLVRALKNGETILLYGDYDVDGTTGVALLLL
FLRHHGVEPLWHINDRFAEGYGLSPEGIDRVIASGTTLLITVDCGIKDHA
AIRRCGEHGVEVIVCDHHEADVTPEAYAILNPKVVGSGYPFRELCGCAVA
FKLVQALAERLGDSEAVWHQFLDLVAVATAADLVSLTGENRTLVIEGLQQ
MRSKPRKNFSEMFRVMKVSLGDVRMFHLAFGIAPRINAAGRMHSAHLALE
WLLASAPDAVEQHTEALERVNVQRRSLDSTIMSQADKMVESHCASYCSSI
VLYDEAWHLGVLGIVASKLIDKYYLPTVVLGGMNGLVRGSVRSIEGLNIH
AVLQHCSHHLEQFGGHHQAAGLTLKPENLAVFRKAFDEQCANQLTIEQRQ
KVMEIDAVVELEQITDKFIAVLEQFAPYGIGNREPLFMSERLQLAEPARL
LKERHVKFAVRDKQKRRFEVIGFNRPDIYNDLRAVKHPTITMLYTIERRQ
WNGMWQVQLLLKDLEVQR
>Cag_0939 CDP-diacylglycerol--serine O-phosphatidyltransferase
MVDKNQQPSFKQYPPFLKGGDGERSRRFPMVSRSFVPSVFTVMNMVSGYI
SIVMSGEGSFVAACWFIILAAFFDTIDGFVARITKGTSDFGVELDSLSDL
VSFGAAPAYLVYSYGLEGLPNMAGILISSLLMIGSGLRLARFNINVIGYE
KSSFSGLPTPAQALTITGFVLWMSATPLFAPELMRQVLAGLSITLAILMV
SKVNYDALPKPTADSFRQHPVQMSLYILALFAVLLFHAKAFFLAMLLYIL
LGVVRSIALSWRRVWQA
>Cag_0327 Ribosomal protein L31
MKQDIHPKYNEVTVVCANCGNSFVTRSTRPSIKVDICNNCHPFYTGKQSI
VDTAGRVERFNKRFAKKAQA
>Cag_1903 Acetolactate synthase, small subunit
MKHIISVLVENKFGSLNRVASMFSARGFNLESISIGETEDAEISRMTIVT
RGDDQIISQVVKQLNRLVETLKVVDLTKQQHVERELLLMTLKLNKTVQHE
IFELISVFKGKVVDIKQKSITIEVIGSPDKINTTIDIFRPYGIKELARSG
AVAIQRGDV
>Cag_0657 putative transcriptional regulator
MFCDDLKGLIASGEDSFTQFKENIFDSKKLAEEFVAFSNAEGGVILLGVN
DKRELVGLDNADIHRLNQLISNTANENVKPPIYPLVEHEIIDGKQLLIVR
VRKGYAKPYATSSGVYLTKSGSDKRKMSREELRRLFAESGGLSADETVIH
GSDIRDINTEILNDFLIKRDREIFEALRAQRVQLVTICENLDLVAHGQLT
LAGNLLFGREPQRFSKSFYVQCVHFDGNDVGCNSFISKDIIYGTLQSMYK
QTLNFLKSSLRRVQKGKNFNTLGEFEIPEICLTEALINALIHRDYFISSS
IKVFIFEDRVEIISPGKLPNSLSVEKVRLGISIHRNPILNSLGQYLLPYS
GLGSGIRRIEAHYPSVKFINDTDKEEFWCVFSRSI
>Cag_1711 Phenylalanyl-tRNA synthetase, alpha subunit
MEERILQAKQEILEAPLTTLAELESFRLHFTVRKGLVAALFGELKNVASA
DKPRIGQLLNELKQTAESRVSDAEAHLAAAQNSKQQPSLLDLTLPGRRSF
CGSEHPVQKVLGDMKSIFNAMGFTIATGPELEVGNYNFDMLNFAPDHPAR
DMQDTFFVRTGSGNTPDVLLRTHTSPVQIRVMLDEKPPIRVICPGKVYRN
EAISARSYCVFHQLEGLYIDKNVSFADLKATIYSFAQQMFGSDVKLRFRP
SFFPFTEPSAEVDVTCYLCGGKGCRVCKKSGWLEILGCGMVHPNVLRNCG
IDPEEWSGYAFGMGVDRTVLLRYKIDDIRLLFENDVRMLQQFKA
>Cag_0302 conserved hypothetical protein
MKILERYIFQQFIKAFLFTALVFVSLFIIINMIEKLGNFMDHHVSALEIA
RYYLLSIPSIFLVTSPVSALLASILVAGKLATQNELPAIRSAGVSMRQLL
TPFAWGALLLFLFNFFNAGWLAPTTYSHNRTFEQLYLGKNAGDQETRNLH
LLDSGNRFISIGAFNPINESLNNVSIERLSGATMISRIDADSMHYNRRTK
RWTMWRVTERYFSNGYQSFTTKPTATIRLALRPKALHEMRLQPDEITLPR
HYQFLREKEEAGFSGLERSAVKFHNKIAMPFASLIITLIGVPLAARKKRG
GIAAEIAITLFIGFLYLGIQRTIAIAGYQGVLPPIVAAWLPNLLFLVVGW
VLYKKSTDS
>Cag_0666 hypothetical protein
MNKKIMNLPHKVNKKIDLIEFNKIKAIFRREFKRGTFKLLDVGSGLCSFP
NYIKNEFENAKIYCIDINKDLVDLATKSGYNAQEGDLTKLNYNDNTFDVV
HCSHVVEHLPYPHVIEAIDELVRVCKSNGLIIIRSPLWANHRFYNDIDHI
RPYPPNSILNYFANQQQQKVSKHRIVEEDRWYTKIYYEINPLRFNYKIIK
YINFFLKISWLFLSFPIDRPNNYGIVLRKRSGL
>Cag_0437 conserved hypothetical protein
MTILDRYILKKQIAPFFFAFITIVALLQLQFFSTFAERFIGKGITFVAIV
ELLALQSAWMVSFALPMAVLVAVVMSFGTLTTTSEMTVCRASGISLYRVM
VPVIVVSLLLSFTVERFNNVLLPQANYQAKSLMAEIARSKPAFGLTEQAF
STLVDGYSMYVRSSDERHGELRGVVIHDMTRPEYRTTITATRGRVEFTPD
YQYLVMTLRNGAIHQLQQPEKSGYRSMNFERYRFVFESSLSGFTPSSGNR
MRADANELSAGELHAIGLEFRRREAVALLHVQAPLVALERLAANTDNSKM
AASPPTLRQETSAIAATKALAVIEGEIARVASELEVASTNRTLYNRYMAA
YHKKYSLSLACVVFVLVGAPLGVLARRGGFGVGAAISLLFFVLYWMLMIS
GEKMAERGVLDPMIAMWMADGVMALIGVGLVTKLTQALFSTSR
>Cag_1943 Siroheme synthase-like
MKVFLPLNVRIDHKKILFVGGGKVALHKIKSLIHYTYNITIISPYILPEL
HDMGFTEIYKEYEAADLEGFFLVYAATNNLVVNQQIRNDATLAGVLVNVV
DNRELSDFISPAIIKDGPMTIAVSSNGEDVKRTVALRNAIKAMVESPDWS
LEK
>Cag_1142 ATPase, E1-E2 type
MIRKDENKCWHALDSQQVLSIFESSLQGLELEEVKTRLVSYGRNELKRKN
KDGALKVLWRQINNPLIWVLIGSSTLATALGKITDGMVVFAVVVINSIIG
FIQEFKAGKAIEALSSMVPENATVIRDGNVLTIPVAELVPGDIVQVAAGD
RIPADMRIIQQKNLQVEEAALTGESVPSQKTTEAVNREAVIGDRKSMVFS
GTLVVSGTATAVVVDTGMQTELGNISDMLNETIDLDTPLTQKLEVIGRYL
TIGIIAITFVIMIIGTYRALGQGVLLFDALKESLIFAIALAVGAIPEGLP
AVVTIALAIGVQHMAKRKAIIRQLPAVETLGSTTVICSDKTGTLTRNEMT
VNVLWNCNYTIQVSGIGYKKEGDFRQNGTEISSLPEEMQTLLKNAVLCSD
ANVLYDENNYRISGDPTEVALVVVAAKAGIALDGLRQAIPRKDVIPFDSE
KQYMATLNDTTIIIKGAPEVILKQCSEHIGGISFDTKPIIAQIELLGSKG
MRVLALAQKEKYTKSELSLHDVASGFTFIGLIGMIDPPRAEAIEAIKACH
NAGITVKMITGDHHATARAIGMELGLSKNGNVVTGVELSNMSDNDLDSNI
KHTNIFARVAPEHKLRLVKALQKNNEIVAMTGDGVNDAPSLKQSNVGVAM
GITGTSVSKESADIVLADDNFSSIAAAVEEGRRVYDNLLKSLAFLLPTNL
GLAFILVYAIMFFPFNPLTKELVLPLLPTQLLWINLVAAIALALPLAFEV
KEPNVMNRPPRKPDEALFNGFVTFRVFFVSILMTIGTIALFSWEYANSLA
NGMLQNEALAKSQTIAVTFIIFFQIFYLINCRSLQDSVLKIGIFSNKFIF
WGIAVILLLQTLFIYTPFMQQVFGTTALDGRGLLISLAAGSLISLIISVE
KWFAKNMLKEKK
>Cag_1521 methylmalonyl-CoA decarboxylase, alpha subunit
MERLLAEREKAHLGGGQKRIDSQHKKGKLTARERLELLLDEGSFEEFDMF
VTHRAVDFDLEKEHYLGDGVITGHGTIDGRTVYVFAQDFTVFGGSLSETN
ALKICKVMDQAMKVGAPLIGINDSGGARIQEGVHSLAGYASIFERNILAS
GVIPQISVIFGPCAGGAVYSPALTDFVLMADKTSYMFVTGPKVVKTVMCE
NVTDEELGGAMAHATKSGVTHFVAQNEHEGIEQIKKLLSYLPQSNREAPP
VAPCSDRIDRADEKLNSIIPEKPSHPYDVKEVIHTITDHGEFLEVHEHYA
PNIVVGFATFNGISTGIVANQPCCLAGCLDNDASRKAARFVRFCDAFNIP
LVTLVDVPGFLPGTAQEFNGIISNGAKLLFAYGEATVPKITVILRKAYGG
AYCVMSSKHLRGDINYAWPTAEIAVMGPLGAIEVLCNKTLCTLKDPEERA
HFIADNAAEYREKFANPYEAAKYGYIDDIIEPRNTRPRIIRALQMLRTKK
EENPYKKHATMPL
>Cag_1130 conserved hypothetical protein
MRVFFLKYAQQELDDTAHCYEMELKGLGKIFKDEVKKAISRIIKYPEAWT
IERTTIRKCTLHKFPYVILYSIEKNHIVIIAISHQHRKPYYWIDRKPT
>Cag_1464 conserved hypothetical protein
MQDRVERVTRYIDNVMANTEGTRAEGVYVVSVALKGRAGQQKLEVLVDSD
KGIAVEQCAWVSRRILEKLEEDDEPLSEEIAIEVSSPGLGTPLQLPRQYY
RHLGKLLHVRYRTPEGAEAEIEGYLQEAQLDGVNGGDSADSVIVLKPKVQ
GKRPRNAQPLENIRLPLSRIIKAVPEAEL
>Cag_1884 hypothetical protein
MAQQQRKLTIMVSSTVYGIEELLDRIYTLLTAYGYEVWMSHKGTLPVHSG
LTAFDNCLRAVDESDLFLGIITTSYGSGQNPADNKSRSITHQEILKAIEL
NKPRWLLAHDHVVFARTLLTNLGFKGKSGRQSLKLQKNTIFGDLRILDLY
EEATIDHESPDDVPLAERRGNWVQKFRTHEEGSLFVNAQFGRYQEVEAFL
QENFERGFSLLKKGGNA
>Cag_1035 hypothetical protein
MKKTIWVAAGVAGMLLGNPAVNTQAETRESIQTTSDHSFVIDARPSFIYL
PDQGFAVSVDSPYDIISGDDHYFLNQKGSWYRSSSYRGPWELTKEKHLPS
NVRKHRLEDIRKYRDAEYNKIINQRTPEQQRSNDNNRPAQEQQRSNNNNW
PAQEQQKSNNNNWPAQEQQRTDDNRSR
>Cag_0755 conserved hypothetical protein
MKKLPVGIQTFSKVVEDDYLYIDKTDIARSIIEKYQYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKDLLIYKQWNWDVTYPVIKISFSGGIRDKESLQK
NLFYILNDNQERLTITCKEKSDPNQCFAELIKKTFQKYQKSVVILIDEYD
KPILDNIENIAEALIIRDGMRDFYTRIKESDQYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPDFGNICGYTQKDVDTSFAPYLKGVDMEEVKRWYNGY
NFLGDKVYNPFDILLFIKNKCVFDNYWFETGTPKFLVDLIKKKNYFIPDM
LTLRVNKSVVNSFDIENINLETILFQTGYLTIKQVLPLGMGVGYELGFPN
KEVQISFNDYILQIMTIVADKEPIRYELFDIINNGNVASLEPIIRRLFAS
IAYNNFTNNYIESYEGFYASILYAYFASLGFDIIAEDVSNKGRIDLTLKN
QDKIYLFEFKVSNQEPLEQIKKMKYYEKYNGERYLIGIVFDPKERNVSQF
AWEKI
>Cag_1745 conserved hypothetical protein
MEPKRSLKRGAGLLMSLSALSILSLVFMAIWVNWEKPRQAAEPLTPEVKN
LVDRIPSTTDALIYIGMKDIRQSRLWQEVIPDSLKQAPLFQPTGELATLL
ERSTINPSKDIDTLLISFKRHGYKEQLFLAIASGNLQTKLPKAMQAGNHE
TLGGHSCYSFGSSLWFSQLNSRRVVLSNSKELLGNFLQPQGSFLQRDSLT
TTLIDKARYKSHLWFALPSAAWTSGALQSLTSSNKDVKSIGNLNRIKHLT
LSVNFKDGIEAESEWLYESNQAAYFASTFLWGAIQLPRLSEKNEQTRALL
DNIAIQQNLNSVIIHTALPLQIFQTAKEQPAP
>Cag_1339 putative sugar transport protein
MARNVVNAEQTEEVSSTRRVIAASSVGTLIEWYDFYIFGSLAKIISEQFF
PKDNPTAALLATLATFAAGFVVRPFGALFFGRLGDLIGRKYTFLVTLVIM
GGSTFAIGLVPGYATIGFAAPAIVFVLRLLQGLALGGEYGGAATYVAEHS
PNGKRGFWTSFIQTTATFGLFLSLGVILIVRQTLGVETFQDWGWRVPFIL
SAFLVGVSIYIRMKMSESPMFAKMKKEGKTSANPLAESFKQKDNLKMVLL
ALLGATAGQGVVWYTGQFYALSFLQNACNIEFEQSNLIILIALVIGTPFF
VIFGALSDKIGRKYIMMAGMFIAVLAYRPIYTMMYNDANLKNKIEIVDQT
TVETKEEVKGTDNVITTVTKKTFEDGTTYKEIKKETIPLDAAKKAELAAA
DKLKPETKKEVVLPQHLYYKMIGLVLIQVIFVTMVYGPIAAFLVEIFPTR
IRYTSMSLPYHIGNGVFGGLVPLISTRLVEATRPAAGLPPADPLAGLWYP
IIIAGVSFVIGMLYISNNTNNMDVE
>Cag_1175 phytoene desaturase
MTSDKPRTQPSLHQPTTDMLTHCRSLDAAYNYCRHIAQEHAKTFYLASLF
LPKEQQRPIYAIYALLRTVDDVVDMAEEKLAAGLITRADISTMLDEWKHK
LEACYAGTVARDPIMMAWHDTLQNYTIPIELPLDLMDGVAMDIEFKPFET
FEELYVYCYKVASVVGLITAEVFGYSNKEALDHAIELGIAMQLTNILRDV
GEDAANRQRIYLPLEDLRRFNYSPEELMQKKMNDNFIALMKFEIERARHY
YTTSDKGIPMLQSRSRFGVALSSINYANILTAIEENHYDVFSKRAYRSFW
QKISTLPTVWQKANGA
>Cag_1051 hypothetical protein
MRKIILSKRASKRLEKLLEYLEFEWSFKVKNDFIKQLDKSLKRIQKYPES
CEQTRFVKGLHMLVVTKQTSLFYQFDSETITIVTLFDNRMNPDTLKKETA
>Cag_0881 conserved hypothetical protein
MNRIKAYYDEAYPPVPSKRVLYWRKNLPWQIIRFFVLNFKIMRIVVGGHS
>Cag_0487 Methionyl-tRNA formyltransferase
MRIIVMGTPDFAVPSLQAIAAMGNGFDIVLVVTGQDKPRKSKHAAAEASP
IKQAALALNLPVHEVDDVNDPHFAEIVAAYKPDVIVVAAFRILPPAVYSQ
ARLGAFNLHASLLPAYRGAAPVNWAIMNGEEETGVTTFFLQQRVDTGTII
MQQKTAIAPEENATELIVRLANIGADVVVETLRRIAAKNAATAPQDDSLA
SRAPKLTRINTRINWEQPVATLHNFIRGLALKPSAWTTFNDKTLKIFRTT
LSTPEHDHPSAPAGSLLVSHGKLYVHGSDGWLELLALQLEGRKPMEAADF
ARGLHVEGEALRLV
>Cag_0789 periplasmic sensor hybrid histidine kinase
MKSRVPKGLPVPQVSTNQPLPVVPVAVLPGIYFVQSFAGKVLMGNERFSQ
CVAHSQHDKLLGVTAYNLIHTRDHPILQDALEKVTLYEQSVSIEVRLNFL
TSENAFWYLFTVSPLTLNGVACFQFVGIDITERLYSNRLKHFIESISTVN
KTISDEVLIQQLLDVAELLTGSSCSYCYQCETLEKNQLILSASTQQFIQE
HEISLLSWQTNNEGILEYCPSATVADQLSDVLLLNTTFHRRLLFPINDSA
PTTAFVCVGNKLCSYTEDDRNYLDQLVECLWSFFAQRNKEQTQTHVKALL
ERTVQQTTLCRITDMVGEEIAHLLISLRSLIQNSSQEQGLEASLHASAAQ
AIDAVKGLTTFQQDLFSLIHINAEHLMTVDAYDTLDALAGIVQNELDCII
EVESDFKHFSPLIQIDPLHLKRVITILCENALEAMLFSKTITIRLSPDVT
FEQESLQPLQPMLSIAVHDTGVGIAPHIIPHMFEPFFTTHGEEGKRGIGL
TIAKRLMHLHNGDIVCSSSEEGETIITLTLPMLYEKELLEHSTKMDSLVL
MVISDVSVLAMLSLLLETHGYGLLIANSVEDALEFAETYSNSINMLVTET
HLPECSGSELAALLLANQPFMKVLYLAASTPTIQTDKNNGMIMYPPFSGY
DVLDMLEALSS
>Cag_1429 Cobyrinic acid a,c-diamide synthase CbiA
MPNSSFPAFLLAAPSSGSGKTTITLALLRLLAQRGMVVQPFKCGPDYLDT
RLHSLAASYGEHERMGINLDSFMSSPNHVQELFARYSAHADAAVVEGVMG
LFDGAEKWQGSSAEIAMLLNIPVIMVLNAHSMAYSAAPILHGFKNFNPAL
KLAGVIFNQVNSASHYRFLEEAAHDVGVEPLGYVPKSEHIAISERHLGLN
TSADYNREATITAMAIHVAQTVHVERLLEITAIARPPQLSMPSAKHVGKG
AWRIAVARDEAFNFLYHHNMEVLSHYGELSFFSPLHDERLPDADALYLAG
GYPELYAKQLAANSAMRQALKAFSHNGGRIYGECGGMMYLGNSMTLADGT
QHAMCGALDLETTMQEARLTLGYRQVEIAGCPFTLRGHEFHYSRISQQGE
LRTIAVVKNARNEQVATPLFQQGNTIASYLHLYWGEMHDAPRWLFSM
>Cag_1153 conserved hypothetical protein
MLYFAHPEEVLSHWVSRVCGGSPSDIAALYHPNAVLIPTFSPHTVTTPEG
ILNYFSQLATRQGMGVRLHNKALRTQPLSETLHTISGIYSFEFEVDTVLL
SFPSRFTFVVDCALKTPIIHHHSSQVPRNLS
>Cag_0420 Dihydropteroate synthase
MVMGILNTTPDSFYDGGNFVQSGKAIDVECALERALCMVREGAAIIDVGG
ESSRPGAVKISAEEEIKRTVPLIAKLRQASDVLISIDTYKAEVAKAALQA
GANIVNDISGFTFDAALPEVCRRYNAGVVLMQTPVTPQEMQWSLLTPNSN
GNVVAQVRQSLEKSLAVAKRHGITNIMLDPGFGFGKSVEENYRLLAHLSL
LQNLGYPLLVGLSRKSFLGRLKVNGEMLIFPAKERFVATIAAQALALQQG
ARIIRAHDVAEAVQCLQIVEATQAAGAVDYPHDSGFPHSRE
>Cag_0933 DNA polymerase III, gamma/tau subunit
MGAWQHKLANFATQPHFKPPTAPVPTQADASHASTVTTPIVAMPSTITLE
ALKVEWQQFLEHLTHHGHTVLATHLQSCELASCSATGLVELACCRKFSCE
EVQQERDMLQQEMVRFYQQPLQLRIRYDAAKDACTKEKSRFTLFQELSQQ
NEVIRFIVQEFGGELMY
>Cag_1551 RNA-binding region RNP-1 (RNA recognition motif)
MNIYIGNLAYSVTENDLRDAFGQFGQVESASIITDKFSGRSKGFGFVDMP
NDSEAREAIGAMNEKELNGRPIKVNEAKPREERPARRDRY
>Cag_1119 Fmu, rRNA SAM-dependent methyltransferase
MSTAREVALQVLQALEKSSERSDALLHKHLEASGLERVERALATELVNGV
LRWQLQLDSRISLAYHHKLEAAAPVLRNILRLGAYQLLFLTKIPRWAAVN
ECVKLARKYKGERMAKLVNGVLRHLDGGNEAFEKLLQGRSQAEQLALQFS
HPAWLIERWLATYGELKTRQLLHYNNQAPMMGFRINRLKADAKDFFDNPT
FAAAMEPCELPYCFLSREFSLFETALQEGVLSVQNPTQALAPLLLNPAPQ
SVVIDLCAAPGGKAMFMAELMHNQGEVIAVDQYEQKLRKLESHAQALGLT
IIRTVAGDARSVAPNLQADAVLLDAPCSGTGVIGRRAELRWKVTPAMVAE
LQGLQAELLDHAATLLKPQGRLVYATCSIEPEENGEQVAAFLQRHPNFVA
EPHAQHMTLPGDRYGYDGGFCQCLRKLQE
>Cag_1959 redox protein regulator of disulfide bond formation-like
MRQNGSLQLRRSLISSLTFYLRSVVMSDVTPKVELNCEGLNCPLPILKTK
KAMDSLSVGDVLQVSTTDPGSVNDMDSWAKRTGNELVSHTEAAGVHTFLL
RKQ
>Cag_1660 3-oxoacyl-(acyl-carrier-protein) synthase II
MAQEQKRVVITGIGVLTPIGLTKETFWEALKAGKSGAAPITYFDTTNFAT
TFACELKGFSAENYLDKKSADRMDPYCQYAVVAATHALADSGLDLAAIDP
IRIGVVHGSGIGGMTTYDTQFRQFLDRGPRRISPFFIPMLIPDIAAGQIS
MKHGLMGPNYATASACATSLHAILDAVLLIRAGMADYMVCGGSEAPITSM
SVGGFNSAKALSTANDRPQQASRPYDRDRDGFVMGEGAGSLVLETLESAK
ARGAKIYGEIVGLGATADAYHLTAPHPEGKGAVNAMKIALSMAGIALEQI
DYINTHGTATPLGDIAEITAIKSLFGEHAYKLSISSTKSMTGHLLGAAGV
VESIACLLAMQNQTVPPTINLDNLDEQIDLDVTPNQPKERKIEYALNNGF
GFGGHNGTIIFRNGSAL
>Cag_1639 oxidoreductase, short-chain dehydrogenase/reductase family
MLTLHQKVAIITGSTRGIGKAIAQEFVRQGARVVITSSSPQNVEAACKEF
PAGSVHGIACNVTSPADMERLVRESVAHFGQLDCFINNAGISDPFTNITE
SDPEAWGRVIDTNLKGTYNGCRAALIYFLTNNKQGKIINMAGSGTDKGSN
TPWISAYGSTKAAIARFTYAIAAEYRHTNISIMLLHPGLVRTGMVSTEHP
TPELERQLRTFNTILDIFAQPPTVAASLAVKMASPWSDGKNGIYLSALSS
LRKKKLLISYPFRKLFKKIDRQTY
>Cag_0928 S-adenosylmethionine:tRNA ribosyltransferase-isomerase
MRLSNFRYNLPRHKIANELCEPRDNCKLMVLNRRKKEIEHKQFEDIASYL
KKGDLLVLNNSKVFPAKIFGQKEKTDAKIEVFLLRVLNKEAGLWDVLVDP
ARKVRVGNKIYFEEDVVAEVVDNTTSRGRTIRFLNPDINVFELVDRIGTI
PLPPYFNRAPRPSDKEHYQTVFASEIGAVVAPMAGLHFTMPQLQAIQKMG
VKILPITLHPSLSTFSAIEVEDVSKHKMDSEYFHIPYQTAMEINETKLSK
SGRVIAVGTTTCRVLEANATVDGKIKFGKGWTDKFIYPPYQFKVTDCLLT
TFQLPETTLMMAVSAFADHRLLMDAYKIALKEDYQFLAYGDMMFIH
>Cag_0598 conserved hypothetical protein
MEENKSQIIIYQTENGETKLDVRFQDETVWLTQKLMAELFQTTSQNITIH
LKNIFEERELEEDATCKDFLQVQKEGNRKVKRNQKFYNLDAIISVGYRIK
SHVATKFRQWATQHIKEYIVKGFVLDDERLKNPDLPFDYFEELERRIQDI
RTSEKRFYRKITDIYATSVDYDPTFDISIDFFKTVQNKLHWAITGQTAAE
IISLRADSTKENMGLTNWRGDKIRKADVLIAKNYLNEEELTSLNNLVEQY
LIFAQGQAMRRIPMHMKDWIEKLNGFLTLNDRDILNNAGSISHTLAKENA
EREYEKFKDSEQKMITEDDFEKTIKNIEQKKKK
>Cag_0010 Peptide chain release factor 1
MLDKLQAIKEKHLDLEQALSDPSLANDQERFRKLNREYSNLREVVQAFDA
YQRKATELEAAQHLLATESDTEMKALAEAEVDELQETVAELEQGLKILLL
PKEEADSRNVIIEIRAGTGGDEASLFAADLTRMYQRYAEQKGWQYKVLDY
NESSVPGGFKEMILEISGNNVFGTMKYESGVHRVQRVPDTEAQGRIHTSA
VSVAVLPEAEEVDVEIRRDDLRFDTYRSGGKGGQNVNKVETAVRITHMPT
GIVAACQEERSQLQNKERAMQMLRTKLYDIQLAEQQQQRADLRRTMVATG
DRSAKIRTYNYPQSRVTDHRIGFTSHALPQIMQGELHEVIEALIMHDQAE
RLKSELQ
>Cag_1949 conserved hypothetical protein
MERLLVLYGGKIAANSFVTDSLSKALADVLHYGGKECNELSEALKDVTCI
LAVQAMPPSQTLALFFELPAVLRDYAVKGAPKQLLKESDMDALQKRLEEL
TLKAFDSYMMHRETISQLKVDEMKRQLFMQLRRAEA
>Cag_1275 conserved hypothetical protein
MLDVQVSHNSVGTLAHHTSEQGSYTFAYHKAIDIGQEVSLTMPWSLASYH
YRKGLHPIFQMNLPEGRLRYTLERAFRKQAQGFDDLMLLDIIGHSQIGRL
HCTSNPQLPKSVPLQSINELLAYNGTEDLLRDLLERFSATSGISGIQPKV
LICDPNQAALGAKFPTHHSPQLTNAQARITVKGATHIVKGWDENEYPHLA
LNEWFCMKAAKQAGLEVPRIFLSENYQLLILERFDLLEDGTYLGFEDFCA
LHGLSTFEKYDGSYERVAKRITQFVSQEHRQKAFEEYFKIVALSCAVRNG
DGHLKNFGVLYSNTTSDVWLSPAYDIVSTTPYIPRDSLALMLDGSKRFPS
RKKLLNFARQHCNLQHEQATEMMEKIGDAVNETMAEIKVQIKEYSPFASI
GNRMLSTWNEGIIDLNGKSTISFST
>Cag_0987 Secretion protein HlyD
MSKKKNSINRKTLWLLLAALVIGSGSLIFWLRSREKPIEITTEKAFEKEV
VHLITATGTIQPELVVAMSPDVSGEIIELPIKDGEEVKQGALLFKIQPDI
YVNQVQQSQAQLNAALSQSAEANARKLKAEDDFRKANLLYKEKLISQTDY
LASKTNAEAATAAYKASLFAVDQQKSLLVQNRDRMNKTTVRAPINGTIIA
LNSKAGERVVGTGQFPGTEVLRLANLDSMQVEVEVNENDIVKVTQGNPVT
ITVDAFGDRKFSGVVREISNSAIAKAAGTQEEVTNFAVKIRILNHNRLLK
PGMSATADIEAERISKALVVPIQSVTIRGASKPQHDEGEQKGVSVAPSGK
ASEGQQGVFVVSKNKAWFRPVTTGTTDNTHIIVTSGIRKGEEVVSGSYSA
ITNQLKNGSLIKRLTPETP
>Cag_0699 DNA mismatch repair protein MutS-like
MNPSTLKKLEFTKIAAYAAQLCLSPMGRDRLLNARPLREREALMAELERV
LELRMLLQEGLTLPFSHLPDTRVLLKKLEIEHLALEPLELLDLYHLLYSS
VQLRRFMYGNRERYGRLNDLTIMLWMERSLQAMIQRCVDERGLVRDSASD
GLLLIRHDLAESRELLRRRMERLLRRASANGWLMEETVAVKNGRLTLALK
VEYKYKIPGYIQDYSGTGQTVFIEPAETLETSNRIQDLEISERREVERIL
QEVSAALRGELENIHHNQQLMAEFDALYARARFAVETNAVLPTVTEGNEL
RLIKAYHPWLLLSHRERTVQPLDLHLSAEEQVLVISGPNAGGKSVTMKSV
GLLCCMLVHGYLLPCSESSCIPLFNNIFIEIGDDQSIEHDLSTFSSHLSA
IRSILERAGTRDLVLIDELCGGTDVEEGGAIARAVIEELLASVAKSIVTT
HLGDLKAYAHQRDGVVNGAMAFDRAELQPTFRFIKGLPGNSFAFAMMQRM
GFSPALVERARHFMAHERIGLEQMVDDLSHIMEEQQRQRQQLDDEQRTFA
ERERTVLEVEATLKQQQRELKQQISRAVQKEVEHARKEIRAIVQEVKAAP
TNPQVVQAAREKLGIKRQEVEERHTTAAPTTASEPTIDRTITIGDMVRLL
DTNATGEVERFNGDNVVVRCGTIRLQTHLKNLEKSSKTKARTAQRDTSNS
KVRSWSTVTNEVSSTQLDVRGMSGNEAVPHIERFLDTLRLHRIHFATILH
GKGTGSLRKRTAECLKLHTAVKSFRLGGLGEGGDGVTIVELGE
>Cag_2001 hypothetical protein
MNSFKSISRTLFALSVFTATPLFAETATSQPSTQQEPLNISGTLEVEGYG
ERVAGKTTSEFVLATFETRLERQLSKHVFAHATLLFEEDGDNELLVDEAF
ITYKALNSPWYVKAGRFTQPFGWFESGMISDPLTLELGETKHHAALLLGT
ESEQLSAALGMFRGDVQYDNNPSIDSFVAAVATNFTLGKEMRGTLGASFT
NNMGDTDGLQDLFDDGAGNLVGTEKVSGYSFYGSLAQGALTLRAEYVAAA
NSFVDGTLAGLKPSASNVEVSYAVREGVDVTARYGSGSDMDIANQYGAAL
ACTVDGAATLGVEYLHNSMDSAADANRLTVQLAVEF
>Cag_1594 hypothetical protein
MAAEEQKTAAGLGTPTTPAKPASGGVGDGDMAHLIGNMGILIDSTIVTVQ
SAVSTVSSTTEQIMQSVTTAINSEPVQGVINSVNSVTDKLVSGVANSVNP
NSLQEVINSVSNALSSSPVQGVLDSVSSATGQLLEGVTTTINSGQNSFQE
LGKLWTGLLENLNAMDGANQVQNLFNNVSAGLGQLLGNLPQMPMIPNMGQ
NRDALPKKDRTEIHFTPTTPVAQAAPMAPKPAPAPQAPAAQAAPVAPKPA
PAPQAPAAQAAPVAPKPAPAPQAPAAQATPVAPKPAPAPQAPAAQAAPVA
PKPAPAPQASAAQAAPVAPKPANPSQQGGPRVIVAAPQKKK
>Cag_2000 transcriptional regulator, Fis family
MPITHSLHELHRLMRTISNAIGSLHDPQELFHTIITQLRLFFAFDSAAII
TLDNNQRHCSLFFEMLRFQLPEGFVRYQRPVAGSWIEPHVEAQKVAVIRL
RDSLEHFPVDVELRQHLLELGLKQAVLSPLRAGGKVIGFLTFVFQEEQEW
LDEECELLATVSTPIAVAVSNALAYEELRKREAQTAMQLAVNNALLTIRE
RNSMMLAVCEQVQKLLPCTFLGIRVMDKEGNYRLYDNFRLQPDGAFAPVT
AALDMENYDSIARASFDLAIIGGYYGGNSFLELCARFPILEHVRQKYGIC
SFLSLPLWELPASRAGLIISTTMRPFDGHDQATANLIVPQLSLALQNYLA
FEEINELRHKLEGERSCLIEEITATHHVGEIIGSSSALTAVLYRLQQVAP
TDATVLIQGETGTGKELFARALHNLSARSKRALIKVNCATLPASLIESEL
FGHEKGSFTGATERRIGKFELAEGGTIFLDEIGELPLELQAKLLRVLQER
EFERLGGRHVIRANVRVIAATNRNVEQEVASGRFREDLYFRLNVVPLHVP
PLRERRDDIALLANHFANRYAREFSKPNRPIRQNDMQRLLGREWKGNIRE
LAHMVEQAVILSEGSTLDFATILAPQQTLNAPTNRAISSLRTMRAFEEEM
IAMEKQLILDTLEATGGRVSGSGGAAERLAMHPKTLYTRIATLGLSKRYG
>Cag_0943 1,4-dihydroxy-2-naphthoate octaprenyltransferase
MGNEHPTPSLVQAWMLAIRPKTLPAGAVPVVLGSALAAVHGSFKPLPALV
ALLCALGIQVATNFINEIYDFRKGADTAERLGPTRTVAAGLISEATMIRV
SALLVGTVFCMGMYLVFIAGWQILLIGLLSLLFAWAYTGGPYPIAYSGLG
DLFVFIFFGLVAVGGTYYVQTLEVSFPILLAAAAPGAFSVDILLVNNIRD
ITTDRKVGKMTLPARIGAEWARRLYVLLMIIAFAVPIVLTGFGYSPWGML
SFAALPAAVRSVKQLYQSEGRELNQVLAGTGQVMTLHGMLLSIGLLVP
>Cag_1950 DsrH protein
MLHTINKSPFDSSSMETCFRFLQAGDAVLFLEDGVYAAQAGTKFAQAVDG
ALQKNALYVLEPDATARGIKNFVSGVQPIDYEGFVELVEQHQVNSWF
>Cag_1973 ATPase
MSTTDTRQRLEEIEKQLASLTEERNTIRARWESEKELLQLSRDLQSQIEQ
LRVQAEEFERQGDYGKVAEIRYGKIAQIERDLAENRKKIEEKKLGGNLIM
KEEIDAEDIADIVSRWTGIPLSKMLQTERQKLLLIEDELHKRVIGQHEAV
TAVSEAVKRSRAGMGDEKRPIGSFIFLGPTGVGKTELARTLADYLFDDED
AMIRIDMSEYMESHNVSRLVGAPPGYVGYEEGGQLTEAVRRKPFSVVLLD
EIEKAHPDVFNILLQILDDGRLTDSKGRTVNFKNCIIIMTSNIGAQLIQS
ELEQLEGAKRDEVLAGLQEKLFQLLKQKVRPEFLNRIDEVILFTPLTRDD
LQKIVLIQFEHIQEAALRQHITLTISDEAIVWLAKTGFDPAFGARPLKRV
MQRKITNKLSELILSGAVQEGESVAVECVGDELVVRKNA
>Cag_1114 DegT/DnrJ/EryC1/StrS family protein
MAGAELIGKEELAEIQELFSKEKVNLYRYGGGNYKAREFEEKFAAWMGVK
YAHAVSSGTAAIHCALAGAGIQPGDEVITTAWTFIAPVEAISALGATPVP
VEIDETYHLDPLEVEKAITPKTKAVVAIPMWAPPKMDELVALCNKHGLIL
IEDAAQALGASYKGRKLGTFGKVASFSFDAGKTLHTGEGGIIVTNDKEVY
DRAAEFSDHGHMHLPGLPRGKDPRRAKGLNLRLSEVTAAIGLAQLAKIEM
ILSQAKENKKKIKDAIRHLDNIVLRPFSDEAGAQGDTLIFRVRNREAALQ
FEAHLMEHGFGTKILPEALDWHFAGVWNHLLPEYERYQNVDLETLWTNTG
TMLRSSICLNIPVLMDDDTIKRLINAIVTGAEKIG
>Cag_1213 Thiamine-monophosphate kinase
MPLNGEFTLIDTIAHLVQPTLANAPTLLQGIGDDCAIMQPTAGMVEVATT
DLLVESVHFDLLTTPLSHLGSKAISVNVSDICAMNALPRYALVSLALPPT
FSKKMVEELYGGMVHAAQAYGIAIAGGDTSASRSGLMISITAIGEALPTQ
LTRRSGAQLDDLLCVTGTLGGSMAGLKLLMREKEIMLEHLRNNEPVNRNL
LADLDEYRELMQRHLLPTARLDVVRLFHRLGIQPTAMIDISDGLSSEVQH
ICRHSNCGALLHESRIPIHATTRQLADEMQEEPLTWALTGGEEYQLLFTL
PEATYQQLAHERDIHVIGTITPTNEGMVLEEMFGIRIDLTTIHGFDHFAP
SGNDDGNTENEEEEFEDGV
>Cag_0432 PhoU
MPDRQIHELINELSCILVRYSEKVVQQLLHALHNTEAQHEALQQGSYGNE
STNLEERCLTYLALQQPVAKDLRAIVTIIKINDELHHISDLSRNIIKRMK
EIQPEMVEHFGFENMAMNAKEMVIKAIDSFVEKDDSLARKLHEMDDAIDT
THRNIFNTATAMMKNPDADIDQLIASLSISRYIERMADHAVSIANEVIFL
VTGEMIRHAEWSYEELLKSRNNQ
>Cag_1380 tRNA pseudouridine synthase
MRTIRLDIEYDGTDFAGWQRQSGTIPTVQGTIEKLLSQITQEEVLLNGAG
RTDKGVHARQQVASCALHSSMELSRLAHSLNCLLPPTIRICNAVHVTDDF
HARFSATERTYRYFVSETPSALCNRFTGCANALLDVGVMQQMAAMVVGEH
DFTAFSREERDSPAKRCKVTSCRWHRLHGMVVLQISANRFLRSMVRYLVH
AMLQGGKGRLAPTLFQDMVESGTSSYRMVAAPASGLFLWRIGY
>Cag_1873 Glutamyl-tRNA(Gln) amidotransferase B subunit
MDYELVVGLEVHCQLNTVTKAFCGCSAQFGKAANTNVCPVCLALPGALPV
LNRQVVEDAVKIGLALDCRIAPHSVLARKNYFYPDLPKGYQISQFEEPIC
SDGSIDVELDGVNRTIHLIRIHIEEDAGKSIHDIGDDTFIDLNRSGVPLL
EIVSYPDIRSAKEASAYLQKLRQIVKYLGISDGNMEEGSLRCDANVSLRP
VGATEYGTRTEIKNMNSFKNVEKAIEYEALRHREILENGGVIIQETRLWD
ADKGETRSMRGKEFAHDYRYFPDPDLVPVLVDEAMIERLKLELPEFPEMR
ARRFAADYGIPTYDAGVLTVERELADYFEETVKLSGDAKTSSNWVMGEVM
RTLKEKYLDIAEFSIRPARLAGLIQLIHNKVISNTIAKQVFEVMLNDEAE
PAAIVERDGLAQVSDSGALEAVAQEVIDANPKQLAEYREGKTKLMGFFVG
QCMSRMKGKANPQLVNDILLKKLEG
>Cag_0729 Peptidase S41A, C-terminal protease
MPPCRVEAFWPLATSRAASATTLQPTALHDETGKYISQTLLQYHYRKPAT
NDSLSLQIFNRYLEQLDGSKSYFVASEVESLRKVYGTRFDDELLAGKSKS
GFGMYNFFLKRAKEKMRFMKATADTARFSFMQPEEFELDRKRTPFLPDRR
QLTALWRRELKYQWLTLKHSGEKNSSIRAELSKSYASRLSLLQRQTPNDA
FQSYMAAVTTSFDPHTSYLSPDDYTNFQIDMSRSLEGIGAKLQTEGQYTV
VGEIIPGGPAFKTGFVKKGDKIIAVGQGSSAPMVDVTGWRINDVVKQIRG
PKNSIVRLKILPASQGGVASTKVVQLVREKIDLQEQAARKSIIQQNGLKI
GVITIPSFYLDFEGQQKQATNYASTSRDVARIVEELQREELSGIILDLRD
NGGGSLEEAVNVTGLFITSGPVVQVSNASGGKSVVRDDDRRIFYSGPLAV
LVNRYSASASEIVAAAMQDYKRGIVIGERTFGKGTVQSIVKLTRPFHFFG
KAPEFGQLKLTVAKFYRISGGSTQHKGVVPDITMPSLIDTSSVGEDTYSS
SLPWSTISPALFRPIADVTPEHVTQLRQKQQVRIDTSRLYKTYMRDLATL
NRIRKKKSITLQDSSFKSDVETLRQIEKNWGESNELDSTHTKSGGKALER
DVLLQQSSAVMADFVELKTTERQTVIRAVPALN
>Cag_1261 putative serine acetyltransferase
MKHNTSDIIEETINLLSCRRGNNTCEYLGRHEHLLPNIDKLRETVDLLRS
IMFPGYFGAPVLRQESLPHFLGVRLEKLQGMLAEQIPRVLHFGNANHDEA
EHAKNTATIVRQFIEMMPEIKRRLCTDVRAIYDGDPAAKSYEEIIFCYPS
IRAVINYRVAHALLSLGVPLIPRIITEMAHSETGIDIHPGAHIGDYFCID
HGTGVVIGETCIIGNHVRIYQGVTLGAKKFETDEEGHLVKDLPRHPIIED
NVVIYANATVLGRITIGKNSVIGGSVWQTKSLPPHSKVLQQTAEES
>Cag_0468 conserved hypothetical protein
MSWLVLLAMALTLFFSFVVLFRKFLGYMKKEQNLVIEPLKDALIDKDNPV
GLNPEELQKLKQQQQEAQRHLSEVIAKIPVIQKDGRFQIDQEAIQKRKEE
LLKTENISKN
>Cag_0809 probable transposase
MAGTYSQIYIQYVFAVKGRENLLQKPWRDDVFKYIAGIIKGKNQKPIIVN
GIEDHIHVFVGLKPSMSIADLVRDIKNNSSNFINEQKFLPRKFAWQEGYG
VFSYAHSQIEYVYQYISKQEEHHKTKTFKNEYLEFLQKYEISYDEKYLFE
WLD
>Cag_1027 aspartate aminotransferase
MFDEIEFDKIKRLPKYVFAAVNELKMAERRAGEDIIDFSMGNPDGPTPQH
IVDKLVESINKPRTHGYSVSKGIYKLRGAVGSWYQRKYNVDLDLDREVVA
TMGSKEGYVHLVQAITNPGDLAMVPDPCYPIHSQAFILAGGNVHRLKLDM
HPDFTLDEEGFFRNIETALRESSPKPKYLVVNFPNNPTTATVELSFYEKL
VALAREERFYIISDIAYAEITFDGYVTPSILQVPGAKDVAVESYTLSKTY
NMAGWRVGFMVGNAKLIGALEKIKSWLDYGTFTPIQVASTIALMEDQSCV
REIAEVYRKRRDVMVKSFNNAGWDLVSPRASMFTWAQIPEKYRHLGSLEF
SKRLLKEAKVAVSPGIGFGAYGDEYVRIAMIENEERIRQAARNIKRFLRE
>Cag_1955 Sulfite reductase, dissimilatory-type beta subunit
MNSTMSTPEQKWKTIESGPHTYEDALHPVVKKNYGQWRYHEIPKPGVLKH
VAWSGDAIYTVRAGTPRQDTVDLVRRLCAIADKYSDGFLRFTVRNNVEFL
TPNEANIEPMIQELEAMGLPVGGTGNCVSSVSHTQGWLHCDIPATDASGV
VKSMMDTLYNEFRHMEMPNKVRLSTSCCSINCGGQADIAVVVKHTRPPRI
NHDVLATICELPKAVARCPVAAIRPTVINGKRSLMVDEAKCICCGACFGA
CPAMEINHPEHSKFAVWVGGKNSNARSKPSTMSIVAHNLPNNPPRWPEVT
EVLGKILTAYKKGGRPWERVGEWINRIGWKRFFEETGLTFDENMIDSYRH
ARTTFNQSAHVRF
>Cag_1563 SecA DEAD domain protein/helicase, putative
MIPSQEYRKSVVHQPENVPSGFRGATHWLAGKVHRRQSKQQALLEQAHTI
HTAAEAHRTLSLVDLQAQLLSFRDHFRRRARGYEQHISAAMALIVEASHR
QLGLRPFPVQIMGALALLEGSLIEMQTGEGKTLVAALAAVFLGWSGRSCH
VITVNDYLASRDYARLEPLYTFCGVTASCVIGELKRPERQRSYQAAVVYV
TSKELVADFLKDRLLLHGVSDPSRHFLHSSNTLREGDEVPVLNGLWAAIV
DEADSVLVDDAATPLIISRPVKNEPLMEACREAVRLAAKLQPTLHYTVEE
RYKQIALTSEGNATIEQMLPTLPPFWHSATRRNELLLLVLNAREFFRKGK
DYVVSDGKVVIIDEFTGRLMPDRKWQKGTQQIVELLEGVEPTDPVEVAAR
ISFQRFFRFYKLLCGMSGTVKGVTAELWHIYSLPYVAIPTNKPSRRTTQA
PEYFLEKGAKYAALIATLEALHRQGVPILVGTRSVRESEFLADLLRQKML
NFQLLNAIYHKEEAAIIARAGERGNITIATNMAGRGTDILLEQGVAALGG
LHVLLAEPNEAERIDRQFYGRCARQGDPGTSYSYIALDDRLLQRFFPERF
LNSVMAEVLLRRLPGSHALMQLLVYLAQQMAQRMAYQQRLSLLRRDEQLD
QLMSFAGSGPKF
>Cag_1155 Ribose-phosphate pyrophosphokinase
MAKPIKIFAGRSNPALAEKIADYLGQPLCNVKVDNFSDGEISVNYHESVR
GSDLFIIQPTNPPGDNLLELLIMIDAARRSSAARITAVMPYYGYARQDRK
DRPRVAITAKLIANLLTQAGANRILTMDLHAPQIQGFFDIPFDHLYSSVV
LIDNIRNRDFCDNLVVASPDVGGVKLARKFAEELGTDLVIVDKRRPAANV
AEVMNIIGDVSGKNVLLVDDMIDTAGTIVNAAKAIREKGGLKVYAAATHP
ILSGPAIERINNSVLEKMIVTDSVITNHAPSPKIETVTVSELIGEAIKRI
YEDDSVSCLFDSKNIATKLMHIH
>Cag_0297 conserved hypothetical protein
MENSSNIARQRKAVKSRACEPNENLLAADGWRVFKIMSEFVNGFETMSSC
GAAVSMFGATRALPESKEYQLAEELGELLAREGFAVITGGGPGIMEAGNK
GAQKAGGVSIGFNIKLPEQQHPNHYIDQEKLLHFDYFFVRKMMFLKYAQA
FIALPGGFGTLDEVSEAIALIQTGKSERFPIILVGKSFWQGFYDWIRQTL
LEEKGYINTFDLDFIYLEDDPKEVVAIITRFYPEGYTLNF
>Cag_0425 putative transcriptional regulator, ArsR family
MQRTTTPITYSNDEAQLAIFAKALSHPVRLKILKLLANQTCCFTGELTTM
IPMAQSTISQHLKALLEAGLIQGEINPPKVRYCINRQNWVHAYTLFEDFF
APEDLFSCQCGHGEKESTPSVAVSSTQ
>Cag_1801 Divalent cation transporter
MIGNPLLPEIRELIEQRNFSALQRIFSDWLPVDLAELISDLPENEQAILF
RLLQKDVATETFEYLDFDSQQNLLTALTKKDVTHILNSMSADDRTALLEE
LPGAVVQELLKLLSFKEFKIAKTLLAYPEGSVGRLMSPDYLCVKKDWDIN
QVLEYIRTYGHESETLNVIYVVDDNGKLQGELMARELLLSPLDKAVSAIV
EDEKIITLTASQDQKDALDAFKRYDTVALPVVDSNGYLIGVVTVDDMLDV
AEEEETEDIQKFGGIAALEDPYIDLSIPELVRKRAVWLVILFIGEMLTAS
AMAFFEDELAKALVLATFIPLIISSGGNSGSQAASLIIRSLSLGEITVRD
WWKIMRRELASGLALGCILGFIGTFRVILWGMALGEYAPETPFIGLTIGL
SLVGIVMFGTLTGSMLPLLLQRLGFDPAVSSAPFVATLVDVTGIVIYFSV
ASLLLSGILL
>Cag_1404 conserved hypothetical protein
MSNVVCSCLLGDNAHIRLRLSQLLHSEVEALYSNEPPLGEGQPEICEVEL
QGCRRDFFVNASGVPCVVGQQVVVESDGGYDYGLVYSTGAIARKKLQLKG
LDKQGIEWSSVVRVADEHDARAIEELQRRQAEIREVCLAKIKRHELDLKL
VDVELRMDQQKLSVYYTSSHRVDFRMLVRDLAGEFKARIQMVQITTREEA
RRANAFGPCGNLLCCSSWIQKIQANPFADKTHYSENPSNNDSHTFNMTGL
CNRPKCCIGFTTRQDKNGGRIGGSCCSQQQPLPTVGTLLSTPDGQAQIAF
VDAQKKLVVIRYQHNNQTRRFPLDKFNALFTRQ
>Cag_1469 Death-on-curing protein
MMEQDTNQLAVYQAENGALELRADSALDTIWASLDQIAELFGRDKSVISR
HIKNIYLESELEKTATVAFFATVQKEGSRIVTRNIEYFNLDTILSVGYRV
NSKVATRFRQWATKTLKQYITQGFSINQKRLEENKTQFLKTLEDLKILTQ
SSQQVETKDILTLIQNFSHTWFSLDSYDKNEFPKQGTQEAIQTSAEELYQ
DLMQLKAELVAKDEATELFAQEKNIGNLKGIFGNVFQSVFGQDAYPSVEE
KAAHLLYFIIKNHPFNDGNKRSGAFSFIWLLQKAGYQFRDKISPETLTTL
TILIAESKPSDKDKMIGIVKLVLSVEP
>Cag_1971 conserved hypothetical protein
MNKKTWQGIFLLVLFVIAPNFLQNSIVRAEKKKIILKQAEIVEGGENAKG
SFRRLSGSVELSDGSITLRCNRATEYEASRSIVLEGKVMIADQRAEVYAD
GGTYYPDKEIGDLNGKVRLRTLDGALVAIANTAHLNHAANQITLYGNVVA
WHEAQQVSGNEMVITLRASSGKQEHQVEKVDIRGNAFLAAKDTLSKPIAV
YNQFSARRMTMHFNEASLLQNALLQGQSESLWHLYSEENRPSAIHYSSGN
TMQLAFREGALYTMKVSGHCEGKHYPASFWENKKINLPFFVWREKEYPFP
KKK
>Cag_0773 Arginine biosynthesis protein ArgJ
MLTTLNKALDTIHRLAAETVWSEGVTPMKIGDSGEFGFWPKGFSVGATAG
NIRYSGRDDMMLIVSDSPASAAALFTTNLCCAAPVVLSRNHLQQSAASMR
AIVCNSGNANAATGKQGMADAQAMADAVAEQLSIKPEEVLVASTGVIGQL
LPMERVHAGIVALPATLQSNSVLGAVSAIMTTDTFPKFYAVDVALSSGTV
RLCGIAKGSGMICPNMATMLGFLATDAAIAPELLQTLLSEANRKSFNAIT
VDGDTSTNDMVAMLASGAGAEVVAGSKDEALFSAALQSLMILLAKLIVID
GEGATKLVEITVKGAVSNEEAELAARTIANSSLVKTAIHGEDANWGRIIA
AAGRSGARFNEDELELWFNEMPILKKGLIADFSEDEAAIILAQPSYSITL
SLGNGSGSATLWSCDLSKEYVEINGSYRS
>Cag_0467 KpsF/GutQ
MNTTPQTETATLTGIGRQILEQEAQAIAHIAEHLDHHFAEAIQVMVACKG
KVIVSGMGKSGIIAQKIAATMASTGTTALFLHPADAAHGDLGVVAAEDVV
LCLSKSGSTEELNFIIPPLRQLGAKIIVMTGNPRSFLAQNADITLNTGVA
KEACPYDLAPTTSTTAMLAMGDALAITLMQQKKFTQHDFALTHPKGSLGR
RLTVKVSDIMATENAVPMVRTNAAVTELILEMTSKRYGVSAVVNENGELA
GIFTDGDLRRLVQSGRKFLALQAGEVMTARPKTVPPDMLARECLDILEEY
RITQLLVCDNHQRPIGVVHIHDLLTLGL
>Cag_1624 conserved hypothetical protein
MSGSVVSCAFMPHTTALVRVKPSSGSGYVVSTAKRFPFGLVRIAADRDGS
LLAKIGKELQRWHDDLLALNFTPPLYRSLPAFLPSDATPEEQATYQRLEA
SNFLHQPNSYWCSALESAEPCTASDFQAYFLLYYPAEPLRMVRNALSAHC
ALAVCSTPVEAFCRLTVSTQDVHILLEIEEAHVALAVAHQGKLMRFVCHP
IHRREEREYFALRELLNTPACRDHVVQVSGSHATKNMLELLRRETGLSLR
LPTLPAPHFVTNSTRQLLTEPDMYHALSAAMFSL
>Cag_0769 Exodeoxyribonuclease V, beta subunit
MHHQPLNHTTVTLAGINLIEASAGTGKTYAIASLYVRLLLEKQLLPEQIL
VVTYTEAATQELRGRIRSRIREVLEVFEGAATSDAIVQRLYDQALEQGDD
MVERARMALVQALALFDTAAIFTIHGFCLRVLQEHAFESGSLYDTTLVTD
QRALLLEIVEDFWRTHFFGEASPLLAYTLQCGGSPESFLALLQKLHVSGG
ATIIPTFCDEEREALHATCLVAYAELCRLWQSDGAAVRELLSTDKGLSRA
ADYYRADKLELLFAGMEEFIAGGNPFNLFADFQKFATSGIAAGTKPKGTS
PDHPLFACAEKLLQAVQKRYVALKSELVQFYQRELPKRKRKANFRFFDDL
LSDLADALQAPERGVALAQRLRSTYQAALIDEFQDTDPVQYIIFQTMYAD
SDAPLFLIGDPKQAIYSFRGADIFAYMQAARAVEASRRFTLSENWRSTPQ
LLNAFNQLFSNERLPFIYPDIIYHPLQAGNPDVANGEESAPALQFYLLEG
DDAKGDVLSVEQGEALAAEATAGELYRLLQAGEIIGGKQVAAGDCAVIVR
THAQAAQMVAALQRRGIAGVVRSDKSVFATREAEELRQLLIALADPAHEV
KVRSALITDILGRSGDDCAELLADEVAWLQVLRRFRHYHHVWQHRGVMVM
SRELMADEGVRGRLLASPDGMGERRLTNVLHCIELLHRQEHEHGFGCEEL
LQWFSERISLQDELQEEYQLRLESDEAAVRIVTVHASKGLEYPIVFCPFL
WNSVGNRRDEVVSFHNEVWQLVKDFGSPERDRHRVLAGRESLAEQLRLLY
VALTRAKYRCTVLLARIKSEASAFNYLLHASDATRQSNKVVLELEQEMKG
ISSEERKVRLHDIAKQSAGAIGVRQLSRVEIEALKEQPRLVRQRSAEPLH
LRHFAGTVDGSWRVASFTSFSRHESTSTHFASPELPDRDEVRSSTSASTM
QPTLPSEQSIAAFPKGARAGILLHALFEELNFANPTDEAIAERVTEELAR
SIYPLSWQSTLITIVQAVLQTPLAALDGSTFQLGTLHAKSWITELEFFFP
LRFINSKELSALLTRHGVLPGGIALADMVEVLDFKPVRGMVMGFMDMVFE
SGGRYYLLDWKSNYLGASPADYTLEAMGRAMQEHLYPLQYLLYMVALHRY
LALRIPNYRYSTHIGGVIYVFLRGVTPEFGEARGFYRDLPSEALIEELTA
LLVDFEG
>Cag_1479 Glycosyltransferases involved in cell wall biogenesis-like
MKAPSVSVILPLYNHEQYIDKTLTSIFEQTSPPDEIILIDDGSTDQGFEK
ATLILKNDPRAHLIQQENKGAHNTINRGIELARSEFIAILNTDDLFLPSK
IERCRNIIENYPEIDFICGDINIIDENSNMVTSGETVKWLEHAHDFQMRC
HSLDLGLLNDNYVTTTSNMFFSKNLWEKNKGFQNLRYCHDLDFILTALSS
SKVVIDYNYKHINYRTHSHNTIKEKISKIRYEMAAVMANAMMCNNAIVSK
STLLDDIKSLGGIIDKKNNALLLSVLMTLRSRYTCKIPYYEQLQEDAIAA
ELYKLLN
>Cag_1978 rubrerythrin
MPTTHENLKNAFAGESQAFMKYTAFAKKAEKDGFANIAKLFRTTAEAEKI
HAEGHLSADDAIQTTAANLETAIGGETYEHNEMYPPMYEQAVAEGHKAKR
MFGFAVEAEQVHAALYRKALDAVQNGQDLSESNIWLCPVCGHIELGNPPA
NCPICNVKASVYVQVS
>Cag_1131 hypothetical protein
MNTIGTFPFGQPIKKVCQIDQSPKKIFVLGVYASAVHATWIDNNEKVLIR
ALAVDSEPEIFWRGANVKEIIEQIQMPEGAGKLIPAAKNLNGPSGKALDD
CFLKPLGFEDRSEIWLCDLVPHSCQNKQQEAALEREYRTRMISLSLPEYD
YSEVPLKLSNEVRQQEILDELHIASPSVIITLGDEPLKWFTYKLDSSTKS
QLKDYGISEDTYGQLHDISIDGKRMKHLPLVHPRQAGRLGSHSNMWASCH
ERWMQYVAPSLLAEL
>Cag_0796 citrate lyase, subunit 1
MAKILEGPAMKLFNKWGIPVPNYVVVMDPNRLAHLGEANKWLRESKLVVK
AHEAIGGRFKLGLVKLGLNLEQAVEASREMIGRKIGTAMISQVIVAEMLD
HDEEYYISIAGNRDGSELLVSNRGGVDIEDNWDTVRRLAITIGETPTIER
ITEVATEAGFEGEIAERVAKIASRLILCFDNEDAQSIEINPLVIRKSDMR
FAALDAVMNLDWDARFRHADWDFKPVSEIGRPFTEAEQQIMEIDARIKGS
VKFVEVPGGNIALLTAGGGASVFYSDAVVARGGTIANYAEYSGDPADWAV
EALTETICRLPNIKHIIVGGAIANFTDVRATFSGIINGFRSSKARGYLEN
VKIWVRRGGPNEAQGLAAMKQLQEEGFDIHVYDRSMPMTDIVDLALNS
>Cag_1248 Nitrogenase molybdenum-iron protein beta chain
MLPLFSYEKDDNTGALVSTHKQEKTQMLLRHTPKEVKEREGLTINPAKTC
QPIGAMYAALGIHGCLPHSHGSQGCCSYHRSTLTRHYKEPVMAATSSFTE
GASVFGGQANLLSAIETIFSVYDPDIIAVHSTCLSETIGDDLQQITKKAS
DDGKIPAGKYVIYASTPSFVGSHVTGYSNMVAGIAEQFAQVSDTKTDQIN
IVAGWMEPSDMREIKRLSSELGVKIVLFPDTSDVLDAPQTGKHVFYPKGG
TTIDELKSIGSSKCSLALGCISAEPAAIAIEKKCKVPFETVDMPIGVSAT
DRFLMALSKAAHVQIPAHITADRGRLVDVMTDMEQYFYGKKVALFGDPDQ
LIPLTEFLLDLGMKPTHIVSGTPGMRFEKRMKEILKRAPYANFKNGLGAD
MFLLHQWMKNEPVDLLIGNTYGKYIARDEDIPFVRYGFPILDRIGHSYFP
SVGYMGGLRLVEKILSALMDRQDSAALEEKFELVM
>Cag_1061 conserved hypothetical phage AbiD protein
MRFEKPPKTFEEQVQILQERGMIINDPATASYYLSHLNYYRLAAYWLPFE
QEHTTHRFTQGTSFDIVLQYYLFDRELRLLVMDAIERFEVSFRTHWAYEL
SLAYGAHAHLNKSLFKQKQYIGNKRCGWNYDDNVNRLKKSVANSSEIFIQ
HFKKYDEELPPIWAICEIMTFGELSTWYSNLRHGKDRNAIARAYKLDEVN
LTSFLHHLSIVRNLCAHHARLWNRDFTVVWKLPLKTPSGMIENFNSIAKR
KIYNTLVMLAYLTDTINPNNSWKKRLFALFMKYENVKESYMGFPDNWRNK
QIWQ
>Cag_1300 Protein of unknown function UPF0074
MLQVSRKFEYGLHAITYLATKGQNRVVTVKELSEELGFSQEFLSKAMQSL
KKAGITKSVQGLRGGFRLIKPITEITVADIAVATEGEPHLVQCNISAERC
HVYATCQHKAFMNTLQHKIHELLATTTIVALLEQCDIPFTPYTLIIDTEE
TEKMEDGEESLEDMESSKPIIPFSPLYYAPVKTHSK
>Cag_1744 ribonuclease H
MKKQVTIYTDGACSGNPGPGGWGALLMFGSITREVSGSSPATTNNRMELG
AAIEALALLKEPCLVDLYSDSSYLVNAINNGWLQRWQRNSWQTAAKKSVE
NIDLWQKLIKLLKVHEVRFHKVKGHSDNAYNNRCDQLAREAIKKTS
>Cag_1221 putative transcriptional regulator
MITETEVQLLLSNMEADTIERTTAVADTDKFGQAICAFANDLPNHRAPGY
LLIGVKDNGELSGLTVTDELLKNLGGIRSQGNVLPQPYMNIAKFSFASGD
VAVVEVYPSDLPPVRYKGRVYIRVGPRKGIANEQEERVLTERRIALARTF
DARPCGEATLSNIALGQFDAYRREVIDAETIAANNRSIELQLASLRLFDL
KYNTPTHAGILLFGKNPRYFLHGAYIQYLRFPSTDITDIPLDQAEISGDL
YVVLRELEMRVKLLIQTSMRQVSTLKEQLLPDYPEWAVRELLMNAVMHRN
YESNTPIRFYAFSDHIEIQNSGGLYGEATPENFPTCNSYRNPVIAEAMKS
LGFVNRFGYGVQRAQALLAQNGNPPATFEFNEHSVLVKVWKRMK
>Cag_1533 imidazoleglycerol-phosphate dehydratase
MAEQARTATVKRTTKETDITVTLTLDGNGTSAIASGVVFLDHMLTNFCKH
AGFNLTLRCNGDLDVDDHHSVEDVALVLGTAITEALGNKSGIQRYGWAII
PMDEALARCAVDLGGRSYCVFKAEFRRPVIQGLSTEMVEHFFRSLADTMK
ATLHLEVLDGNNTHHKIEALFKAFAYAMKAAVSITGNDIPSTKGLI
>Cag_0561 hypothetical protein
MPQNDAKRFVEKLRADDGYRGRIAEIMRYEGYSGTADDLKKLTNEEGDDK
RYPNKSYCSSLSWHQGDKKQQDGASQGYHHWAG
>Cag_2005 sulfide dehydrogenase, cytochrome subunit
MAVQLRMRLTACTLLAMAATALSSAMPSESSAAPATKKAAVVCDVPVSNP
RGQILALSCSSCHGTDGKSVGIIPPIHGKSVDYIESALKDFKSGARVSTV
MSRHAKGYSDEEITLIAQYFGSLSRKK
>Cag_1995 hypothetical protein
MQVTIQLLSSAINDLLDARRFYEQQRNGLGAYFFDSIFADIDKLTLYAGC
HPKYFGYYRMLAKKFPYAIYYKMNDTSVAVVWRILDMRRSPYKIKQLLP
>Cag_1853 Translation elongation factor Tu
MAKESYKRDKPHVNIGTIGHVDHGKTTLTAAITLVLSKQGLAQERDFGSI
DKAPEERERGITISTAHVEYQTDKRHYAHIDCPGHADYIKNMITGAAQMD
GAILVVAATDGPMPQTREHILLARQVNVPSLVVFLNKVDIADPELIELVE
LELRELLSQYDFPGDDIPIIKGSALKAMEGDAEGEKAILELMDAVDAFIP
DPVRDVDKPFLMPVEDVFSISGRGTVGTGRIERGRIKLNEEVEIVGLRPT
RKSVVTGIEMFQKLLDEGQAGDNAGLLLRGVDKTELERGMVIAKPGTIKP
HTKFKAEVYILKKEEGGRHTPFFNGYRPQFYFRTTDVTGSVTLPEGVEMV
MPGDNLAIEVELLAPIAMDEGLRFAIREGGRTVGAGSVTKINE
>Cag_1440 aspartate aminotransferase, putative
MAVTGEQYLTQRVLGMQESQTIRITNLAGKMKAEGLDIVSLSAGEPDFPT
PQHVCDAGIEAIRAGFTRYTANSGIPDLKKAIVAKFKRDNGLEFAENQII
VSNGGKQTLANTFLALCAEGDEVIVPAPFWVSFPEMVRLAGGTPVIVNTT
IESGYKLTPDQLEAAITPKTKMLVLNSPSNPTGSVYSEAEVRALMAVLEG
RNIFVLSDEMYDMIVYDNVRPFSPACIPAMKDWVIVSNGVSKAYSMTGWR
IGYLAGPKWLIDACDKIQSQTTSNPNSIAQKAAVAALNGDQSMIEEHRLE
FQKRRDYMYEALNKIPGFKTTLPQGAFYIFPDISGLLGRTFNGVEMKDSA
DVAEYLLKVHYLATVPGDAFGAPANLRLSYAASIAALDEALNRLRKAFS
>Cag_1590 zeta-carotene desaturase
MKVAIFGAGVAGLSAAIELVDRGHSVEIYEKRKVLGGKVSVWKDSDGDSV
ESGLHIVFGGYEQLQSYLKRVGAEDNYQWKDHALVYAEQDGKQVAFRKAL
NMPSPWAEVVGGMRTDLLTFWDKISLLKGLYPAITGDEAYFRSQDYMTYS
EWHRRNGASEHSLQRLWRAIALAMNFIEPNVISARPMITIFKYFGTNYSA
TKFGFFRKNPGDSMIEPMRQYIQSKGGRIFVDAKLSRFELNSDETIKEAV
LRDGHKIEADAYISALPVHSIKKIVPTTWLKHKYFRNLHEFVGSPVANCQ
IWFDRKITDTDNLMFSQGTIFATFADVSLTCPEDFQQGIGSANGGSVMSL
VLAPAHQLMDMPQEVIIDLVVKDLHDRFPASRNAKVLKSTLVKIPQSVYK
AVPDVDQYRPDQISPVRNFFLAGDYTDQHYLASMEGAALSGKQAAEKLMS
KIGNS
>Cag_1030 hydrogenase, methyl-viologen-reducing type, delta subunit
MSEPFEPKIVAFVCTYCTYAGADLAGTSRLKYAPNVRLIRLPCTGRISPL
FILKALQKGADGVLVSGCHPGDCHFNQGNYHARRRWTVFRSLLAFSGIPE
ERIKFSWISAAEGGKFAELINTITEDIRKQGPFEQYKSLISETEASATL
>Cag_0409 Phosphoserine phosphatase SerB
MLFPNAFMQKQQVKCALCNIFKRASTPYHAVMKELLLINITGPDKAGLLS
SFTAILAIHNTSVLDIGQAVIHNHLALGLLVEIPQKATASALVKEMLYCV
HSLGLTMSFTPITPEAYNTWVTEQGQPHYVVTLLARHVTAEHIAAVSTLI
GEHQLSIATINRLSGRILLEPSEVPSLSKACVEFSLRGTLCNESRFREQL
LSITDTLGIDIAFQEENIFRRNRRLVVFDMDSTLITSEVIDELALEAGVG
EQVAAITEQAMRGELDFTASLQRRVALLKGLEESVMERVAARLQLTEGAE
TLFKHLHHLGFKTAILSGGFTYFGHYLQKKLNIHYVHANMLEIENGRLTG
KVVGQVVDGKRKAELLEHIAERENIRLEQTVAVGDGANDLPMLGKAGLGI
AFRAKPIVRENAKQAISTLGLDALLYLMGFRDRDTRCM
>Cag_0566 Protein of unknown function UPF0004
MKETIKQRVAALTLGCKLNYAESSALLSALWQKGWEVVSLHDGKVDLIII
HSCAVTAQAEQKGRQKIRALHRNYPTSRIMVIGCAAQLHPDVFAAIEGVD
GVLGSNDKFELSHYERVMAKKCNKPLVCITPMEHCKTIHHGFSLPATLAK
QERTRAFLKIQDGCNAGCAYCVIPHLRGASRSLPANQLVEQAHRLAASGY
REIVLTGVNIGDYWADGVDLCGLLRALAEVPVSRLRLSSTELEALSDELI
ELVAASPTIVPHLHVPLQGGSDSLLRAMRRRYTTAMYRARIERAVERIAN
CAIGVDIMVGYPSETEEMFHESMAFIEALPLSYLHVFSCSLRPHTLLADE
VARQERKALSGDIVAERSRQMVALGHQKEAAFKQRFLGAVMPVLLEQAKT
IAPHVTAWSGHTPNYLHVLVHPLEQNEHDGQSGQSGQNGAHRADTFAGQE
ILVRLEQLDDDLNFIGSVLS
>Cag_1640 ribulose-bisphosphate carboxylase large chain
MNREDINGFFASKEQLNMADYLILDYYLECVGDIETALAHFCSEQSTAQW
KRVDYDEDFRPKHAAKVISLTVEAELSSLSYPIHHATDEPIHACRVTIAH
PHTNFGTKLPNLLSAVCGEGAFFTPGVPVVKLLDIHFPDSYLQAFEGPKF
GIDGIRDLLKAYNRPIFFGVVKPNIGLSPAEFGEIARQSWLGGLDIAKDD
EMLADVTWSSLADRSRELGEARRNAEQATGEPKVYLANITDEVDRLLEQH
DVAVRNGANALLINALPVGLSAVRMLAKHTKVPLIGHFPFIAAFSRLEKF
GVHSRVMTKLQRLAGLDSIIMPGFGSRMMTTEEEVMANVQECLQDFGHLR
RSLPVPGGSDSALTLEGVYRKVGSVDFGFVPGRGIFGHPMGPKAGAASIR
QAWEAIEQGVPLETYAQGRPELQAMVDGAHGK
>Cag_0390 Thymidylate synthase complementing protein ThyX
MNVRLIATTTPAIQIEGQTPTPEGLMAYCARVSSPNQESLDYEKLLAYCI
RNKHWSVFEMVDMTVEIVTSRAISPQILRHKSFQFQEFSQRYAKVQEVER
YEPRRQDSKNRQNSIDDLDESVKVWFMEAQEEIANNAMEYYNMALEKGIA
KECARVLLPLATQTRLYMKGSVRSWIHYLEVRTSSSTQMEHREIALAIQR
LFMEQFPITASALGWKSAAPCP
>Cag_1887 von Willebrand factor, type A
MCFTHPEKLLLLLLLVPIAGLLIGRFIKEKRLRQALMANKMAATMMPLQH
LLPYAFRSLMLFVASGLLLLALAEPRWCGGTKPVLRHGADVLFILDVSRS
MQATDVAPNRLMRAKQEIAAISQNVQGGRRGLLIFAASPLLHCPLTTDRD
GFATLLNMAAPELIEEQGTRLQPAFALASTIFDVANESNAASTRGVQVIV
LLSDGEDHDSNVQRAAQQLAKQSVQLFVIGVGSLKPSPIPLADGSFKRDA
SGQVVMSRFRPQMLQAFARQAKGLYRHSHAEVWASADVVNRINRYAADSR
IVMEPATNDSSLLRLMVGIAVALLFMETLLRKSS
>Cag_1294 glycosyl transferase
MRFMKIGIACHHTYGGSGAIATELGKALAARGHHVHFFGKEAPFKLGAFV
RNIFYHEVEVMHYPLFDSPFYSLALASKIAEVAFYEKLDIVHAHYAIPHA
LSAMLALQMLEDKCAAAHCFKLVTTLHGTDITVVGADRSMQDVVRLAINK
SHGVTAVSHFLKEETVRMFQPKRDIAVIHNFVDTQLFSRMAAGDIREQLG
LGSEKIVIHISNFRPVKCIGDIIAIFQAIATESNATLLLVGDGPERSEAE
LLVRQLGLCSRVRFLGKLLDIVPLLSLADVMLMPSNVESFGLAALEAMAC
GVPIIATNVGGFPEFIESGKHGYLLPPGDVAAMTEKALHLLNNPDEWQRI
SMACVKQAACFNVSRLVEQYEKYYTKLMG
>Cag_0821 UDP-glucose/GDP-mannose dehydrogenase family protein
MKITIFGSGYVGLVTGACFAKVGNDVLCVDIDENKINRLRKGEIPIYEPG
LDDMVDECIKAGRLHFTTNIQEGVEFGLYQMIAVGTPPDEDGSADLSHVL
SVAHSIGSYMQEYRIVINKSTVPVGTADLVRNKIRSVLAERNIVLDFDVV
SNPEFLKEGDAVDDFMKPDRIIVGVDNPRTRELLRFLYAPFNRSHERFIA
MDIRSAELTKYAANSMLATKISFMNEIANIADRVGADVEAVRKGIGSDSR
IGFSFIYPGIGYGGSCFPKDVQALERTATKHGYTARLLQAVEAVNDDQKA
SLVTKIKNHFNGDISGKMFALWGLAFKPNTDDMREAPSRRIIAELLEAGA
TVQAYDPVAIEEARRIYGDSRGIHFAESPEAAAQGADALVVVTEWLLFRS
PDFEMLKRELRSPLIFDGRNIYSPEFMEQEGFTYYSIGRPTRCAQ
>Cag_0911 hypothetical protein
MIITPKVDETQEFIEIANDFSNPLDLVREAISNSFDANANKIYLSFDMVK
EYMDTNLRIRIVDDGEGMTLDGLQSFFDLGNSTRRGIDGTIGEKGHGTKV
YLNSSKISVKTIRDGKQYVAVMIEPIKKLYVREVPTVEVIESNVDELSGT
TIEIIGYNSNRRGKFTHEQLKDYILWFTKFGSFESFFEKKENSHKRLFLK
GLNASEYEEICFGHSFPNESQPVQRLFEEFLVSAPDYYCKRFVKRGQLKN
SPEISFEAIFSVEGNRVKLAHNTMIQRQGRPSIAGNYKVAERYGVWVCKD
FIPIQRKNEWVNYKGSEFIKLHAFFNCQGLRLTANRGSIDNTPSEVLSDI
QEEIKKIYDEITSSDDWTQLSWLEQEAESYKTTEKEKKDFEFRLKKANKA
NICEFENTIIVEPQRESGVYALVLQLKMLKPDLFPFFIVDYDTHSGIDVI
VKADDTQPIISSKLYYVEFKHYLTEEFNHSFVNLHSIICWDTTIKHNDIL
KDINGEERKMQIIPPESDGDYTKYFLDRPSSAHKIEVFVLKDYLKQKLGI
EFRPRTAKDIL
>Cag_1331 transcriptional regulator, Fis family
MKAEPALLGHSPLMQQLRQLAMQVAETDITVLIIGETGSGKEVLAHFIHH
HSQRAKGNFIAVNCGAIPAGILESELFGHEKGAFTGAQQSRKGYFESADR
GTIFLDEIGEMPLETQVKFLRVIESGEFQRVGASETIYSDARIIAATNRN
MYQAVAEKNFREDLFYRLRSVELQIPPLRERGNDVLLLADHFMRSYERKH
GIPFAGFSPDAAEMMLRYPWPGNVRELRNLMESLLVLEKGREIGSDTLEK
HLVQRNRHKSLVHDPSKSEKNELQVIFSNILQLRQEVSELRMMVQHLLQN
SQHVSVPMPFTMPLLLPDAYTEPSHTNPPKALHASPISPISSISNSSNTP
NESEAVKAKEPIALPENKPLRSLDEVEKESIREALEHHNGNKRQIARALG
ITERTLYRKIKQYGL
>Cag_1389 nucleic acid-binding protein, contains PIN domain
MSYLIDTNIIIYYFNGLTNDESLHSILANNFKISIITKIEFLGWGQFLSN
QNLYIKAKSFIHYATIFDINDAIAEQTILLRQQFKTKTPDAIIAATAMVK
NLTVVTNNTDDFNRLGIKTISVTMQ
>Cag_0261 conserved hypothetical protein
MTRTSQHSERQQATEASTHNIQQLVDEAATSLYKDIMAIVEARLELLKIE
LTQKISLIGAAVVVGVIVIIGATFLLATIALFLGELMGHTFLGFFAVSLI
FFVGFWFFTHYRPTLLQHFIQNLLLSTYDADK
>Cag_0975 chlorosome envelope protein B
MANETNDFAGALNNLMQTATSIGQKQIELVTNTAQNLVQLAEPLAKTAVD
LIGSITNTAGQLFQNIASAIAPKQ
>Cag_0970 conserved hypothetical protein
MRVEKLSTLLLCRIALAISWIYQGAVPKLMCQSSGELELLGHIIPIYKWA
CIAMQWMGYGEILFGVFLLIARWQWAFWLNIIALIGLLFFVGIFEPTMLT
LPFNPLTLNVALIALSLIAIRELRQ
>Cag_0686 conserved hypothetical protein
MINNLFLTNQRDNYDSPWKEAIEHYFLEFMAFFFPAAYASIDWSKPYHFL
NEELRAIIPDAEVSNRVVDKLVQVQLLDGMESWLYIHIEVQSFWEVNFPE
RIFVYFYRIYDKYGKAVANFVVLADQHSNWRPTSYTMETIGSKLSLDFSV
VKLLDFEPRLQELLVSDNVFGVITAAHLLTQKTKNKVKQRYEAKKLLMQL
LLQRQWEQERINELIRVIDWLLKLPKELRQKLKAEIHNMEEEQKMKYVTS
FERDAMEEGREMGLVEGMEKGKAEGLEEGLLKGRLEVAERLVASGMSKAE
AALLAGVSVEML
>Cag_1601 conserved hypothetical protein
MIVNEFEKLTAGKLLLASATMLESNFKRTVLLMCEHNEEGSLGFILNRPL
EFKVREAIHGFNDVDDVLHQGGPVQVNSIHFLHSRGDLIHNSQEVLPGIY
WGGNKDEVSYLLNTGVMHPSEIRFYLGYAGWSAGQLFSEFEEGAWYTAEA
TPDVIFSDAYERMWSRTVRAKGGAYQLIANSPELPGMN
>Cag_0260 conserved hypothetical protein
MLFKAEKNFTVNFLPLVIMEQEVPVTIHQNQESADEAARQHGTYRAPAKD
FNETISEAWTQFRESEAGEALSKGSSAAKEYIQQHPTQAMLLSVGAGALL
GLLLKRR
>Cag_2008 rubredoxin:oxygen oxidoreductase, putative
MIDHKILPITNDVSWIGVLDPGLITFDIVMETKYGTTYNSYFINAEKKAI
VETTKEKFWPEYFDKIKQVTDPADIEYIIVDHTEPDHSGNVRNLLAVAPN
ATVVGSGNAIKFLRDQTGHDFKSMIVKTGDTLSLGNKTLHFYNMPNLHWP
DTIYTWLEEDRILFTCDSFGSHYSNEAMYDDLVGNFDDAFTYYFNTILRP
FSKYMLQAVEKIRELDIHTICPGHGPILRSNWKKYVDLSEQYAKDAIALP
QDKSILIAYVSAYENTTLMAQKIAEGLQQMCDFHVDVCDIEKIDPALLEH
KIAHSTGIIVGSPTLNQNILPQIYQLFATINPIRDKGKLGAAFGSYGWSG
EASKMIETNLTMLKLKVFEQNMMVKFKPHEPEFELCVAFGKAFAQKMIEM
YQITCNI
>Cag_2033 Ribosomal protein L34
MKRTFQPHNRKRRNKHGFRQRMATKNGRKVLSARRAKGRHSLSVSSDMVR
H
>Cag_0523 Isoleucyl-tRNA synthetase, class Ia
MEKRMFAEYPSQVAYSAVEAEVLAYWNEQGIFHKSLEEKPSDRVFSFYEG
PPTVNGKPGVHHLFSRTIKDVICRYKTMQGYRVPRKAGWDTHGLPVEIAV
EKKLGLKNRSQVVDYGDGEFNAEARALVYHHIEDNREGWGKLTERMGYWV
DMDKPYITCTNNYIESVWWALKTIFDKGLIYKDYKIVPQDPKSETVLSSH
ELALGYKEVSDPSVYVKFRVKNSNESILVWTTTPWTLISNVALCVGSAIE
YVRVQNTESGEVLILAASRLSVLTEKNSEGESLWQVIGQLHGSDLVGMEY
EPLFNYLHPEKRCWYVTAGNFVSTEDGTGIVHIAPAFGADDYEIAKQYDL
PMLQPVARNGCFTAEITDYAGMFFKDADPLIIRQLKSEGKLYRKETITHT
YPYSWRYDVPVIYYARESWYIRTTAIAARMVELNKSINWCPPEIGSGRFG
NWLEENKDWALSRERFWGTPLPIWVAEDFVLGVSNPVGNMFAVGSIAELR
EGFIDIDGVQHPLGDALDKGLVVLDLHKPFVDNIYFIRDGRRFNRTPELI
DVWFDSGSMPFAQLHYPFENRELFDTMFPADFIAEGVDQTRGWFYTLHAI
ATLLFDKPAYKNLIVNAHILDKNGHKMSKSKGNVVDPFEMMERYGADALR
WYLMVSSAPWKPKSFNTDEIEEEQRRFFRALVNSYNFFVLYANVDGFTFA
EPSIAVAERSELDRWVLSCLNTLVDEVHQRMEQYDITAALRLINDFTVDD
LSNWYIRRSRKRFWKSDMGPDKIAAYQTLYETLMTLCKLLAPITPFIAER
LYLNLNNITKLEAHTSVHLASYPSFDGTVTDRPLELRMKKAQIITSLVRT
MREKASLKVRQPLRRILLAVTDSDAKEEYRLVRDIIMEEVNVQAIEYIEE
DDSVVSKKVKPNFKALGPRYGKDMKALAEAIRVMSHKQIALLEKEGQVML
ELDGKSFEVLRSDVEIMHEDIEGWLVAADETYGIMVALDTEITPELEQLG
LARELVSRIQTLRKESGLAITDRIVLTMHCSPKLHEAVCIHEAYITAETL
ATQLTTVPLEATIPQEALEQINGEACRLNVAKA
>Cag_0256 hypothetical protein
MIDVKPFENCVCQKMNNHLSLCKSELTLSEKGKSVTLRIRSSEEAKVLVL
DGCVFMDNSSRCDGLYLYKKGNKRYALLVELKGACDIPHGFEQLAYVKKN
RQEYRQIVDHFWVEAGGVQPIEKAFLVSNGSLSKPDLETLEKQHNIRVTA
ILHCEATSQIPDLRKYL
>Cag_0332 type II DNA modification methyltransferase M.TdeIII
MEMSMRNDLLTIAEASQWASNYLGKQVTTSNIAYLIQYGRVKKFGHNGST
KISKEHLCNYYATINRQREHSWKEQLGSDLNWSLSFDQYKEAETTKHVHR
LHPYKGKFIPQLVEYFLDDHTDDFKQQMYFTKGDIVLDPFSGSGTTIVQS
NELDIHAIGIDVSAFNTLIGNCKISSYNLKDLQQEINRITVVLKTYLKNS
SVVAFEEHLLQELALFNKKYFPVPEYKYQLRQGIIDEKKYGIQKEQDFLN
FYNSLVHEYGIILYQKNNHHFLDKWYLAPVRAEIDVVFQEIKKVQCKEIK
KILTIILSRTIRSCRATTHADLATLVDPISAPYYCAKHGKICKPLFSILS
WWETYTKDTIKRLAEFDRLRTNTYQICLTGDSRTINIIEVLEQRHPLLAA
LVKQQKVRGIFSSPPYVGLIDYHEQHAYAYDLFGFTRHDELEIGPLYKGR
GKEAKQSYINGISAVLNNCKHVLADDYDVFLVANDKFGMYPIIAENAGMK
IVNQFKRPVLNRTEKDKNAYSETIFHLKEK
>Cag_0583 conserved hypothetical protein
MLAQAQQIYDEALTLSPIEKVELIEHLYFSLDSKNSRQEVDKLWAEEAED
RLTAYENGEIKTTPASEVFAEINSMRPQ
>Cag_1231 conserved hypothetical protein
MNTVYKSEPRIIGVTISEDTLSASLEDGRTISVPLAWSWRLLEATPAERN
NYTIIGNGIGIHWPDIDEDISANGMLFGIPAQRTKNRESA
>Cag_0161 extragenic suppressor protein SuhB
MQQPSQELQTAIRAAQAAGAIMRERCGELSSSDIQAKESKDFVTVVDKAC
EAAISATIAESFPNDSMLCEEGTVMDGSSGRTWIVDPLDGTLNFIHSFPV
FSVSIALRDSNQQLVAGVVYQPILNELFTAERGQGAYLNGKRIAVSARTD
KESFLMATGLPFTNYSDYLDSSIAMLKEVIADSAGIRRAGSAAIDLAYTA
CGRFDGFWEYRLFPWDFSAGVLLVREAGGTVTNFSGSEDVFSSQSIIAGS
EVTHPLLLSIAKRSFSL
>Cag_0147 rubredoxin
MEKWVCVPCGYVYDPEVGDPDSGVAPGTPFEALPDDWVCPVCGMDKSSFE
QE
>Cag_0137 1-acyl-sn-glycerol-3-phosphate acyltransferase
MNFGTILFFLVLIPVMFAGMSLWLLISFFDKQGDIFHRIAAWWGGFSAKL
FGIKIEVIGSENYNPAQHYLVVSNHAGMADIPLLLGCMKLNLRFLAKEEL
GRVPIFGTALRRSGYVMIKRGQNREAMQSLLAAAEVLKSGRSVHIFPEGT
RSATGELLPFKRGAFLVAIKGGAPVLPVTIIGSHLITPKNSVKVNRGTIR
LIIGKPIDPTQFNSAESLKNACAEVITEAFQRYRFN
>Cag_1286 Aconitate hydratase 2
MSLIESYRQHTAERATLDIPPLPLTAEQTRELVVLLKQESVPEQAYLLDL
FCNHVSPGVDDAAFEKAVFLDAIVKGEVACNAISAVEAVRILGTMLGGYN
VKPLVDALSHSDTAICSEAAAMLKKTLLVYDLFEVVVEMAKSNKYAAEVV
NSWAEAEWFTSRPALPEKMTLTVFKVPGETNTDDLSPASEAFTRSDIPLH
ARCMLGTKINDPIGTIAALKQKGHPLAYVGDVVGTGSSRKSGINSVQWHI
GSDIPAVPNKRTGGVVLGGIIAPIFFNTAEDSGALPIQADVNALEMGDVI
EVNLIEGTISKDGAVVSRFELSPNTIGDEVRAGGRIPLIIGRTLTRKART
ALGMGDDTLFVRPEQPADSGKGYTLAQKMIGKACGLAGVRPGMYVEPATL
TVGSQDTTGAMTRDEVKELAALSFSADLFMQSFCHTSAYPKPSDIKLHRS
LPYFIMSRGGVSLKPGDGVIHTWLNRMVLPDALGTGGDSHTRFPIGVSFP
AGSGLVAFAAVTGAMPLNVPESVLVRFTGELQAGITLRDLVNAIPHEAIK
RGLLTVEKKGKKNIFAGRILEIEGLPNLKVEQAFELSDASAERSAAACTV
RLNKEPVIEYLSSNVKLLEQMIEEGYGNSDTIQRRIDKMQEWLKNPILLE
PDADAEYAAVVEIDMSTITEPILACPNDPDDVMTLSEVLADAKRPKTINE
VFVGSCMTNIGHFRALGEILRGKGAVPTRLWIVPPTKMDMNKLIEEGYYS
VFGAAGARTEQPGCSLCMGNQARVADNVVVFSTSTRNFDNRMGTGAQVYL
GSAELAAVCALLGRMPNKEEYMEVVSGTLTAKSSAVYEYLNFHDLKKADL
ELLVH
>Cag_1701 Isochorismate synthase
MPTTTPNDESHIHLTTREQLPLAIALEHLKDSVKAIRRSLPYHKAQKAQV
ITCVEAIEPIEPLHWLYVQQLYPRLYWKNREQNNEVAAVGIADSFTAKDD
DNNNSAFAAFQAACDERSPALRYFGGFRFNGAELSEEHWKHFPPFFFFLP
LVTVERFADQRYTVSANLFLAVGDSVDEKINQLIAFIEQLNSRIDNESVL
NTLLPSVEHITRTPNKAAWNAGCEKALNAFENGELEKIMLARQTTLRFTH
TFSPLLFLLRYPYPHNAIYRFYLELAPNQAFFSFTPERLFRRHESMVETE
ALAGTCSKELLQATDQHASQTLLGSEKDVREHNYVREMIYQELLPICSSI
DMEDEVQVLQLQRLAHLYTRCSATLKEKSIPTSDILKQLHPTPAVGGVPK
EEILEHIRALEPFTRGWYAAPVGWVDSHHAEFAVAIRSGLVNGNEVHLYS
GAGLVKGSNPDAEWEEIEQKIGDLLSITRMKQ
>Cag_0303 peptidase, M16 family
MAKPRKFFPVLLFAAMVLFFLHLTACSPTKTLMNSNSAYPYTTIQGDSLH
TRIYKLKNGLTVFMSPCYDEPRIYTSIAVRAGSKNDPAETTGLAHYLEHM
LFKGTDAIGSLDYHKEHPQLEKITALYEEYRSTANPEKRAAIYKMIDSLS
NVAASYTVPNEYDKLLSSLGATGTNAYTWVEQTVYINDIPSNKLDQWLTI
EAERFRNPVMRLFHTELETVYEEKNMTMDSDSRKIWENLFAQLFQKHQYG
TQTTIGKAEHLKNPSIKNVMEYYRSHYVPNNMALCIAGDFDPDATIRLID
EKFSVLESQPLARFTVEAEEEITAPRVMHVKGPESEELVMGYRFKGVNSS
DADYLTLIDKILFNHTAGLIDLNLNQQQKVLDASSMLVLMKDYSAHLLTG
KPREGQSLEEVQQLLMEQIELLKQGEFPEWLLEAAINDLYTEQLKQYETN
RGRVEAYVDSFIWGMEWQAYMQQIERLHKITKADIVAFARKHYSTNNYVA
VFKEHGTPESEAKIQKPPITPLTVNRDTISTFAQNLLERPSALTQPRFLD
YSKDISFYNVTDDITLHYVHNNENDLFSLFYVFDIGKNHSKKIDLALDYL
SYLGTSKLSPKAYSQEMYKIGASFSAYTADNYVYLKLSGLHKNAEAAIRL
LEELLMDAQPDEEALGKLKAGTLKERADDKLSKKKILFEAMANYGKYGAH
SPFTNVLSNREVEQVRSQELLDELRNLLNYRHRVLYYGPESAENVLSELR
SVRHYPATFMATPSLDLFKPLEVTENLVYVVDYDMTQAEVMMLMKDETYN
SATLPIVTLFNEYYGGGMSSVVFQELREAKALAYSVFSVYRTPKQKGEHN
YIISYIGTQADKLPEALEGIGDLMKTLPESPQLFETAQKGIEQKIATERL
IKTEILFNYEEALRLGHSHDVRKDIYDATQRMSLEDVKAFHKKHFSNKKQ
VMLVLGNRKNLDMATLRKYGTVRELTLKEIFGY
>Cag_0932 Aspartyl-tRNA synthetase
MSNPVEQQVVCNRFRSHSCGLLNATFEQQHVRLAGWVHRKRDHGGLIFID
LRDHTGICQLVIQPEQQALFAEAEHLHNESVISVEGTVILRAENAINPRL
SSGEIEVVISCMSIESNAHPLPFSVADELPTSEELRLKYRFLDLRREKLH
ENIIFRSRLTAAVRRYLEEKNFTEIQTPILTSSSPEGARDFLVPSRLHPG
KFYALPQAPQQFKQLLMVAGFPRYFQIAPCFRDEDARADRSPGEFYQLDM
EMAFIEQENLFEILEGMLEHITSTMSKKRITQVPFPRISYRDVMNRFGTD
KPDLRIPLEIQDVTSLFVDSGFKVFAKNTVPGCCVKALVVKGRGNESRLF
YDKAEKRAKELGSAGLAYIQFKDEGAKGPLVKFLSEDDMNALKQHLNLEQ
GDVVFFGAGKWEVTCKIMGGMRNYFADLFTLDKDELSFCWIVDFPMYEYN
EEAKKIDFSHNPFSMPQGEMEALETKDALDILAYQYDIVCNGIELSSGAI
RNHKPEIMYKAFAIAGYSREEVDQRFGHMIEAFKLGAPPHGGIAPGLDRL
VMILCDEQNIREVIAFPMNQQAQDLMMSAPSEVTLAQLRELHLKVELPKK
EEKKG
>Cag_1070 conserved hypothetical protein
MSKGREFDTREITVFQKNEITEHFIYKNLSTRVTGVKNRRILAQIADDEM
RHYNVWKNYTRKDVAPDRVKIWFYTMISMLFGFTFGIKLMEKGEHSAQMV
YSRMPSSYREINGIISDEEEHEQALMGLLNEERLKYTGSVVLGLNDALVE
LMGVLAGLTFALQNATTIALTASITGIAAALSMAASGYLSTKSEDDGRNP
VIASFYTGMAYLLTVLALIAPYILIGDRFISLGVAFFIAICIIGFFNYYV
SIARDLPFKSRFFEMVGISFTVAALSFLAGYAINYYFPGLTM
>Cag_0164 Crossover junction endodeoxyribonuclease RuvC
MIVLGVDPGSLKTGYGVVQHHNGSFSVLAAGVIRLQAAWSHPERIGIICR
ELEQVIAEFQPERVALETAFLSHNVQAALKLGQVRGAVIGLVVRYALPIY
EYAPREVKSAITGKGAATKEQVAFMVSRMLSLHTVPKPHDVTDALGIALC
DILRGESRQSGVPPRTNSRRKSGTGGSWEQFVRQSPNVVVRS
>Cag_0751 hypothetical protein
MKINIIDPGLFHQAGHHLDLDIKVCKVLKNLGHDIAIYSSINVTDKIKSF
FEHYGEVTPIFKAIPYFNVENIDRLAGGYIAFNRQSKILAEDFTKVASAD
LWLFPSIFCAQLNACAISGSKTPIAGCIHLSATKEYAQDEMFWRYALINC
KDRSIELNLGVMEPEHLMEYQKIASNDFEIISFPIPYEGVPIERPRKTLR
KVGFFGHQRDEKGIHLVAKLTDLLTKKGYEIVIQDSTESFKSTGYSNVTI
LGYVANIALEISKCDCVILPYNPIKYQTKGSGILWDALASGVPVIAPVGT
AIGRWIQYCGSGRLFHEFDVNSIIHQFELLSENYDEYVSIAIKNASIWSV
KHGIDKFVSSLLNFKKSC
>Cag_1757 Mg chelatase-related protein
MLSILNAAALTGIDALKVTVEINVTSGLPAFTVVGLPDNAIRESRERVMT
ALRNSGFPIPPKKITVNLAPADIKKEGTAFDLPVAIGLLGSLELLNTHLQ
DTMIMGELALDGSLRRISGALPVAIMAGKEKIRRLIVPQTNAAEAAVAVS
AAQSGIEVYGVESLNQTVALLNGTFSMNPVTVDVASLFEGEQEYPVDFSD
IKGQTAAKKALEIAAAGGHNVIMIGPPGSGKTMLAKGLPGILPPLGFEES
LETTKIYSVASLLEKDRPLMVRRPFRSPHHTTSNVALIGGGAMAKPGEVS
LAHNGILFLDELPEFSRTALEVLRQPLEEREVTVARIAVTTKYPAGFMLI
AAMNPSPAGALKDRDGNFTAPPQQIQRYLSKISGPLLDRIDIHIDVPKVE
NAELFTTTAAEPSATIRQRVIAARAIQLERFANTSAPRIFTNAQMSTRQI
KSYCQLDKECSQKLMNAMNRMNLSARAHDRILKVSRTIADLDKSEHIALP
HLIQAIQYRNLDRDFWSF
>Cag_0114 Elongator protein 3/MiaB/NifB
MAYFKRICLVEAPQTIITPFPRFISDSIGICYLAAAIKNDVEYLVIPENF
YNDAIFESMRRLLKEQPCDLVAISSMTGAFNQAIKLAKIAKEAGAFVIMG
GFHPTALPEEVLKHSCVDVVVIGEGEETFRELVKNGPSRNVRGIAYRDNG
GIVRTEPRPVITNIDTIPFPLRSLRPKRYGETGDAYSIDTIYTSRGCPWS
CSFCANDQMHKAWRGRSPENVVEEIALLHDPKNKKLIKIWDANFLTNIGR
AEKICDLLLEKGLTNFRIVTETRAKDIVRAERILGKLRQIGLSKAGLGIE
SPNKETLALMNKQNSLNDVSKAIELCRKHGIGTEGYFIIGHYSEDMTATL
AYPEFARSLGLRQALFMVMTPYPGTAIFNEYKAEERIRSFDWDLYNNYCP
VITTRHMNSDTLVAMLAYCNVAFNGYRSVMKRNNMNALLLNFMVDLFQLT
FLLRVNKNISEQAIIDLLFNALQLFLEEKKGIVLQTDMKYKKQLQHPLHL
RLEHPSGKSIDFTLSEHNSRRELTMQHNTSNMVPLPVGSRCIKLGTLVAC
ACSISMDSLMTALYQNEWAHNNPKRITAPLLSLLTDRSLQHFGRNVALLY
LRVLLSGKK
>Cag_1761 conserved hypothetical protein
MSKQLALCWRYHAGNNALIWQIMFTESGLLVGQKRLVAEKKALFFALNET
SGEVAVDDFVLMEHGEDSTIPAGEGWFTGIVTTRHALTYCYATAPESPEH
LGMWAIDFREGKVVWSKAGASFVAHSKNAILAVATTFFAGFPERHFVLLE
PTTGNEQPATLTIEQVNAIRAAAEPEEVRQGVMMPDLANELLLAAFPIIT
EHVGNKQLLCCELLTHGNWLIATLHYPSATPNCCDSYLAMWQGDTLRYHD
YMEQAATRPLLNSVIVHNKHLFYIKAKNELCCLCLSTHHANHTDA
>Cag_0826 conserved hypothetical protein
MESNGFFPFRSELLKASIGLATTITETSEQPIFDELKIRRYNVPADEIAS
FILTKLDHWFGWNMLSDRPSKNNTRLIRADVGSILLFGLKIKVTYGLYEE
KDANERPITSVHAHAETSIESKGDLGESRRVIRMMLSALDFNYLTEQLHD
EEYQSRSLDCAATRYILEQMFVVEPEPEPAKPAPSKVPKATVIELRKPNP
VQTIPLITKPKSNEADVVALLEPMETEQPAANVAALEEETYQSSATSAAG
DDKPAKPKIIVVTSKKNQ
>Cag_0887 conserved hypothetical protein
MNELIVSFGGGKKVNADFRGFTIHTDQSPLAGGEGSAPEPFTLFLASIGT
CAGVYVLSFCQNRDIPTDNIRIVQSHHPKESGRGIGKIEITIELPADFPE
KYKEALVSAANLCAVKKHILEPPEFEVKTVVR
>Cag_0512 Glucose-1-phosphate thymidylyltransferase, long form
MKGIILAGGSGTRLYPVTRGVSKQLLPIYDKPMIYYPLTTLMLAGIRDIL
IITTPEDQAQFQRLLGDGDDWGISLSYVVQPSPDGLAQAFLLGEKFIDGD
DVALILGDNIFFGYTFSSILERAVQSVTQEQKATIFGYYVSDPERYGVAE
LAPDGCVRSLEEKPQQPKSNYAVVGLYFYPPNVVEIAKTIKPSERGELEI
TTINEVYLQEGNLHCSLLGRGFAWLDTGTHESFQEAGNFIQTVEKRQGLK
VACPEEIAWRAGWISDSDVQRLAEPLMKNQYGQYLQQLLHKRDKL
>Cag_1282 UDP-N-acetylglucosamine 1-carboxyvinyltransferase
MNKLVIQGGRPLSGTLTASGSKNTSLPIIAATLLHGGGTFTLHRIPNLQD
IDTFRQLLHHLGAESSLENNTLTISTANVNSILAPYELVKKMRASIYVLG
PLLARFGHARVSLPGGCAFGPRPIDLHLMVMEKLGATITIETGFIDAVAK
EGKLRGAHIHFPISSVGATGNALMAAVLAEGTTTLTNAAAEPEIETLCNF
LIAMGATIRGVGTTELEIEGTSSLHAVTFNNVFDRIEAGTLLAAAAITGG
DITLLEANPRHMKSVLKKFAEAGCTIETTPDSITLKSPETLLPVDVTAKP
YPSFPTDMQAQWIALMTQANGTSRITDKVYHERFNHIPELNRLGAGIEIH
KNQAIVHGIRKLSGGPVMSTDLRASACLVLAGLVAEGTTEVLRVYHLDRG
YENIEMKLRTLGASIERQKYQEF
>Cag_0386 haem catalase/peroxidase
MNEERKCPITGATHKPSAEKGRSNHDWWPNQLNLKILHQHSALSNPMDKD
FNYAEEFKKLDLAAIKQDLYALMTDSQEWWPADYGHYGPLFIRMAWHSAG
TYRTSDGRGGAGTGSQRFAPLNSWPDNANLDKARRLLWPIKQKYGRQISW
ADLMILTGNCALESMGLKTFGFAGGREDIWEPEEDIYWGTEGEWLADKRY
SGERELEKPLAAVQMGLIYVNPEGPNGKPDPLAAAKDIRETFARMAMNDE
ETVALIAGGHTFGKTHGAGDASQVGPEPEAAGIEEQGLGWKNQYGTGKGK
DTITSGLEVIWTTTPTKWSNNFFWNLFGYEWELTKSPAGAYQWTPKYGVG
ANTVPDAHDPSKRHAPAMMTTDLALRFDPDYEKIARRYYENPDQFADAFA
RAWFKLTHRDMGPRSRYRGAEVPVEELIWQDTIPALDHELIGADEIAALK
ATILASELSIAQLISTAWASAATFRNSDKRGGANGARLRLAPQKDWEVNQ
PDELQKVLQVLETIQTEFNASRNDGKKVSLADLIVLGGCAAIEAAAEKAG
YKVTVPFTPGRMDATQEETDAHSFAVLEPVADGFRNYLKAKYSFSVEEML
IDKAQLLTLTAPEMTVLIGGMRVLNTNAGHTTHGVFTKRPETLSNDFFVN
LLDMGTVWKATSEASDIFEGRNRSTGELQWTATRVDLVFGSNSQLRALVE
VYGCKDSQEKFLNDFIAAWNKVMNLDRFDLSGL
>Cag_1252 ferredoxin, 2Fe-2S
MDKPKHHILVCASFRAQGTPQGICHKKNSLALIPYLESELADRGMSDVTV
SATGCLNLCEKGPVLVVYPENFWYGEIDSEEKIDEILDALEEGEVCEDLI
IT
>Cag_1080 conserved hypothetical protein
MKPLPVGIQTFSKIIEDDYLYIDKTDIAKNMIEKYQYVFLSRPRRFGKSL
FLDTLQNIFEGKQELFKNLLIYNQWNWSRTYPVIKISFSGGIRDKESLHK
NLFYILKDNQERLNITCEEKNDPNQCFAELIKKTFQTYQKSVVILIDEYD
KPILDNIENIAEALIIRDGIRDFYTKIKESDQYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPDFGNICGYTQHDVDTSFAPYLEGVDMEAVKRWYNGY
NFLGDKVYNPFDILLFIKNHKMFKNYWFETGTPKFLIDLIKKNQYFVPEF
NGLKADESLINSFDIEKLTLETLLFQTGYLTIKQLLLSDVGVSYELGFPN
KEIQISFNNYILQSITQNSQKESIRHELLAIVKAGDIGNLEQIIKRLFAS
IAYNNFTNNYIESYEGFYASVLYAYFASLGFDIIAEDITNKGRIDLTLRS
LDKTYIFEFKVIAEEPLEQIKKMKYYEKYNGERYLIGIVFDPKERNVSQF
AWEKI
>Cag_0017 Orotate phosphoribosyltransferase, Thermus type
MSTSSAMLDVFKATGALLEGHFKLTSGRHSNTYFQCAKVLQHPQHLSAIC
GAIAERFAQSGVQTVISPAIGGIVVGTEVGRQLGVKTIFAERKDGAMTIR
RGFSVEKGERFLVIEDVITTGGSVAEVMALITAAGGEVVGVGSVVDRSNG
KVKLAENQFSVLSMEVISYTPEECPLCKQGLPIDAPGSRANQQPV
>Cag_0205 bacterioferritin comigratory protein, thiol peroxidase, putative
MIAEKSKAPAFSLPDSEGKMVSLADFSGKKVLLIFYPGDDTPVCTAQLCD
YRNNVQEFTKRGIVVLGISSDSVASHKSFAERHELPFTLLSDSEKTVAAA
YDALGFLGMSQRAYVLIDESGSVLMAYSDFLPILYQPMKDLLAKIDGM
>Cag_1546 glucose-1-phosphate thymidylyltransferase
MLKELIYSAFSNKIGHRKRDCNKAMKAIIPVAGVGTRLRPHTFSHPKVLL
NVAGKPIIGHIMDKLIAAGITEAIVIVGYLGDMIEEWLLQNYDIKFTFVT
QSELLGLGHAISMCKPYIPEDEPLFIILGDTIFDVNLEPVLKSTCSTIGV
KEVVDPRRFGVAVTENGAIVKLVEKPDTPVSNLAIVGLYLLQHSAALFKS
IDYLIEHNITTKGEYQLTDALQRLLDEGEKFTTFPVQGWYDCGKPETLLA
TNEILLSDNPPSKTYPGCIINDPVFIAESAKLENAIIGPYTTIGEDVVIK
DAIIKKSIIGNKAQVKHIMLGNSIIGNNAIIRGTPHEINIGDFSEIRVS
>Cag_0426 conserved hypothetical protein
MLTNSKHVVSRPNGGWAVKTAGTTRAGRVFENKIDAIKYARDAAKKIQGE
LYVHNTDGTIMEQRSYGNDPFPPRDKK
>Cag_0827 Adenylate kinase, subfamily
MKIILLGAPGAGKGTQSTSIAEHFGLQRISTGDMLRNAAKEGKPLGLEAK
KIMDAGQLVSDDIIMALIKECLSKDECSCGVLFDGFPRTIGQAEALRAEE
IHIDHVVEIYVQDEEIVKRMSGRRVHLASGRSYHVMFNPPKQEGLDDATG
EPLVQRADDSEETVKKRLQIYHAQTAPLVEYYTMLEQKGESNAPRYHKIE
GVGTVEEIRNRIFAALES
>Cag_1736 hypothetical protein
MNERVFLKILLTLLSSGAIAEVMVVLLLGWHGEALLFVFFMSCFAVAIAL
VLHKLYGTAEASGAIESVSARRVRAMQSEELRSRLGAYSVDDEFLAGTPL
RAKSTSSTYEKSDVEAMIRKFAPHVGGLSRLLQMVQERDEASFAAVAKQA
GLANVERQTVIDYIHIMLNAEKECESNTKSGEATSPLTEFTMERESFDSY
IQRCMSGEGDDGDLSDELSVGLENPLTPKSVGIPPSDFSHSPTSIMESLK
KRAGRVP
>Cag_1272 hypothetical protein
MSTTTLSSRRSKKSWHHANPIQKAHHVVREISIAPIIEEQINLSVADQTL
LISTLLNPPAANEALQKAFAHHQRLVQKY
>Cag_0434 hypothetical protein
MEIVCSKWLSDFIYIMTEYLYPFQEFLKIVPGLLIFPISLYLGLQKIGTK
VNARISFSSNPTSPSTVRYIDLRNMKERTIVIYSIFASVNDEEILLELKQ
CNPPIILKAHQSTIVEIPTYSFMQLDGSRLDVSKLPLKKTTIYLELAHTT
IKCHTLNYQKSVSRKLSKYQILEKEDHYPDALPYNKHAIYAITYSKNSRI
KTATINEAGTFHGDWTFSLNQLPISALISKESVESYLQSMNFSNEVEWFK
VYYLNHCSS
>Cag_1832 Ribosomal protein L15
MNLSSLRPAKGSVRNKKRVGRGQGSGNGTTAGKGNKGQQARSGYKRPIIE
GGQVPVYRRLPKFGFTSRSRKTITTINLLQIEQWLEQGVVSNELTVQDFK
TLLHASNTDYFKVLGAGEVTQPITITAHFFSKSAEEKIAAAGGKTIIAFR
TLAEAVNIKGLPIEEALLKEKVKLVKVKKAKPTA
>Cag_1382 carbohydrate kinase, PfkB family
MSILIVGSLAFDDIETPFGSSPNTLGGSSTYIALSASYYNSTIQMVGVVG
NDFTDEHFALLHAKNIDTAGIQKIESGKTFRWAGRYHYDMNTRDTLDTQL
NVFADFDPVIPEQYSNARFVCLGNIDPELQLKVLDQINTPELVILDTMNF
WIESKLEALKKTLQRVDVFILNDSEARLLSGEPNLVKASRIIRAMGPKTL
IIKKGEHGALLFTESGIFSAPAFPLEDIFDPTGAGDTFAGGFLGHLSRCE
NINDHELRRAVLYGSTMASFCVEKFGTEKIAALTNDDIEVRYEQFRDLSR
IEG
>Cag_1674 adhesion protein, putative
MKADMFMQERHKRAIWWLVITWLLLVMGGCSAKQHFPQTNQKVQVVTTIA
PLSYFVQRVGGNHVAVSVMVPASGNPHTYEPSPKQLAQLTKTTLMVGAGS
GVAFELDWMQRLLELNPQLRFCKAAEGVTFRKMAAHHHAREAHNEHNEAE
QIDPHYWLAPANGVIITQNVAKALTEADPAHKAEYEANAATLTAELQALE
AELRAQLAPLKRRRFLVFHPAWGYYANAFGLEQLSLEVEGKTLTPRQMER
VITFAREQNIHTLFISPQFNTMQAATIANDIQGTTVTVDPLSVDYQQNLR
QATKAFVQAMQ
>Cag_0739 hypothetical protein
MKKIYTKLQKTVTATTIGALLFIAPKVQAQEVTYNTEGWYGTAALSKIID
TESSGMQANLGSGVIRPGEIDYNGNFAGALAVGHENSFCRKNNTPIYLRT
EGEYLMGSADRKSATVDQYHAVLDDSVDFRALFANALLGIEDTQHTRWWL
GGGIGYGWVDRPAITGCSSTCSFAAATTDGFAWQLKAVVERTISKDAALF
AEARYVALPGESNSTSQCYDDINVATLGIGFRSYF
>Cag_1102 Gamma-glutamyl phosphate reductase GPR
MKDQIIQKLHDVVRASRELVTLSDEAINRLLLDLADQIPNHQTAILEANQ
RDIERMDSANPMVDRLLLNEARLQAIAADIRNVASLPTPLDAVLEKRTLP
NGLELKKVSVPMGVIGIIYEARPNVTFDVFALCLKSGNATVLKGGSDAMY
SNIAIVELIHSVLQQHGINPDTLYLLPAEREAAAVMLGAVGYIDMIIPRG
SQKLIDFVRDNSRVPVIETGAGIVHTYFDVSGDLELGKQVIFNAKTRRPS
VCNALDTLVIHRDRLGDLSYLVEPLQSRQVELFVDDAAYQELHGFYPKAL
LHQAKPEHFGTEFLSLKMSVKTVANLDEALEHIATYSSRHSEAIIATDAA
TVATFMKRVDAAVVYANTSTAFTDGAQFGLGAEIGISTQKLHARGPMALR
EMTTYKWLIVGNGQVRPA
>Cag_0003 RecF protein
MKLQRTIFSGFRNHTSLLFEPSEGVTIIYGANGSGKTSLLEGIHYGALTK
GLLGAPDSECLSFDTEAFTLDSHFLSDSNIPIHVLVTYQLEGEKQVIVDR
QEVKPFSSHIGRIPTITFSPYEISLVSGPPAERRRFLDSAISQLDHRYLD
RLITYRRILQQRNALLAQLSSGEKSNRNTLPLWTTQLAELSAWLVERRLL
FLTSFSPYFQHYYRYIIKGEEPSINYRCTSCPLHGNTTFQELYQLFLQRY
SDIEAQEIQRGQTLFGAHRDDVLFFLNEKEIKRYASQGQLRSFLIALKIS
QAHLFADHLHEQPMCLFDDLFSELDGGRIEQILALLKECGQTIITAVEPR
YTEGITLCDIQALR
>Cag_1506 conserved hypothetical protein
MQSSRRVLFTQEFKRNIKNLAKKYRSIRADLTPYITQLENGELPGDQIPG
IEQHVYKVRVPNQSAGKGKRGGFRMIYYLKLKEEIILLSMYSKTIKTDIS
IEEITQIIKNNS
>Cag_1137 HhH-GPD
MEDWLPSKRQIEQLQAKVFAFYGEHGRSFPWRNTTDRYAVMVSEVMLQQT
QAERVVERFEAWLVAFPTVQALADAPLREVLALWSGLGYNSRAERLQRCA
QTIVADFGGVVPALPEVLLQLPGIGAYTSRSIPIFADNFDVATVDTNIRR
IVLHEFGLPETLKPRELQMVADRLLPHGQSRKWHNALMDYGALHLTSQKS
GIRPLTRQSKFQGSRRWYRGQMLKALLKTEALPLEALEATWADSPYCLRD
IASDLVREGLVEYHPSASADDSPLLRIRGSG
>Cag_0807 DNA-methyltransferase, type I restriction-modification enzyme subunit M
MFEQTFKNIDDILYKDAGADSELDYIEQTSWILFLRYLDELEKEKADAAA
LQGKTYQFIIDKVFRWNYWAMPKTADGKLDHHSVMTGTDLVQFVDLQLFP
YLASFKLKAIENPKSIEYKIGEIYSELKNKVKSGYNLREIIEMIDTLPFG
TSKDKHELSHLYETKIKNMGNAGRNGGQYYTPRPLIRAIINVVNPQIGEK
VYDAACGSAGFLCEAYSYMYERMEKTTTNLKTLQENTFYGKEKKNLAYII
GVMNMILHGIEAPNIFHTNTLTENIRDIQEKDRYHVILANPPFGGKERKE
VQQNFDIKTGETASLFLQHFIKSLKAGGRAGIVIKNTFLSNADNASVSLR
KHLLESCNLHTILDMPAGTFLGAGVKTVVLFFTKGEPTKKIWYYQLDTGR
SMGKTNPLNDDDMQEFKQLQKTLADSDNSWTVDISSINTSNYDLSVKNPN
GKVEQTFRSMDEILAEMEALDEETSEILNGIKEMFS
>Cag_1084 ADP-L-glycero-D-manno-heptose-6-epimerase
MIIITGGAGFIGSAMLWELNAHGEEDIIIVDELGSTTTQQWRNLSGLRYS
DYIHKNDFIPLLERNALKGITAIIHMGANSSTTETDADHLMSNNFGYSKS
IATYCMEHHIRLIYASSAATYGDGANGYSDDSNGITTLRPLNMYGYSKQL
FDWWALQQGVLNYAVGLKFFNVYGPNEYHKGDMSSVVYKAFHQIHQNGSV
KLFQSHRPDYGHGEQSRDFIYVRDCTAIMLWLLEHPLGGLFNVGSGVARS
FRELVTATFAALGKEPAIEYIPMPETLRDKYQYYTCATMEKLRQAGYTAP
FTSLEEGVRDYVQTYLSATSPYLGKG
>Cag_0652 GDP-mannose 4,6-dehydratase
MHKVINNVFNCLIMSSHSSFSNNKVALITGITGQDGAYLAELLLGKGYIV
HGIKRRASSFNTQRIDHLYKDPHDIQKNVADGTQHSALYLHHGDLTDSSS
LIRIIQQTQPDEIYNLAAQSHVAVSFEEPEYTANSDALGALRILEAIRIL
GLEKKTRFYQASTSELYGLVQEVPQKETTPFYPRSPYAVAKLYAYWITVN
YREAYGIYACNGILFNHESPVRGETFVTRKITRALARIKLGLQQCLYLGN
LEAKRDWGHARDYVEMQWLMLQQEQPEDFVIATGIQYSVRDFVNAAAKEL
GMAIRWEGEGVDEKGYWAVEAYSDMPQQIVPIIEVDPRYFRPTEVETLLG
DATKAKERLGWVPKTTFDELVAEMVREDLRSAERDELVKQSGFKVLDFNE
>Cag_0735 hypothetical protein
MGQFLAIGLVTQIGVLKKELAAAQLTTDQLQERMKAELPYNPELYLLHEH
TDYYSFDLRDEIFYAQLLPLLEEFYPSFYNSPEMYESILAKLRKLPPSEW
FAWAKRKPEEAFQFDPYGMRETIEEGFTDISLHYEAILLTMNGKIVMEAY
GSLFRFLNYTMKQTFKQYSLASALRLYITG
>Cag_0470 hypothetical protein
MNILTSKRLVVTALVLLTGLNVALLGVIWWQNKQTTTTPPCTPNSKSYRT
KASPLAPLNLSAEQRTQFRTLRKEHQQSISDEMAEMALLKKSLIRESLKE
QPNQATIEKLSRSIGSLQAKVEEERGRHFHAMAKICSPEQRDSLQTMLER
FATKRHGKRGNNSNSAWQRR
>Cag_0471 hypothetical protein
MKKEITIALSALLLAVGATNVEARPGMMRNGGNAGMNNNCQQMMMQERLD
VTDKQQEQLDALRVKYFEKLSAERRKLMTLERELNTETLKSTPDKGQINK
LADQIGKQYSELMRLKSTHMADISAILTPAQRDSMRAWKNFRPMRNGAAH
PMMMCP
>Cag_0707 conserved hypothetical protein
MKIEYHPAIEDELRQIIKYYKESSAGLGTEFLNEFERQVLKIADNPRRWV
AVKGTIRRSLMRRFPYVIYFRLVNDETLRVTVVKHQRRHPKKGVNRR
>Cag_0131 Phosphoheptose isomerase
MANSCRCAGGLHDRSRYDEVVLDVMLYSARLHETVARQNSAVIVAMARMI
AESFDEDGKVLLCGNGGSAADAQHVAAELTIRYRSSVSRPALPAIALSTD
TSALTAGANDLGYDNVFARLVEAYGRKNDVLIALSTSGNSASVTKALDMA
QRLGLKTIALLGGDGGEMKGKADLELIVPHSGSADRVQECHITVCHVLVD
LVERMLGYRC
>Cag_1727 conserved hypothetical protein
MNRFYAGDKIIYRKPKSSFSPGPRARDIYPLAHGEAYHYIVDKYWKVEKV
YADGTLEVVTRTGKTNRLQANDPNIHKAHLLQRLFYKKRFPSSNVAAQS
>Cag_1511 hypothetical protein
MSMKLILTVFALFLATQIADAKDVYVNGYYRQNGTYVRPHIRSSPDAYKS
NNYGPSKNSYELMNPKARDADRDGIPNYQDNDDNNNGISDNNE
>Cag_1232 putative aldo/keto reductase
MEFRQLGQSGMAITPIGFGSWAIGGDRWQYGWGAQDDGAAIAAIHRAVER
GVNWIDTAAVYGLGHSEELVAKAVAGAEHPPYIFTKCGLVWDSARQISNC
LKADSIRRECEASLQRLRVNAIDLYQIHWPNPDEDIEAAWETMAQLQAEG
LVRHIGVSNFSVAQLQRILPIAPVASVQPPYSMLRPAIEAELLPFCKEQQ
IGVIVYSPMLSGMLSGAMTKERVANFAQSDWRRNNKEFQEPRLSANLELV
ELLRTIGARHGVSAGEVAIAWTLRHQAVTGAIVGGRSPEQVDGVLGAAAL
HLSENELAEIEKGITN
>Cag_0334 ATPase
MFEARNLSLSIGTKQLLNDTSFRIGDTDRVALVGLNGTGKSTLMRLISNT
SPDSSTLRVGGDFIKSADTTIGYLPQEISFEDDLEKSALHYALQANKELF
DLSETITRFEHELALPEHDYESEAYHRLIERFSDAMHNFERLGGYTMQSD
AEKVLAGLGFSEIDFHKKVKAFSGGWQMRLHIAKLLLQNPTLLLLDEPTN
HLDIDSLRWLENYLTNYEHSYIIISHDRFFLDKLTTRTLEIAFERINEYK
GNYSTYEKEKVERYELLMSKYQNDLKKMAELNSFVERFRYKATKARQAQS
RLKQMEKLEKNLVAPEEDLSQISFRFPKAQPSGREVMRLDGVKKSYTLPD
GSRKEVLKRIDLEIMRGDRIAIVGSNGAGKSTFCKILANELDYEGKLTTG
HNVSLNYFAQHQTDTLATEKSIYIEMMDSAPNSEAQKKVRDILGCFLFSG
DTVNKKIKVLSGGEKSRVALAKILLQASNLLIMDEPTNHLDMRSKEMLIE
SLENYDGTLLLVSHDRYFLDSLVNKVVEIKNGTLQLYLGTYAEYLEKSEK
TRQAEEQAEALQRQKEQAAAKAAIKAEEQRAAAATPAPAKAKNSKKLEAI
EKKINQLEQQKEEMERIMATEDFYKKSKEENARTLEHYHKLCDELNALFA
EWETLG
>Cag_0543 conserved hypothetical protein
MQLPCIVPSTLRRYLPTLFVTSLALLPLTPISVYGEMAEDLAALSSSSES
DYNSEVALALLEELRHHPLSINRATANELRQLPWLSAADVHAIIKYRTQK
GAFRSLSELETILGKERATWLSPYLTVEAAPVPAKTTVRPKATSTSKKTK
KVATTGSLYSRYFTEMPPRKGILTEKYEGGNSKMYHRAQFYAPHVSASVV
QEKDIGEAAITDFTSLSVSVADVGMMERVVLGNYRLTLGQGLMIGQGRFF
SKGAEVGGRLTTKTLMPYASASEEGFLQGAAATLQIQPIALTLFYSANQR
DAIINKEGVITSLSSSGYHRTTLEVSRKDNITENVMGAHLRYRTAVAGME
ATLGGGMMNYSYPYPFDELEPNEPVSTVLGATLTNVDATLSFGSGALFAE
AAFASDPHDMAWFAGAEYEPLRGVTAVAALRRYGENFYSPFANAFAERGG
GSNEEGLYTAVQAAFSKKVTLGAYYDRFTFPQLGSHYQQAADGFDARAWF
SWQQSSLLCWNVQVQHKEKPEEKNQGTTKNPIWTPLPILTDRLQLNCEVT
PHKGISLRTRFELKNVDKEYLLATQSFTGKMWYQQVGYRTENFSLKGRFT
RFTTTDYAAAIYAYEDDLPLTSSLGMYSGDGSSLFAVATWQPMKQMKVAA
RYEVTRYNDRDVYSSGNDERATNAPSSLHVGCMLSF
>Cag_1386 transcriptional regulator, XRE family
MEKINNNLTSWDDHLDNKYGKQGTTTREKYEQEFEAFKIGVLIQEARKKQ
HLTQEQLAEKVGTTKNYISRIENNASDIRLSTLMRIISEGLGGHLKLSVD
I
>Cag_1927 Twin-arginine translocation pathway signal
MQLSRREFLRLLGIASAAGLLPNLGNSQPRSAGMYQLPTFGDVRLMHITD
THAQLLPIYYREPSVNIGLGEAAGRPPHLVTKALLQYYGIAKNTPLAHAY
TPLNYVEAAERYGKLGGFAHLKTLVEQLRGEFGAPKTLLFDTGDTWQGSG
TALWSRGNDMVEACNMLGVDVMTGHWEFTYLEKEVRKNLAAFKGDFVAQN
VKVKDDALLNGAEAFNDQTGHAFRPYVMRQVGAHRIAVIGQAFPYTPIAN
PSRFIPNWTFGINAADLQALVTSIRSKEKPDAVILISHNGMDVDIKLARE
VSGIDVIFGGHTHDGVPQPFVVRNASGRTLVTNGGSNGKFLGVMDLKLGN
GVVKEFRYQLLPVFANELKEHSGMKTLIEQIRAPHLATLQKPLATAGTLL
YRRGNFDGPFDQVICDALRKRNDAQIALSPGFRWGTTVLPGQTITMEHLL
DQTCMTYPETYVREMSGMDIKNILEDVADNLFNPDPYYQQGGDMVRVGGM
HYTIDPTATMGKRISNMRFENGNILDASKKYRVAGWATVGATPSSGEPVW
ETVAAHLQDVKTVEITKLNTPTIKNLGNNSGFDRS
>Cag_1682 conserved hypothetical protein
MNKNFSITVTNLSFMASSRPWFHHPWQFKESALFTIFLLVAGFGLQFIAQ
GAPLQAPRLPMNAMLLTLYALLLFGVGLSFRNRPFVQWLGSIPLGLSLIV
AIALLSLIGGIVPQEGEMAVAWMAQWGFYHIFTSIPFVLAILLFLANLGI
ALSWRIIPFSVKNLQFILFHGGFWLTVAGSMAGSSDLTRLVVPLYEGRAG
NLGYNREGNATEQLPFAITLHDFSLEEYPPQLLVYNPKTDTMLLERSKSV
TQVRKGMEASWNNLHVKVLEYFPSALPTSNGEAVATTSNDGIPFANVAVT
TPTGTYTTWLTTGSPTIKPDAAEINGILLIMAPGAPKAFRSGITLEDSNG
NKRDAMLEVNKPVDMMGWKLYQMGYDEGAGRWSQLSLVEAIRDPWLPVVY
TGFFMMMAANLLFFWNGIKSRN
>Cag_1998 nitrogen regulatory protein P-II (GlnB, GlnK)
MKLITAIIQPDRLDRVREALIQADITRITVSRVTGHGRQEDIELYRGQQI
APNLLPKVRIDIAVNNEFVDVTVDTIIEAATHGVGEIGDGKIFITNLEEC
IRIRTRERGGKAI
>Cag_0661 hypothetical protein
MQAVKALYKEGNIEFRANSKEEFMALGLGSFFDTDEDNNVDWEAMFVGAT
LLEKTQEYFHRETVLL
>Cag_1518 GTP-binding protein TypA
MSAAKNIRNIAIIAHVDHGKTTLVDSIFQVTGTFRENQQMSARMLDSNPQ
ERERGITIFSKNAAARYKDHKINIVDTPGHADFGGEVERILKMVDGVLLL
VDAFEGPMPQTKFVLRKALELHLKPIVVINKIDRPQSDPIKVHDQVLDLF
IALGAEEEQLDFPYLFTSAKLGIAKYSMDDEDGDMSLLLDMIINELPAPV
TNDEDGFQMLVTSLDYNDYLGKIAIGRIHRGTVTPGSQLVVVDPDGILSK
GSVTKLFQFDRLQRVEATEASSGDIIALAGIAEATVGMTLAAADKPDPLD
SFDISKPTLSMLFSVNDSPFAGLEGKEVTSRKIRERLMKEKMTNVALHVE
ETDSADTFRVAGRGELHLSVLIETMRREGYELAISRPEVIMHRDSDGHLL
EPIEHVTIDIPEEYTGVVIEKMGRRKAEMTNMTTLRGGMTRMEFEIPTRG
LIGYNLEFTTDTKGEGMLSHVFHNYQPFKGKLPARETGALVSSETGICVA
YALSSLDDRGTFFVPPNTKVYGGMIVGESTRGLDITMNVCKTKKLTNMRA
SGTDESLRLAPPRLMSLEQALEFINDDELVEVTPQSIRLRKKILDSNLRA
KAVKKAKE
>Cag_0790 hypothetical protein
MREIIMNKKIALALLGLALPLSAQAVEFRTPGTALGIGGAGVARNNGGLT
SYWNPAAGAFKDSPFAVGAGVGAGLKINNGLAENVDNLSKLDFDDITKFN
NSVDDVGNFTKAVTIMDDISKSGGNIGITGQVPIGVSINQFSFGIYGNMS
GYIMPIADITNIVPTANAGGANITVNDLNTSLGANTYTPSGYFTTAQLAA
LSAAITANQTGALPAGAADNLANAIDNQLKESGIPADQALATLTTTALPV
LNAASANTFNQNTTSVLTKAIQYVEIPVSYGHPIKLGKKSTLGVGITGKV
ISGTVYQSQVLLVNNNNVDASDIIEDIDTNKKTSSAFGIDLGLLYKYDKW
LNVGLVAKNINSPEFDAPDYNAPKYDTISGQVLINDLKKGDAVKLKPQVR
AGVSADVLPIVNVSADLDITENETVAPSVVGLTAPKSQNLGGGVEVHPAS
WLKIRGGAYKNLSASKGGTVLTAGFKIFMLDVDGAFATDTMEFDGNEIPQ
EAQVNASLNFSF
>Cag_0986 outer membrane efflux protein, putative
MKHLRTWINQRKRTFYNISSTLAFLLTPLPALTLMAVSLQVYAGENATPT
RLTLEQCITIALERATPLKKADNNLTLQGTDVLQRYGSFLPRLTLSAGYT
PVQQQKSYTTLSGTMPPTLLTTESDALSMQLTTSLNLFNGFGDMAALQAG
LNRRDAARLSVARARETVVYDVTQAYYQALLDRELLLIARENLQASRDQL
TLTERQYQAGLKSLIDREQQAAETADSQLRVMKAESRAEQSLLELLRRLQ
LDPLTSLELQTAADVVNGDAPYTLAADELIARAREQRNDLKSQQAQSKAN
RWQEREAAAQRYPSLDLNLTASTSATGDVEQRIAGIEKKYSYPPLSDQLG
NATSYSVTLSMNWVLFDGFRSRYSLQSAHLNYLNQQLDVEDAKRNLAIDV
RKAIAEYDAARQQISAARVSLQAASAAFNGIKRKYELGAATFVELSSARA
ALFNARSSLSQATYSLALQKNILDYVSGSTSFSK
>Cag_1474 heptosyltransferase
MASKFIQQKRFTRTLVARLLQLLSGKHKQTQHYHGTPKSIIILAQEKLGD
SILLTPLLKNLHQLFPQIKIHLVCFSKASATFFKNDSHITAIHQPKIDGL
AYYRFIRSHTFDILFNTKDSPSTSFLLQTVLIRAKYKVGHIHQHHTGLFN
YEIPINFHSQMAAKNCALFSFLNVQVKPEDCRPYLPANNISKEVATLLSV
PHKSNIIGINISTGSPDRQWQEEKWYKLISDFPEQRFIVFAAPNDIASKQ
RLETLPNVMTTPQTKNIYEVGLLVERLVLLITPDTSLVHVASCYNIPIIG
LYTTIEHDLSRFSPYLIDYHIVRSPTNKVEDIALDAVKTTLKKQLEC
>Cag_1395 anti-anti-sigma factor, putative
MKHSITNRRELTVLKIEEEVFDFRHRDDFTQIVDSLLQQEESKNLLIDFS
PVKAIDSAGIGAILQAHQLANRNAGLAVFVALCKQIKDLLKLTNLDKQLY
IFSSINEVMTLVEPNAKPKKNNRAKSAAPVDEIDELLGDDLDIIGAPELG
DEKLAGVIDDELHVDDSEEEHDKPEMDDDADEDDDEHVEEEKAPAPKEPK
AAKAKKEPKPKKESKTPPKPAASKAPAKTTTKAPAKAPEKTTKRAQPKKR
VEKESE
>Cag_1262 conserved hypothetical protein
MMAKINVKKKEIAIISVADNDYVSITYIAKYKNPNNADDVIKNWLQNRNT
IELLGLWETLHNPNFKPLEFEGFKKEAEIK
>Cag_1769 Protein of unknown function DUF132
MQKDNAIRVVIDTNIWIGFLIGKTFSDLYKAIINDQIKILFSDELFAELI
EVLQRPKFHKYFSQNDIAELISLIHLKTEFVEITDQFNDCRDPKDNFLLD
VCVSGHADYLLTGDDDLLILNPFHEVKIINYRKFTDILKRV
>Cag_0283 SugE protein
MAWVYLLLAGCLECAWAIGLKYSAGFTKPLPSAFTASAMLASFWLLSLAM
KSIPVGTAYAVWTGIGATGVALLGMVLFDESRDVARLLCLLLILAGVVGL
RIFSGGE
>Cag_1188 phytoene desaturase
MSSEKKSVVILGGGLAGLTAAKRLTDLGFQVKLLEKRNIFGGKVSSWKDE
EGDWIESGTHCFFGAYDVLYDLLREIGTYHAVQWKDHQLTYTLAGGNAFT
FKTWDMPSPLHLLPAIASNGYFSAGEMAAFAKSLIPLAFLKADYPPTQDH
LTFAEWAQEKKFGTRLMDTMFRPMALALKFIPPEEISAKIILDVTETFYR
IPDSSCMGFLKGAPQEYLHQPLVDYLTERGAELQTNVTVDELLFDGSDIK
GVQLLNGEILTADYYLCALPIHNLNKVLPQNLKDYDFFFNRLENLEGVPV
ISVQIWYDTEITSANNVLFSPDGVIPVYANLARTTPEYQTLRGKPFTGKS
RFEFCVAPARELMGLSKYEIIRMVDQSIRNCYPKTSRGAQILKSTVVKIP
HSVYAPLPNMEQHRPTQQTPVSNLFLAGGFSRQLYYDSMGGAVMSANLAA
KALAKAAGCEE
>Cag_1335 conserved hypothetical protein
MIKPFLQRTLPLLATLPFFATPTAQAATPLHAAKSAHFAPLLADPLEPRV
AVEPFLGEKSLQLDIGTTEELYRNDKGTFAAGVDFATWSLLRRSNNFKFP
VDAIDYLFGVNASWKMPLQNSSLPFDDFNVRARLSHISAHFEDGHYQNGQ
WLQQAEWQGTIPFVYSREFVNVVLALSAPEHRIYTGYQYLYNALPSGINP
HSWQAGVEIATTNTTYVAADMKLLPIWQTKQAETEGFRASWNFQAGMRLK
GKQADKVRLVANYYTGMSRHGMYFYHPENYSTIGAIIDF
>Cag_0166 putative transcriptional regulator
MLKAELLEIIANGENSGIEFKRDDIRPEQLAKEIVAFLNVQGGRIILGIE
DNGTISGIQRSDIQEWVLNVIRDKVHPQIIPFYEEITLDDGIRLVIITIT
KGISKPYVVRHNNREDVFIRLGDRSELATREQQLRLFESSGLLHVETLPV
AGTSFATLDIIRLDFYLRDLIKDPEMPTNEAEWISRLIGLGLMTDDGLGH
TVCTIAGLLCFGIQPRQYLRQAGLRVMVFSGTDKEYSSLLDTVIDAPLVG
RWQFNAIGKMQLIDDGLIEKFTAAITPFIGEEASDINELMRKEKIWHYPW
QAIRETVINALVHRDWTRNVDIEVTIYSDRLEVISQGRLPNSMTIEKMIA
GQRSPRNMLIMEILRDYGYVDARGMGVRTKVIPLMKNVNGQAPIFTATDD
YLKTILPRRKEE
>Cag_1369 Heavy metal translocating P-type ATPase
MSEIKRTNNDRLRCDHCQQTVSRTSAIIATINGDEKVFCCHGCLGVYELL
HNNALDTFYNERCTLQPTSVTEGKAPEAAAFRDTVISNGNEQQIDVLLSG
IRCASCVWLVETVLLKQQGILSARVNYATHKATITWQQGAITLDTILQTL
HKLGYQPHPYRGSSNDVMIAEKQDLLLRFGTASFFSMQLMLFTAALYAGF
FDGMESRYKLAFQLIALALATPVIFYSGYPFLANALRAIKRLTLNMDVLI
ALGSLASYLYSIAMIFADGEVFFDTSAMIITSILLGRFLEAGSRLKANNA
IAELAELQPHEALRVSNNGETNRVAVATIQSGEVIEILAGNRIPLDGKVV
EGEAEVNEAMLSGESTPVLKGMDSEVFAGSTAMNGRLRVRVSGNASTTLL
AQIIRTVEEAQARKAPIQSLADKTAAYFVPAIIALATVTFLYWKFTNGNT
VIALMNAVSVLVIACPCALGLATPLALLVATTATSRHGLLVKGGDVMENL
STITTVVLDKTGTITTGKPSLTDCFLMDENLPFIQHAASLEALSEHPTAR
AITAAWQGERLEVTHFKAITGKGVSGIIEQTHWYAGSKAFMEQEKISIAP
KEQERIAALEAEGKTVVIVARDRSIVGIIGMIDELRSDALPLINGLRTKG
LKLHLFTGDNKGVARYIAERCGISDVQAEQSPLDKAARIRALQAHGERVM
MVGDGINDAPALTEATVGVTLGNASAIALDSAGVAILHDDLRLITTLLND
ASRCFSIIRQNLVWAFSYNLIALPLAVSGTLHPIVSALLMATSSLVVVSN
SLRLRKSSRKTAPL
>Cag_0496 conserved hypothetical protein
MGKRQIIYQADRIRGNQELLNREINLVTREARVWHGRITAISSNDVELKN
SRMGKHRFNIDQIESIYCDITTDY
>Cag_0414 Molybdate ABC transporter, permease protein
MQLTPEDITAIWLSMKVAVTATALALPLGFAIAWLLVFGSFKGKVILEVL
INLPLTLPPVVIGFLLLLLLGRNSPIGQWLSSNNISVLFTWKAAVIASAT
VGFPLLVRSIRLSMESIDKKLVNASRSLGAHWWDTLFTIILPLSIKGIFA
GSSLMFARSLGEFGATIIIAGNIPGITQTIPLAIYDYASSPMGTTMALSL
CVVSIILSVSVLFIHELLTRKLEKGATS
>Cag_1409 Membrane-fusion protein-like
MKKNIGLALGGAVLFIMILLFFFNPFSASDKSTTLVTVQRRDAVPPPNST
LHDSTSYAHATEEAGFISGIVEPYNDATIGLVVQGKIASIWVPEGRRVGR
GGIILTLEKSQEELEVARRKIIWQNQAELKSAEAVVTTLTETLRLNRKLY
DETRSVSREELDKLSLQWENAVAERDRLRNQKLREKVEYEIAAKTLDSRL
LKAPFSGTVEKVFLKVGEIYQPGQPLVRLIDADRCVLTANIDDTKNYKFK
QGMTVELHVMEGSSEVIRKGTITKVPIAVDPASGVMQVKAFFDNRDGRIK
PGTTAKMRVPQE
>Cag_1271 Rhodanese-like
MIDCQFELFNVRMKNIKEVNAVNAHAMLKKGALLLDVRESFEVSRGTFDV
PDVKHIPMRELEQRLHEIPAKRQIIIACRSGSRSMMAARMLMSRGYHKVA
NLQHGVMGWQRAGLPMSKAQKQHVESFLVRFLKLFKRS
>Cag_0884 Fibrobacter succinogenes major paralogous domain
MRYSKSALLASCYVLLLVAIAGCGKKLSPPVNDRDGNSYPVVELASKTWM
AKNLEVEHYRNGDLIPQVQNAEEWAQLTTGAWCYAGNNPEEGKKYGKLYN
WYAVADPRGIAPEGWHVATDAEWQALCEAFGGLDAAGAALKATGEWKNST
PENATNSSGFNALPGGARRDTDGYFMPTGEYSRLWTSTEIAEGSAWAVSL
GYYDAAVRRGKASKKTGFSLRCIKD
>Cag_0804 Ribosomal protein S16
MVKIRLKRAGRKKMPVYQIVVADSRSPRDGKFLEVVGHYQPTLKPHSITL
KKDRVEYWMHCGAQPTATVNSLIRATGLLYELRMKKLGRSEADIATEMEK
WEAAQMARREKRVTLKARRRRAKKEAEAASASSAEG
>Cag_1500 conserved hypothetical protein
MAKKITENEVDLKKESKKKTSTKSETSKSTATKTKKSKATAEEAALTPEM
REEAIRLAAYYAWEQKGCPDNSHMDDWFEAENTLP
>Cag_0564 Histone-like DNA-binding protein
MGNTTTKADLVAVIAHKTGLTKNETEAVVDGLFESIIESLKAGRRIEIRG
FGSFNIRQKNFRKARNPRTGESVEVDPKQVPAFKISKEFKLAVSESLKGG
DV
>Cag_0335 FusA/NodT family protein
MTTKNFIKQVQIFSMKQYIASTLLLFLLIGNAPSVYAEVLSWEQCVAEAR
RAHPSLVQANAIVQQASANRRIVGSSRLPNVALALNAQQQGSSDGTSTDH
IGSSLSLHQLLYDGSKTSKQLSGADEALRAAEAAAQLTNAEVRYQLRSAF
VALLKAQELVELTNEIAERRQKNLRLIRLRYNGGREHIGSLRQAEADVAE
ATFEVEQAKRELTLAQRHLALALGRQKSVALRVQGSLQAAPFSLKKPDIE
QLLTIHPATQQAAAQSRAARYELEASRSAFSPTLALTSSLGRTAASYFPL
ESVDWQAGLSLAVPIYSGGEGKARVAKARAYALEQQAAAQAKVLQVTGAL
EAAWTRLQDAEQAIAVRRRFVEAANERATIASAQYSNGLLGFNEWMIIED
NLVNAKKRLLEASAALFVAEAQWLEAQGGGLNEAEK
>Cag_0510 conserved hypothetical protein
MSNAPRRFHFIVNPAANKGRATRHIAKLQQRLLGRNEAKVHVTQTAGDAA
VCAQSAAQAGDTIVACGGDGTLHEVVNAVAAMNATVGVLPLGSANDFIKS
LYTNPAEASNIDALWSAQAKAVDLGRVTYGSTQRYFVNSMGIGFTGNIAR
HVRENRWLKGDLTYLYALFRVLITLQPRTFTLTLTTPQGVRHEQEPLMVF
SIMNGKIEGGKFTIAPEAAIDDGLLDVCLLKAVPKWKIFRYLMRYIRGTH
INDAQVLYYKASRIELTLNEPTTMHMDGEVYENVCGNVTIEVVPKALRML
IIN
>Cag_1351 Purine nucleoside phosphorylase I, inosine and guanosine-specific
MKEQQAKIQEALAYIRSKTTAEYSVGIVLGTGLGALAKEIEVELALDYSD
IPNFPVSTVETHHGKLIFGTLAGKKVVAMQGRFHFYEGYSMQQIAFPIRV
MKQLGIKTLGITNACGGLNPGYSKGDIMLIDDHINLLGSNPLIGPNDPEI
GPRFPDLCAPYSPRILALAEEIALNQKIKVQRGVYVALSGPCLETRAEYR
MLRILGADVVGMSTVPEVIAAVHQGTEVFGMSIITDECFPDCLKAVSIDE
IIEVSNHAEPKMTAIFKSIVANL
>Cag_0372 GTP cyclohydrolase I
MKQEKTPSAITENKLPFSSSNCGCGCHNDDDTLTHDDGLLTDNLESAVYT
MLQNVGEDPQREGLLKTPERVARSMRFLTKGYHENPEELLQKALFTESYD
EMVLVRDIDLFSMCEHHMLPFFGKAHVAYIPDGKIVGLSKLARVVEVFSR
RLQVQERLTQQIRDAIQNVLNPKGVAVVIEAKHLCMVMRGVEKLNSITTT
SAMSGVFMTSPSTRGEFLRLIQK
>Cag_0830 conserved hypothetical protein
MVDVVARVHKHRVKLRDEGMRPIQLWVLDTRREGFAEECHRQSALLANDS
HEDEMMLFLSEVADTEGWTA
>Cag_0684 conserved hypothetical protein
MDNLLLKKSMLMPPVQRVALAELLLASLDYEAEDIREEWINEVQARMNAV
SEGRSKLLDFDLLWQ
>Cag_0326 Ribosome recycling factor
MSVKDVAQKIEPKMKKCLEAFQHEITTIRTGKATTTLLDRVKVEAYGQQM
PLKQVGNVSVQDAHTLMVQVWDKSMVSATEKAIRDANLGLNPAAEGQSIR
VSIPPLTEERRKEFVKLTKKFGEDSKVSLRNLRRDMIQDIEKLEKAKAVS
EDEKNKGKKDADDLLHKYEKLLMDMITAKEKEIMAV
>Cag_0611 putative anti-sigma regulatory factor (serine/threonine protein kinase)
MRCAELVLCSTLAEYERLERFLVAFTEELHVSQAFSDNLLFCVKELFINA
VTHGNRYQEGYKVYCRLELHADALVVTVLDEGQGFALEALPDPRQAPFCE
QLCGRGLFCMQSLVDGIAVVVNERGCAVTLRWKFIH
>Cag_0972 Secretion protein HlyD
MKRRTFIVIIAGVLASVVGGYLVLHQEKQEAPKIAVQAERPIPVSIVAVA
AATVSDTLNLVGEIQAIREADIVAEVEGQVRRIAVEPGERKEKGALLVAL
DDEVAAARKKKAELHYRQAARTAERYSALYRDGAVSLSAYEAMELQREEA
QAELVSATKHMQNAAIRAPFGGVVTSRLVSEGELVRVGTKVAHMADFSRT
KVVLYAPERTLPLFVLGKSVLVSSNLFPDKRFSGKVSAISDRAERAHNYR
IEVLLNNESRSIPFRSGMFARVLVPSEGERQALVVPRRALVNGMRNPAIF
VVRNGRAWFTPIIAGMEMPREFEVLGGLVAGDSVIVSGQQELRDKARVEV
VHR
>Cag_1567 hydrogenase/sulfur reductase, gamma subunit
MRTDMGYKCRITNIVQLTEQEKLFQIRITDPAERTLFRFKAGQFLMLELP
GYGDVPISISSSSSNHEYLELCIRKAGHVTSALFDAQKGDHVAIRGPFGS
SFPMDEMADHHILLIAGGLGIAPLRAPLFWVNEHRDHYKNVHLLYGAKEP
AQMLFTWQFEEWEKINHINLHTIVEHSDQHWQGKVGMITELFNDISIDVK
NTYAIVCGPPIMFKFVCSYLDKLGIPMNRMFVSLERRMHCGMGKCCRCMV
GSTFTCIDGPVFDYWTVMNLKEAI
>Cag_0237 killer suppression protein HigA, putative
MILLLLFCTLHMNLRYSTRKLEKTVETFSAIKKHYGIWGKQISQRLADLT
SAKNLADMYTIRPAHCHELKADRATEFAVSISRNHRIIFIPDHDPIPRKE
DGGVDVNQVNSVIITAIGEDYH
>Cag_0563 putative PAS/PAC sensor protein
MIFVGGIAYIKVTRVSYQLWEAVENSSKHYNGIIMQKKRPSSSSKATPST
NVLIHSNQTKIPIVGIGASAGGLEALEAFLRNIPDKCGAAFVIVQHLDPT
YKDIMVELLQRISTIPVLQIADRMVVEANHVYVIPPNKNLSLLHGVLHLL
DHVEPRGLRLPIDFFFQSLAADLEERSVGVILSGMGSDGMLGLRAIKEKR
GGTFVQAPNSAKFEGMPRSAIDGCQPDGVASADKLPELILAYLSHGHSPL
RGEQPLAEKVMSGLEKVVILLRTQTGQDFSLYKKSTIYRRIERRMGIHQI
ERIADYVHFLQENPHEIELLFKELLIGVTNFFRDPVAWEVLKCKVFPLLF
AAKPVGGVLRAWVVGCSTGEEAYSLAIVFREVLESIKPLGNIRLQIFATD
LDNDAIDRARSGCFPLTIANDVSAERLQRFFERDDHGYKIVRSIRETVVF
APQNVIMDPPFTRLDILVCRNLLIYMEPELQKKLFPLFHYSLNIGGVLFL
GSAESLGSFHQLFKPIEAKARLFMKLQGIRSEPLALPLSYAHNTPSILSD
EVGAQPKASTGVLNLQVLTEQVLLEHYTPPAVLTNDKGDIVYISGHTGKY
LEPAAGKANWNIFAMARDGIRYELNLIFSSVLRKQGSVTKKGIPVSLSEG
SQRINLTIRLLDKPEALRGLVLIVFEDDENVTVTTDTTEQLSLASGENRM
VLLEQELSQARDEILIIREEMQTSQEELKSTNEEMQSANEELQSTNEELT
TSKEEMQSMNEELQTINHELQSKVSDLSQANNDMKNLLNSTDIATLFLDD
ALNIRRFTTRTASLIKLIATDVGRPITDIVTDLHYPTLVDDAREVLQSLV
FSEKEVSASNDRWFSVRIMPYRTQENRIVGLVITFSDISMAKKLEGNLRE
SENRFRFFYNSSPYGVLFFDREGKITQANPEAEKLLGYSLVEMVGKNVAT
LNWQMLREDGSTMPLTDFPVTVALQTAKPVTETVVGIVQERTQYCRWLTL
RAVPRIEDGSSSPCEVFMTCHERDAVSHCHDSCNP
>Cag_1458 Ribosomal protein S15
MLYCPLFLFPYNSNHVAMSLTQDQKVEIIKQFGTSEKDTGKSEVQVALYT
RRISDLTSHLQEHPKDKHSRRGLLMLVAKRKKILNYLKNVDIERYRKVIG
ELELRK
>Cag_1314 hypothetical protein
MKSLKALNTSTSESLTARLQLSFGLMGAIWLLGLVGHFTPLQEWPALVLY
LVGSIGLVLWRGKSFGEWQPMYLAGGDLKTSLKWGGMAGGLLFALDLMNS
VIYFKAGNPPMANMEFLLMNGSLLYGFPLLILAEELLWRGIMFSALIERG
VNKHLTVILTTIFYVVNHFAVAPVSMYERALMGMMAMPIGLFGGYLVLKT
RNVWGSVLVHMITMISMVLDLFVIPQLVR
>Cag_0118 hypothetical protein
MSFLKEAGGLAGGALLLTPLTPLGVPLLLHGVAGIVVGGAGLFVADAVLK
QVAETTKPQSGDEPEDELVD
>Cag_1581 Magnesium and cobalt transport protein CorA
MPLINQPELLSRKPGAAPGIEHHELAALAVGKGSVTITCFDYSPTNLLRQ
EIDDLATFLADHRPEWSEVRWINVDGLSDINAIHALATKYNLHPLAIEDV
LHSQQRPKVEVYGGVANDYQARVFIVTRVPMFRNGGLQQEQMSLFVGHNT
VLSFQSAHCKEVWDALVQRVNTQGSRLRSSDASFLAYSLLDAVIDQCFPI
LEVYSDRAEELEDLILEHPEHTLIHAIHELKRGLLTLHRLLWPMREVVLV
LQRDPHECMSETTQVYLRDLYDHVVELIDIVETYREMTTDVTETYMSSIS
NHMNEIMKLLTLIGTIFLPLTFLAGVYGMNFRYFPELELRWAYPAFWAVC
LVIAVSMLLIFRHRRWL
>Cag_0225 NAD-dependent epimerase/dehydratase family protein/3-beta hydroxysteroid dehydrogenase/isomerase family protein
MKETILLTGSTGFIGQRLLHYLAEEKCHIKVLLRPESPQTALPFDCEIVR
GSFDDSQTLAKAVRGTTHIMHLAGVTKARDEDGYDAGNVMPLQNLLAAVR
HECPDLKRFLYVSSLAAAGPAPEGITGLTESDAPAPVSAYGRSKLRAETS
CHAQARHIPITIVRPPAVYGPGDKDVLQIFQMMAKGVLIGAGHPQKQRFS
LIYVDDLVTGMVQAMRAEKALNRTYYITSPTAYGWNELIAQAQPLLGFKK
LRQFTLPMPFLLGVAGLMGAIGELQGKAPLINRDKVNELVQNYWVCSGKQ
AQLDFGFTATTPLQEGLATTIAWYRKKGWL
>Cag_0865 conserved hypothetical protein
MPYPDLLPFNALPLLRLLHVVCFAAWFGSVLASIYFLKTIEERMTGSADS
AKDFAQLLQRYIKLETKIADTGFKGVIITGLLLAFFYYGWNVWLFVKIGL
IVLQVVLTLGYITRSIRTLTYPCSVTDYKRWYKLFTISLSMFAIVISISF
FLL
>Cag_0087 Iojap-related protein
MKSEHESNEAMVQEIEESELLAQRIAELALEKKCEVVKILDVRGLTSITD
FFVIATADSERKAKASADHILDELRTEEGERPLHVEGLDSQHWILLDYVD
VVVHLFLPDERRFYDLESLWSDAPTTLVG
>Cag_0560 Ubiquinone/menaquinone biosynthesis methyltransferase
MAKKADAPANQTATLLQTKSRGSIRSMFDDVAPTYDFLNHLLSLGIDNYW
RAKASGTARKLLGNNNAPKILDVATGTGDLAASMAKLSGATVTALDLSPE
MLVIARKKYPHITFHEGFAEQLPFADGSFNIVSAGFGVRNFENLDAGMRE
FNRVLKTGGHALIIEPMIPRNALMKKLYLMYFKKVLPRIASLFSKSTFAY
DYLPNSVEQFPQAEAFTTILRNNGFSKAEYLPMTFETSILYIAKK
>Cag_1703 hypothetical protein
MTFAEFKVQLENAATEEAVKAAYATYFKIKYDTSNYHDLYTEQVFFEFKK
EKNFHNIKALATILAQSLYYIRRLKFVEVEKIIPFFICLADKDEACLTEV
RKWSSYYSNDSYDWERPASKPDPKLIDHLVKEPEINNIHIYNVTKKQEHD
AFKKNLENALKPQMVLDFGDKKVINEENFEAVFEHWKNVIGHYIVNGYKP
SFYFLSNIQKERIEIDRENNRIVFHFEDKNSKVQKVLMKDYDYFWSMYDY
VASPDTINGIHAKLDRLTDDSQRRFEGEFYTPLIFAKKAIHYFTKLLGKN
WYKSGKYRIWDMAAGTGNLEYHLPAEAYKYLYMSTLHASEADHLKKVFPE
ATCFQYDYLNDDVEYVFNCKNFLFEDNWKLPQKLRDELADPNITWIVYIN
PPFATAQDAKQKESKTGVSKTKIEKLMDIEKIGHAKRELFAQFMFRIAHE
LPKKSYLGMFSTLKYINAPDSVEYRNHYFNFKYEQGFLFHSKCFHGVTGN
FPIAFLIWNLAEQCHSEIIKIDISDDNAHTIGTKYLRFIDKSIVLNNWFT
RPKNSKTYILPPLSNGITVKNENTDRRHRARPDFLASICSNGNDLQHAKY
VVILSSPNASAGAFTVIKENFEQALVLHAVKKIPKPTWLNDRNQFIIPHT
QPSQEFINDCIVWSLFSHSNETTALRNVHYLGRTYQIKNNFFPFMLEEIK
EWEIKEHDFYVQMLDDTNRFVAEWLVTHQYSNEARAVLEKGKNVYKMYFS
HLHQMITKHWKIDTWDAGWYQMRRCLAEHNIAVDELRELYVANEKLANKI
LPQIEEYGFLDKDEIYEQL
>Cag_1491 TPR repeat
MLKDALGSYRGSLAELNKMVEVQPHNAELWFARANERSGCGDYAGAISDY
TTALLLGLRFREAVNAYGNRGMARLAMGDMRGAMEDFSTIIARQPKNRSL
LRTAYLKRAHLREKQGDEVGAQNDRKAADCINGK
>Cag_0744 Type I secretion outer membrane protein, TolC
MRCHLKKIASLLLLLVLSVSAFPLHAETLDLATAYRKAMEYDARLRAAKA
DNAIYREEVGKARSQLRPNIRGNASRGRSTTQRGNKYGFYPADSYNTVNY
GVTFRQTIFNFSSTAAYDQAKLVAMKSDTDFRKEEEMVMVRIAEAYCNVL
FAEDNLAFNNSFKTAAKEQLQQAKKRFAKGVGTLIEVEEAQASYDQADAQ
GIDMQNNLEFSRRELEHLTGIYPSELRAVDAAKLPLFAQQESFEVWLERA
RTANASVESARHEILIAKKEAAKQRGAQYPSLELVAGRNYSESENNYSIG
AIYNTYSVSMQLSWPIYTGGYGSSSIRQADAKKIKAEEQYSLQVRQMESD
VRKYYNSVAGSIALVKAYQQAVNSREVALKGMKRGFQAGLRSNVEVLDAE
QKLFASRRDLAKSRYQYILNLLMLKQAAGVLQPQDVDEVNGWFAKASLK
>Cag_1052 conserved hypothetical protein
MKERWVVVIGGGAAGMAAAVSAAEQARYLGVDCHITVIEKTHQVGSKIRI
SGGGKCNVTHVGTSAELLEKGFLRAAEQRFLRSALYAFSNNELRALLQQQ
GVATTEREDGKVFPVAGEASVVAEAFRTLLQRLKINCELHAPVQAIKVHG
QQFHLITLHGDIVADAVIVATGGVSYRHTGTTGDGLRLARALGHTVVEPS
AALSSIMVQPHSLVALAGAALRGVAAVARAGKLRAERQGDILFTHRGFSG
PAMLSLSRDVANMQRSQREAVHLAADLYPQQLHDELEALLLQHSKKQGGQ
LVRKFLQVSPIGMLLLKSETMPYGTIPNAMVPLLMRQAALDDEVTFATLS
REHRHQLVVTLKQFQLGTVHNVSLDAGEVSAGGVALSEVNPKSMESRLVP
NLYFCGEVLDYVGEIGGYNLQAAFSTGWMAGKSAVNKLLTAL
>Cag_1749 DNA repair protein RecN
MLASLSIKNIALIEELTVVFHPSLTIITGETGAGKSILMDSLSLVMGDRA
SSSMIRTGANKAVIEAILTDVHSETIEALLADAAIDSRQGELILRRELAA
NGQSRCFMNDTPCTLSLLRQAAEELIDLHGQHEHQLLLRSATHEGLLDDF
AQAHHERATYSRCYQHLQQLQAQRSALVEKAQSLRDKKEFLDFQLQELQS
AQLQEGEEINIEQEITLLENAEQLFTLTTLLHETLYNSDNSAYSNLTAAL
HTLEKLATIDQRFASAIEEARAATTIVDELARFARSYSADVEFNPERLEE
LRERQLLLQRMCRKYGRTHAELIAFEQELCAEQAGAESLDDELRQLEMAI
VTEKKQLSQLAIILSEKRQKAATLLEAHLQQELALLGMPHARFAISITQQ
EKADGDIAVAGNHFAATRTGYDTVEFLLSANQGETARPLTKVASGGEISR
VMLALKSALATSTHLPILVFDEIDTGISGRIAESVGKSLKKLSRLHQIIA
ITHLPQIAAMGDLHLSVQKSVRENRTTTSVTPLDGESRLHAIASLMSGEQ
ISATSLNLAAELLAHGQAVNLPSI
>Cag_1200 conserved hypothetical protein
MCNSFSFVAYYKYSDISSMTTNNRIKLKSTLACHTQGTVDLASWLEQHGI
SYGLQKHYRKSGWMESVGTGALKRPGEEVTWQGALYTLQTQAKLPLHAGA
LTALALQGFAHYVPLGKQTVYLFSPIKTLLPAWFRNYDWPQLILHEKTSF
LPNETGITDLKLPLFSLCISSPERAILECLYLSPDTLNLVECYQIMEGLT
TLRPQMVQDLLEQCRSIKVKRLFLYMAEKAGHEWYKRLDHTKLDLGKGAR
SVIKGGVYVEKYSLNLPEELVKL
>Cag_1411 enoyl-(acyl-carrier protein) reductase (NADH)
MSEKAHYGLLKGKKGIVFGPLDESSIGWQIALYAYNEGATIAISNIAAAQ
RFGNMDELSALCGNAPIIVCDASKNEDVDNCFRELKEKIGSVDFIVHSIG
MSQNIRKQMPYEDLNYEWFVRTLDVSALSLHRIVSYALKHEAINPGGSIL
ALSYIASQRNYWTYSDMGDAKSLLESIVRSYGPRLARKGIRINTISQSPT
YTKAGSGIPGFDKMYEYSDLMSPLGNASAEECAEYTMTILSDLSRKVTMQ
NLFHDGGYSSMGATIPMIKLAHEVLNDKELAERVGLDEFMPS
>Cag_0584 conserved hypothetical protein
MSVTTIHLQPEIEQSLQAMTVALKKSKDWLINQALQEYIDRQHVEQERWQ
QTLHAMDSVAQGNVVSGDAVHAWLRSWGDENELPPPIKEKK
>Cag_0558 Secretion protein HlyD
MKSFQKFLPRYSIALVVLFSATVLFLLTRANTVDVEVGEVSPSDLVQAIY
ATGFVEADTVAELHTEASGRVIAVGALEGEQVRAGQTIVQLDATRAQLAV
REARAALAEQQAIVNDNRLRFARRQALFREGAISRQELEESERSRLQSEE
LLQQRQLQIGMKAEEARKSSILAPFSGTLTLQSLKVGDYAPANTLVAQVV
DSNGFMVVVEVDELDMPRLRVGQAATVAFDALPDKRYKAVVSRLVPHTDR
ITKTSRLYLTLQELPASIQEGMTATANIVYNVRPQALLVRKNALVDEQRK
SYVWKIEKGALKKVEVELGASDISFVEVLRGVRAGDKVVLSPAKTLRDGM
EAKITSIKKL
>Cag_1941 Elongator protein 3/MiaB/NifB
MDEPYATSQPLFDTFQRQITYARLAVTSACNLRCGYCLSEAHEPATLHQP
LLSTAELCTIIELLAKHGIQKLRFTGGEPLLRSDIVALIAMARQHSSIRT
IGLTTNGLLLLPLLPRLLDAGLDSVNLSLDTLNRHRYFQITRRDLFPQAE
AALHALLATPSLSVKLNVVMLRGINSDELTGFVELTKEHNITVRFLELQP
FDDHQIWKTGRFLRADRLEEMLLHAYPALQRVQGEATQHFSYCLSNYKGA
LAIIPAYTRAFCEQCNRLRITSSGKLISCLYEKDGLELLPLLRNGAKPEE
FAALLQQAVLRKPANGHQRHTGAVRTSMSEIGG
>Cag_0300 hypothetical protein
MRKLFLLVLLPLVFLLPVNKAFGANLVILEPTDGASVTWRPLVKGSVSGA
KHVWVVVRSVENARFYVQPKAAVRKNGSWKTSVFIGKQADTANNRDFEIM
AVGNPTEKLSDGMELEDWPSGVVTSNIVRVVRTKMN
>Cag_0748 transposase
MKDTVLFQQALCLPAPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HNLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEYDSRIWRIIHHYLDEVLEQQ
DLSSVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPEAHITFDKFHIVQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGGKF
EFALPALTHSK
>Cag_1298 conserved hypothetical protein
MALGFHIQQQRRNVAIVLPMLLPLLLLPFSTLLSATPAKKAPFTASPKSS
FTTTILFTAKDSVLYHAEKQTMSLWGNAVMKDAATSVTSPKMVIDLPASR
LTAYGEQASPTIPAKPAIFSDPQGSFNSSAILYDFATRKGTTTALSSNYN
GLQVKGGNVERLENGVLKITDATFTTCLEEEPHYWFESTTMTITPEGKVT
AKPLYMYMRPEIFSARLPAVPVLALPYMKFSTKPSRTSGVLLPSVSRFDN
STALSGMGYYWAIDEYADLRNEGDIAQNGSWRLGERLRYRKRDRFSSEVQ
GEFKQYQTANEWNARVIHNHRFNATTQLDANLAFSGGERRYEVNSMDSQT
LVSEQTAANASMAKSFGDNKALLALTLNGWRDLRNDNGVTNLTASYYQQP
LFLSQPPTMYDEKAWQLGGYTLLKSNQTRWFGDNQNGHTITAALEADYFT
RFNPASQLLLRQGVVVQAEKPVDELSTHNYQRTTLLLPLQANARLFEHLY
LQSGITAIQTLSTDEDESNYATLLLHSAASTRLYGTLATPALTPLLGLSA
LRHTFIPELRFSWNPPFSALPISNDATNSYVWPFDTPFVIGIEPYFGALP
DGQNRIGITLKNILHGKFHSNKQNAADSIAASTDKSVALLTLNVATALNT
ATEDFRWQPLTITASSNALTPSLFISGGTMYDFYSFEPATGNRIARLNSD
EGKSALRFVKGYANISVALQGDVSSKSAPPTQPVVARTEQALYADRFRMS
SFAAVDYSLPWQATFSLYSESDKSNPLQAVSTTLLHSAVRVALSPTWQAG
FTTGYDLDRKTMVFPLVQAYHDLHCWQIGMQWVPSGEFKGYALSVGLKNF
PAF
>Cag_0895 hypothetical protein
MMQSIIRNATLSDLSRCAELLAILFSQEKEFAPNATAQLHALTMIAESPA
SGQIFVAEVNGKVEGMVLLLFTISTYLGKRVALLEDMIVTPEWRGKGIGS
QLLRHALAYAQSNGMGRVTLLTDLDNEAGHQFYKAHGFVRSSMVPFRYVW
E
>Cag_1827 30S ribosomal protein S13
MRLAGVNLPLNKHAVIALTHVYGIGRTSAENILRKAGIAYDKKISELSDE
EAHAIREIIAEEYKVEGQARGEQQTAIKRLMDIGCYRGLRHRRSLPVRGQ
RTRTNARTRKGKRKTVAGKKKAVKK
>Cag_1177 conserved hypothetical protein
MKRKIYVETSVISYLTARPSKTILGAAHQQLTLTWWEKRSDYDLYVSQAV
WQECAAGHPEAAERRLSVLAELDILVVTEPMITLANTLVEQGIIPTKAIE
DALHIAIATLHHVDFLLTWNCRHIANPIIQEKISLYLEQQGLYLPIICTP
EELIGEKNDD
>Cag_0170 conserved hypothetical protein
MPLPTIVSRHLLPQVLKLLYRSLRISVTMPKHGLPQDGGIVAFWHGNMMV
GWLLAKKLFPHKNVAAVVSQSGDGTILADALGTLGFTLIRGSSSTDGDMV
KQRMYEHVQQGQMVAITPDGPRGPNHQFKYGTIRLASRQHIPLLFATIGY
KRSWQLASWDSFAIPKPFSKVTITLHILAIPLFTSEEELRHFSTTLSARF
SHE
>Cag_1753 conserved hypothetical protein
MFEWITQPEAWIALATLTALEIVLGIDNIIFISILVGRLPEEQRNKGRTM
GLALAMLTRIALLLSITWVMSLTEGLFTIMDINFSGRDLILLGGGLFLLA
KSTHEIHHSLEGAEESSNSNPSSKAFILTLLQIAVIDIVFSLDSVITAVG
LADDVAVMITAIVISMGIMMLGAKAISEFVEAHPTIKMLALSFLILVGVT
LTAESFHAEIPKGYIYFAMAFSVTVEMLNIRLRKKDAKPVHLHSTLPMQE
EEA
>Cag_0252 hypothetical protein
MIQCLENGTLALEHLRQSGELPPTWQSMAELEAQVNQLHIRFLEERWRIA
EKRGWGDLAIELLDRAWRTNDREYMDEEHVSEAEKVTVMQALDRQNRLMD
IYNRSANMLLALCREVPNQPQRPIRVLELACGSGGLALALAEMAQRHHLS
LEITASDAVLAYCEEGNAQAKAQQLPVTFRQLDAFHLTDYANEQYDITVM
SQSLHHFTAGQLAVIIAQAMSQTTTAFVGTDAQRSVLLAGGVPLVASLQA
IPAFALDGFISARKFYSEPELALIAESATRRCNYTISRDWPLSVLTVRGG
E
>Cag_0893 DNA polymerase III, alpha subunit
MDFIHLHTHTHYSMQSSPIFPSELFKAAKAFGMPTIAVTDYGAMFNMPEL
FSEAKQVGIRLIIGSEVLLLEHDEHQTSRHTVSPSLVLLVKNETGYRNLC
ILLSRASREGFVNGMPHVESRLLEQYHDGLLCLSAYGAGRIGRALMAGSL
DEAANFSAYYQEIFGSNFYLELQRHNTSFDALLNEQIIGLAQKFSIELVA
TNNVHYLRQNDAGCYRALVANRTKEKLSGPVSAALPGSEHYLKSAEEMQQ
LFSNEYGELENTLRIAEQCTFTFSDKEPALPRFPLPDGFSDEASYLRHLT
WEGAAEKYAKSEEEGISQEEVKERIELELGVIEKMGFSSYFLIVSDLIAA
SRRMGYSVGPGRGSAAGSIVAYLTGITRVDPLRYKLLFERFLNPERLSMP
DIDIDFTPVGKQKVLEYTVEKYGADSVAKVIAIGTLGAKAAIRDAGRVLE
VPLPLVDKLAKLVPTKPGITLEKALTDSRDLREMAESTPELKTLMQYARS
MEGRARNVSMHAGAVVITDGALEEQVPLYVSNKIETEERRFADELDLDQP
DNGKAKAGESNDEKQVVTQFDKNWIETAGLLKIDYLGLETLAVIDETLRM
IKRRHGLDIDLEKVPMTDRKTFRIFQEGKMAGIFQFESSGMQSYMTRLQP
TQIGDIIAMSALYRPGALNARVDEHRNAVDLFVDRKHGREAIDYMHPMLE
GILKETYGVIVYQEQVMQISQVMGGFSMAKADNLRKAMGKKKPEIMEKFK
ADFIAGGVAQGVHDTLATRVFDLMSEFAGYGFNKSHSAAYGVLAYWTGYL
KAHYTIEFVTAVLNSEIGDTERMKHLTDEAKSFGIATLPPSINKSDALFS
VENSSNGRSAIRVGLSAIKQVGGAARAIVTSRMRRKRDFLNIFDLTASVD
LRVMNRKALECLILAGACDDFDPHRARLLANIDKAIKFGQMQNRTVTMGQ
CGFFSNEEGQEGDIHYPELDNADMMPDGEKLLHEKKLVGFYLSRHPLSAY
RRDWQAFANLPLNTKEIVKNKQYKVIGVVVSLKPYQDKKGKQMLFGAIED
FTGKADFTIFASVYEQFGHLIKPEEVLMLVVEAELGGGMLKLLVREVLPI
KKVRKSLVKKLVLTIDADEQGQLDKLSSIKELFNKHKGGTAVEFEMKAQA
GDNIETLTLFARATPIEPEEELIEQLELLLGPDNVRIAG
>Cag_0952 polysulfide reductase, subunit C, putative
MTFVQQEVWHWQIATYLFLGGLGGATFSIGAILHLFQGVNKKMLSIAIAS
ALGFLVIGTIFLLADMLQPMKAIYALTNPNSWIFWGVIFINIYFLAAIAY
LIPLLEVWPPLLPFIQKIPQPLLSLLERYNRLAALGGAAAGFLVAIYTGL
LISAAPAITFWNTPALPLLFVVSGFSTGAAFLLLLTTFWDDDTSHQVAAT
LEQLDAILIISELIVLGAYFNFALFLPTGARESAEFLFHSPIFLIGFFIF
GLIIPLALETYAIFFAKHSKKSLTLLRLASILILLGGYLLRIYVLQAGFY
QYPW
>Cag_0340 Glycosyltransferases probably involved in cell wall biogenesis-like
MQNKKASWILFVEVGLLIVLLFTYLAIRIYYTINAPFSTIEFILGFFFLC
SEVFILYHSFFFFIDILREALYDAEPKRKEPGKDASVAILVPARHEPREV
VENTLLTCINIAYENKTVYLLDDSSIEKFKQEAREVSKKLGAKIFTRVGN
RGAKAGIVNDILKTLKEKYVVIFDADQNPMPNFLRTLVPLMEGDDKLAFI
QTPQFYSNGDASPVAMTSHNQQAIFYEYICLGKSLNHAMFCCGTNVIFRR
EALQDVGYFDEDTVTEDIATSLRLHAKGWKSMFLYKAYTFGMAPEDLASY
FKQQNRWALGTSQLLRKFIHLFFTNIKALKPVQWIEYSISTTYYFIGWAY
FFLMAGPVMYIFFNIKSYNIDPLSYVFTFIPFLVLSNIVFFQGMARKSYT
RLNMLKVQMLTALSMPVYLSATVIGVLNLDKKGFQVTPKGGATNVPLKAL
WPQLLLFSIVIVTLLWGTIRIVFFEFDYMLVINVIWTTLQAFMLSGLFYF
NKK
>Cag_1041 hypothetical protein
MNLQEAYKQKAETELELAHTRLVEFRAKVKNLNAEAHLNYAKQLDDFEHG
ITTAKEKLHELGEAGEDAWEKLKDGVESALRSSSKTLQEIADKFKD
>Cag_0660 conserved hypothetical protein
MIILDTHIWLWWVNGDTEKLDQRRLEQITSSDIVAVSAISSFEVAWLVHH
GRIVLPIGARDWFDKALDGSGIHLIPITPEIACKAVELPDHHSDPQDRII
IATALVNNASLMSSDRKFSCYTELGKMLL
>Cag_1036 uncharacterized conserved coiled coil protein
MHLKDAYKKKAEAELELAQAKLVELRAKSKNFGAEAHLNYAKHLDDFEHA
ITNAKHKLHELGEAGEDAWEKLKDGVESALRSSSNKLRDIANKFKD
>Cag_0311 hypothetical protein
MLTKDFYSIIDTELDALIIKYKDDKLIKKHKSAINNQKSYVLLIWFLDFY
GRISNYSNFITDGDNDSSCDIVFDSLNNQGNKVFYVVQSKWNNADNSKKE
TKKDDILKALNDFETILRGEKKNVNEKLKSKLEELDLHLKANGEVKFIFL
SLSQYKGDADENIEAFRSNDVKTKFEVIDISRIRVDYIDRKYKKIDPINP
LENYQNPEESPINIEIVQKGGSVKIEKPFEAYMFLLRPKAIWELFKTYGF
ALFYKNVRNPLLQSQFNVDIERTALENPSYFWYYNNGITAITYLLPEIGK
KAEKVPLTGLQIINGAQTVYSIYRAYESASPTKRIQMDSEALVTLRLLKS
GGKDFDLNVTRYTNSQNPVQDRDFCANDDIQVALQNASYQTNVWYEKRRD
EFRETPTGVKKVPNFIFANVYLAYHLQDPVSVLKNHTQRFKTHKDLNFIS
HKDHKDGLYEKIFNSNTSFEDMLCAFYIFDTIDDYTPFSYEETFKTNLYH
LLALFKVAFTKYIIAKKAMYGKGKNGEKEINVNKQIIEIYEKDEKEIILK
TFKFINQFVEKQIEVADNEEKTTDRMFKFLFTLSHYQKIYDALEDTEISV
KDIDDIVLQDNDDIVEGDKDTEVTSEQE
>Cag_0919 probable ABC transporter permease protein
MCMKPALWIALRFSFARKRFRIINFISAITLAGITIGVATLLVVLSVLNG
FQELARTFFLSLESPVQLVSSHNNAITVTPALLASIRTLEGVATAEPYAE
GEALLATQNKSELVMLKGLSPTAHQHLQNYINAPQPLFTDSTIAVGELLA
LRSSLYPNQPIQLFSPELLSLGLESLTQPYLIAALTIPRTSLHTIFSIHK
LFDDRYALSSLPLARQVLLLGDNNYTGIEIRGKHGVKGETLQRTLQQWLT
KEGVEKSYRIRTLEEKYQSVFSVMQLEKWITFSILMLVIVLASLSLTGSL
AMTVVEKQQELFYLRCLGFNTPSFTALFVMQGAITGITGTTLGTALGWGI
CAAQQHFGFVQLPSRTAFIIDAYPVAMQLSDFFVVGGAAIALCLIVSLYP
ARKAALIASSRTV
>Cag_0322 hypothetical protein
MGAAWAFPSIPSLIARNITAIFVVPTATPLSSVEHVTAGATITPFKPLAV
YGGTMPYIYEVTSGVLPAGMQLDLQTGMVSGTPESIGRHQTVTITVRDAN
NAVAATRGNLTFVVSAPPVAKAQPSVQPLAKGVAVKPFKPLEAIGGTGAH
VYSVVEGQLPEGLMLDAQTGVISGVPSTTYRQSAIVVGVRDVNNVSANQT
SRVTFTTQSSSLAKRVPVKRELQVIARVVKKAPVKSVQIAKVVRSTEKVV
AANETPKKTSQNLEKSVVVSLVPNNLLLPQWANNVFVMEVTAADLERAKH
MSASVKSVAQEVRQVAEETVQSPVRSTVVKPIRASNDVMTAADEVIASPV
ENEPFKPQSPVVEIHYPSADVDGTPAASDAPSQVYLSSLSSLSLADCSSA
SSSIAYSLN
>Cag_0243 extracellular solute-binding protein, family 3
MRLLSLFFLAFFFSLQSIAAENELIVATKHSPPFAMKNAEGEWEGISIDL
MHALAEKGKSKLTFKEMSLTEMLKAVEQGEVDAAAAAITITKEREYLLDF
SAPYFHSGLAIAVKEKETRWFHMIGRLFSPAFLQVLAALSLLLLISGFLV
WLFERKKNSDHFGGTPAQGIWSGFWWAAVTMTTVGYGDKTPITLLGKLVA
LVWMFTSLMVISGFTAAMTTVLTVESLESEIKEFNDLYGKRVGTVKASSS
SSYLEESTMKSKHFLTIQDALHALENDQIDAVVFDEPIMRYVLTLGDIHG
VTIINQVLSPEYYAIALPQGSQLRETFNRNLLDITVSERWKKINNKYLNT
KQ
>Cag_0680 hypothetical protein
MVEVRIHFFVTICLIIRDHCLRRNDKLSMLFSVINHNYEMKENMMLQTLE
AEILPNGHIHFLENFSTNRKVKAYVTILSQQPLNKPKADWHHFVGALKES
SLFQGDPVEIQKTMRDEWA
>Cag_0489 IMP dehydrogenase
MTKILYEALTFDDVLLVPAYSAVLPKETTTACRLTNTISLNIPLVSAAMD
TVTESRLAIALARAGGIGIIHKNLSIEQQAREVAKVKRYESGIIRNPFTL
YDDATVQDALDLMHRHAISGIPVIERPQNEGDASRILKGIVTNRDLRIKL
QPNAPIAQIMTSQNLITAREDVGLQQAEEMLLANRIEKLLITDNAGNLKG
LITFKDIQKRKQFPNACKDSHGHLRAGAAVGIRENTLDRVQALVDAGVDV
VAVDTAHGHSQAVLDMVKKIKSHYPDLQVIAGNVATPEAVRDLVKAGADC
VKVGIGPGSICTTRIVAGVGMPQLTAIMKCAEEAAKTNTPIIADGGIKYS
GDIAKALAAGADSVMMGSIFAGTDESPGETVLYEGRKFKTYRGMGSLGAM
SEPEGSSDRYFQDSSSEAKKYVPEGIEGRIPAKGTLDEVVYQLIGGLKSS
MGYCGVATIDELKQNTRFVRITSAGLRESHPHDVKITKEAPNYSTSM
>Cag_1916 conserved hypothetical protein
MTTPAYPPTQRDDYDSPWKEAIELYFSEFMALYFLKAYAAIDWSKPYHFL
DQELRSILPQAENGKRIVDKLVQVHLLDGKERCLYIQIEVQGNRESNFEK
RMFTCNYRIFDKYGKPVASFVILTDTDSSWRPTSYSYEFAGSKMALEFQV
AKLLDFEPRMEELLASNNAFALVTAAHLLTQQTRENSLDRLDAKTQLIRL
LYNKQWPKERVRELFRVIDWFMELPKELEQQLQTEIYNIEEEQKMKGTSI
YL
>Cag_1254 hypothetical protein
MKGKPTFVIALASCLIVSCTNNTPKTDADSTMKEVSAEMADMKSYVIQYK
NEVKASGAKSSAIITQYIDMKNDKMSIETESYTELNNSKISEKSLAIYDK
EWTYIINLKDKTGVKMKEDQAEDDPMDMIKSDDTITFRQMIEKEGGKIIG
NEQFLSKDCIVVEMKQEGQISKMFYYKGIPLKIESPAYTMEAIKFEENIT
IPVSKFTIPAGITLSEVPVMP
>Cag_1898 2-isopropylmalate synthase/homocitrate synthase
MSLCTTPIELYDTTLRDGTQGEHINLSVQDKLLIAQKLDEFGMDFIEGGW
PSSNPKDEEFFLKARHLTFKHARLTSFGSTARSVKAVESDPNLLGLLRSE
TPIITIFGKTWQAHSLKSLGISDEENAELIYRSVAFLKESGREVFFDAEH
FFDGYKDNRDFAVAMLRAAVEGGATRLVLCDTNGGTLPDEVSAIVRDIGS
TFPTVALGIHSHNDGDVAVANSLAAVVAGATHIQGTINGIGERCGNANLI
SIIPNVLLKLQRTCNYVEHLTQLTPLSKFVYEILNLPANTRAPFVGKSAF
AHKGGIHVSAVMKESSLYEHIDPKVVGNHQRVLVSELAGQSNIRYKAQEL
GIELPEKSDLFKNIVHRIKELEHAGFQFDGAEASFELLLHHELGNFTPFF
EVLETKVQIESSKENKVVNQATLKVQVGDDVEHVVADGDGPVNALDKALR
KALLRFFPEIKQIKLVDYKVRVLEEKRGTSAKVRVLIESSNGETTWGTVG
VSTNIIEASLQALQDSMNYHLFSLQQKTQEQS
>Cag_0822 conserved hypothetical protein
MVSTALRHHSQFRSTLLLFAALLMLVSSCTPQQSEKDERQQHVESMMEIL
AQVQKNLARIQQKEAVVERLSTGVERSQGENVEQMGRDIAASIRYIDSTL
AASRNLVQKLEEQNRTSTYRVESLDRLVAELQVAINVRDREVKALKGQVK
KLGGQVASLVATVDVLDEYIDTQESEYYTAYYIAGSAEDLMKKGVLVPIN
PLQKIFGTNYRLASDFNINLFRKIDMMESRDFFFEKPLQSLTIITPHTRG
SYELAGGKTSSLLVIRNEREFWQKSRCLVIVTE
>Cag_0247 conserved hypothetical protein
MLLIGTRMVDWSSLQTYLVIALLLLLFALVVMLGANSLFQQLVVRIRATM
RGVSLPLELVAWPFRLLMVLAGMAVLANNVPLSARIDVLIKHSLLIGGII
GVTWFLTRLMLVFEQVVLDYYATKVRESEAVRKVATHISLARKIIDALLV
LVAIAGALMTFDTVRQVGLSILASAGILSVMVGLAAQKSLTTLIAGVQIA
ITQPVTIGDEVVVENEKGTIEEITLTYVVMKIWDERRMILPITWFLDRPF
ENWTRTSPELLGSVFLSIDYLLSPDVLRQELERLVATTPLWDGRVVKLQV
TNSTERSMEIRALVSAANASQLWDLRCLVREGLIAFLRNNYNDLLPHLNI
EIERGRSPQNR
>Cag_1094 conserved hypothetical protein
MLETLSFFLLLALFGLLIFLAIRLLSAAPKQAELARLQALEADALRNMER
LTALAEEHEALKIRYARLEVAYQNEQQSVAEKEALMLASEERLKKEFELL
SLRILEERGKALGAEQRERLDTLLLPLRQQLEAYRQRIEEVHHADTLLSG
QLIEEVRQLQALSSRVSNDAQQLAHAIKGDSKVQGNWGEIIIERMFEASG
LEKGREYLAQESFRDSDGALKRPDFMVLLPDNKAIIVDSKVSLTAFERYS
ALSDPDEQQIALREHLQSVRRHITELQAKNYHELGGNRTLDFVLLCIPIE
AAWQAAMQADPALLYTLAGRNVVVCSPTTLMMTLKLIAQLWRREHENRNA
ELIAEKAGRIYDQVALLAHSMLEAQKKLSNVNDSFEQVLKQLKTGRGNLI
GRVEEIRKLGAKVNRQMPLDVTAEALEE
>Cag_1564 membrane protein
MEENAQLALRYEIARKAIHLSSISIPLIYWFISRDLALLLLTPLFAGFVI
IDLLKNVVPPISRWYHRTFDAMLRSHELNEEKFTFNGATYMVLSALLLVL
FFPKVIAVTAFAMVAISDTMAAIVGRTLGRHRFGHKSIEGSVAFLLSALV
IVTVVPGIPAIIGIAMAITATIAEAIAGQVSHLKIDDNLVIPLTSASVGL
LLYGWL
>Cag_1531 Preprotein translocase SecG subunit
MHVFIVVLALLAAILLIGVVLLQNPKSGSGLTGGISSLGTVQTLGVRRTG
DFLSKTTAILAAVVMGLCFLAQFTLPNKESDRKEASSVLQKSAPLKPLQP
APSTPNVPVAPAATPAK
>Cag_0982 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II-like
MKNFTEILTQQALKQPDAIAVIREDEHISYQRLEELVWQCVMYLYQQGVR
EGDVVVHIFEDELLIVIAMLANARMGATMLSVPETVPKNFLQETIKEVNV
SFILSDLHINNSIHLNAISLKKEFFKEIKADKSIYVENPKVPWLIIIGSG
STGKQKLIPIDHDKFINRLELTKNWFPIIKSDIVTSFTPLDYGSAKEIFC
QTLTMSATYFLLPKNKISIYLINKYHITILHSTVFHIEKLLQNIAPSSRQ
TLNKLKILLMGASRISSNLRQRIKSILTNNLYVRYGTNELGTVSITSIDK
VFSSNETVGFSIKGIELEIVNENGSIQSKDKPGVIRIKSPAMINGYLNDK
EASKKAFKNGWFYPGDIGKFTEDGQLIHVGRADDMMIMNGINIYPSQIEN
AILQHPDVEEAFAFPLKHKIAQDLPVCAVVLKKESNVSQKILMHYAHEHL
GAYSPQLIVILDKDPRNQQGKIVRTDLIKIVQQKIQEYSKKDINLQAVIP
MRYKQTFTKTQITLKKNIFLDISLLDSYLNNVFDIIIEKLSFNELSTQKT
YLFNIAWRYLLLRKSLLQAINIPVFDTGKVISITENEDKFIIALHTPVIT
NIPLKIYKNINNLTGFVLPQIAINQNKNKIESLYKQIMDKLNALKKQIDV
RYAGKSTLPILQVAHKKQIPFEHLGSGLYQLGWGKKLQIISRSASWHDSF
IGALLSENKILASKRLKNAGFPVPKHAVANTFEEALNIAKKLKFPLVTKP
IDLNRSEGVSVDIVNEELLQKGFELANKNKKAVIIEQQIAGVCTRVFIAN
HKMYYAVKRLPKSVVGDGKHTIQQLIILANQKNEKLPPWKRTEYFPSDND
ASIAIKNAGFIFDDIVEDGRLVPLRKIESAQWGGYDEDVTHTIHPDNIAI
ALRAAELFGLQVAGIDIISPDITKPWYANGAIINEVNFAPLYGGGEIARS
TISEFLDDLIDGDGRIPITVLVGDDNALKQAKKLQEEAIKRNKNVYISTH
NQTFNQHQNEVIITFNSLFQRTKALLMNQEVEALILVVQTDEWLRTGLPI
DTIDNLIMINKNIQDTTHIPNVFGQLITLLKTVNGI
>Cag_0647 putative glycosyl transferase, family 4
MPFFDSQHPVFFILATLVSSVGLWAIVATQHLHGEHSHDTSEGPQKFHSL
PTPRIGGLAIALTLFGFSSFISSLALVTIIAALPAFLAGLLEDLTKRVSP
RSRLIATFISAIVAWWLTGYHITSLDISFIDSLLVYLPLSLLFTSFAVAG
VAHAVNMIDGFNGLAGGTLFCMIGAFTLIAYSFGDLEIVSLGVLILSVLA
SFLFFNFPFGKIFMGDGGAYLMGFLVAWIAVMLPMRHPDVSPWASLVICS
YPVNEALFSIGRRALQKMSLGQPDSEHLHSLIKKNIIRPNFSYLEPHFRN
SMVAPLCWLYALVPIILAVTFYESIVVLIAIWIGSIVLYMLLYWFVARLA
INKESLL
>Cag_1310 HNH nuclease
MKARKKIPMENKVRAELQKQISSKCPFCENEDVGHFQIHHIDENHDNNEI
LNLILICPNCHSKMTKKEIAQEIVFKKKIELLSSQVSNKKEQSSKTINFN
NSKVGNNIIGDNNIIITKTQKSKYPEGCIGFDSIKANYIGYLISRFNEYK
EYEVGKGNVKYGLFGAQLKKYYKIGSTRTIYNLPIEKFDDLVQLIQSKIN
GTKLAKVKGKNHKNYLNFNEYTKS
>Cag_0083 Adenosylmethionine--8-amino-7-oxononanoate aminotransferase
MSPVIDTDFDRHHIWHPYTSINNPLMVYPVVAAHGVMLELEDGRQLIDGM
SSWWAAIHGYNHPVLNEAAHQQLQRMSHVMFGGLTHEPAINLAKQLVALT
PEVLQKVFFSDSGSVAVDVAMKMAVQYWATLDRYEKANFLTVRSGYHGDT
IGAMSVCDPLTGMHTMFRGMLVEQYFAEAPSCRFNEPCSVEDIADLRMRL
ERHHNNIAAVILEPIVQGAGGMRFYSPHYLRRLRALCDSYDVLLIFDEIA
TGFGRTGRMFALEHADVVPDILCLGKALTGGYMTLAATLTTNNVADVISS
GDPGLFMHGPTFMANPLACAIASANISLLQSYNWQGEVQRLESRLQQGLE
PCRALPQVADVRVLGAIGVVELHQSVTMPTIQRQFVERGVWVRPFGKLVY
LMPPYIMEDWQLDLLTAAVCEVVAQLD
>Cag_1357 cytochrome-c oxidase fixO chain
MGITSKPTVFALLAAVVVLVGTTVTVFVPLFMPSTQPVSPYIKPYTAVEQ
EGRDIYMREGCNNCHTQTVRPLRTEVLRYGEYSKPEEFAYDRPFLWGSRR
TGPDLNRVGGKYPDSWHYKHTANPQSMFEKSNMPPYAWLANNKLDTSNSF
KKTSTLGYGYSEQEVAQQIAEYRTMVTNEQYDSKESRNQVTPPELRNELT
EMDALIAYLQKLGRDVKNMEKTP
>Cag_0030 ferredoxin, 2Fe-2S
MTVQNETPFIAHVFVCTNNRGGERKSCADGKSELVKATLKELVDAKGWKG
KVRVSSAGCLGVCSAGPNVMIYPQKVWFSNVSLDDVDEIVAVIESCVGEG
>Cag_1146 hypothetical protein
MKPLLVFITSLIILAALATALPNLLINPGKLTKGHKHLDTNCLACHTPFQ
SVSSVQCISCHKPNDIGVKTVAGVVLPKNSKKVPFHKGVATNSCIECHSD
HKGRIPTKALKPFMHASLPQSIKNNCISCHTQQQPVDALHKKVSSNCAEC
HTTKQWKPATFDHKKITASVAKECISCHKSDLPNDKLHANVSSNCAECHR
TTKWKPATFDHKNLAALGEKSCIACHKSDLPNDKLHANVSSNCAECHRTT
KWKPATFDHKNLATLGGKSCIACHKSDLPNDKLHANVSSNCAECHRTTKW
KPATFDHKNLATLGGKSCIACHKSDLPNDKLHANVSSNCAECHRTTKWKP
ATFDHNRYFRLDSDHRVSCATCHTEQNNYKRYTCYGCHEHSQARIAAEHI
KEGIANYNNCMKCHRNGKAEEKGDDD
>Cag_1384 sodium:solute symporter family protein
MPTLTLLDYSFIGGYMLLTLFIGLWFSKRASENVGEFFLSGRQLPWWIAG
TGMVATTFAADTPLAVAGFVAKHGIAGNWVWWTFVSGGMLTVFFFARLWR
RAEILTDLEFIELRYSGAPARFLRGFKAIYFGLFINAVIIGWVNLAMFKI
IRIMVPELPPEITIVALVLFTTFYSGLSGLWGVSITDAVQFVIAMVGCII
LAVLAVQSPAVVSAGGLTGALPAWMFDFFPNFSHSAEESNSVTGAMSLPL
LSFVAMAFVQWWASWYPGAEPGGGGYIAQRMMSAKDEKHSLLATLWFTVA
HYCLRPWPWILVGLASLVMFPNLPANQKEDGFVYVMQAVLPPGLKGLLIA
AFLAAYMSTLSTHLNWGTSYLINDFYQRFVKRDGTPQHYVLASKITTFLT
AAFALYITFFVLETITGAWEFIIQCGAGTGFVLIMRWFWWRLNAWSEITA
MVAPFIAFTLLQQFTTITFPISLFIIVGVTITATLVVTFATKPTEPAQLE
TFYRTTRVGGRLWKKVSDTLPDVQSDSGFGMLLVDWALGVVMVYTILFGT
GRVIFGEIGTGILFLAIGAIAGTLIFVDLNRRGWNNLQ
>Cag_0640 NADH dehydrogenase I subunit 6
MITPQYYTAIFYLFAAITVLSAAYVVFTRNVIYSAFSLLFTFFGMAALYV
FLSADFIAVTQVVVYVGGILVLLLFGVMFTNSIMQTKLKSDVLHVIPGVV
LLVGMIGSMLYMYYTTSIWKISATPLTESIVERVGFETMSRYVLPFEMVS
ILLLAALIGAAFLARFEKNSK
>Cag_0024 conserved hypothetical protein
MTNIAPLRFGVITDIHYTLDGSIATEQLAAAIRTCFASWQKRGITQALHL
GDCIRGDEQFKYEELRQVLALLQEFQGEMFHVAGNHCLLMPRQELLAALG
LQSTFYSFAMQGFRFIVLDGLDVSLFHPQADAEDAALLAHYLQHPQLHDY
CGAIGKMQQAWLQAELASAERARETVIILSHLPLLPEVSAEPYGLLWNHQ
EIAALLSASSTVKACLSGHYHHGAYAVRNGIHFMTLPAFSHQAQNPLALG
MVLELEPSMLRMYNQYNEVVFCCTLR
>Cag_1823 Ribosomal protein L17
MRKGKPARKLGRTAAHRRATLNNLSTQLILHKRIETTEAKAKETRKVIEK
MITKARKGTVHAQRDIFKDIRDKKAIRILFEEVVAKVGARNGGYTRVIKL
APRLGDAAKMAIIEFVDYSEAISPAASQKSSKQDRAKRVQGSKKNVDAVA
ESAE
>Cag_0954 Twin-arginine translocation pathway signal
MSKKITRRDFLKVASLLGGATLLRPLWGSNSAVPQGGANAVASQVAHANT
WVPSICNFCSSFCNIMVATEESEGVKRIVKIEGNANSPLNRGKICARGQA
GLRQTYDPDRLKQPLIRVQGSKRGEWRFRAATWDEAYNYIAAKLMKVNPW
EIAMVGGWTACVSYMHFSLPFTRTLQIPNIIASPLQHCVTAGHLGTDLVT
GNFNVHDEILADFEHARYILFSMNNASVAAISTARAVRFGEAKKNGAKVV
CLDPRMSELASKADEWIPIKPGTDHAFFLAFLHLLLREKLYDETFVQKHT
NAPFLAWKDEQGLVHLAGESSNGKPSSYTLFDLNSQRVVTVAGHTNSNER
SSGGARLLPALHAPEGTVWNGHSVKTVFDFFIEESATFTPEWAAAITDVP
ATTIRRIAREFGQARPALVDPGWMGARYHHVIGQRRLQAIIQTLVGGIDV
PGGWLMSGEYHHKAEKMAHGSHSGDVPPVEKPGMGFAFGLLDIFGNPAAW
EHGKPAVSFAWAMEQQKQGKPSVFIPAMADVGLLESVKGELQFNGEPYMT
KAFIMNAANPIRHYFPASRWKEMLSHENVELVVAIDVLPSDTTLYADVIL
PNHTYLERNEPLLYPLGPSTDLGYTTRLRSIKPLYNSRDTADILCEIARR
MGKLEPFLDGVAEYAGLDKAMLQSEINTATQAGTPLNDAFLKTAYAAMST
FAEQVNGTPISAQEVEATIRTKGMLLLKSADAVLEESAMPGKLALPTMSG
RLELFSPILASFAQAAGMQPIFNPVLGYVPRMMQQSGDKTTLSSNEFYFT
YGKTPLVSHASTNTNNTLLVSITAQKEGALRGLWMNSRKATALGLENGQT
VKVKNLRYGASVTATLFTTEMIRPDTVFLPSSHGSSNKLLRVAAGKGTPL
NELMPYSIEPIAASFMSQEFTVSVQAV
>Cag_0044 phosphoglycerate kinase
MEKKTLSDIATQGKRVLMRVDFNVPLDENKNITDDKRIVESLPSIRKVIE
EGGRLILMSHLGRPKGKVNAEFSLAPVAARLSELLDCPVGMAKDCIGTEV
MQQVLALQDGEVLLLENLRFHAEEEANDADFAKELASLGEIYVNDAFGTA
HRAHASTEGITHYVQTAVAGFLIERELRYLGKALQEPERPFVAILGGSKI
SGKIDVLENLFKKVDTVLIGGAMVFTFFKAQGYEVGKSLVEESKIELALS
ILEQAKAKGIKLLLPTDVVVTAEISADAESSVASIADMPNNLIGVDIGPE
TAAAYRNEIIGARTVLWNGPMGVFELDNFATGTIAVAQALADATAQGATT
IVGGGDSAAAIAKAGLASEITHISTGGGASLEFLEGKELPGIAALNN
>Cag_0365 2-oxoglutarate ferredoxin oxidoreductase, beta subunit
MTDTHTPLTAKDFTSSQEPKWCPGCGDYAALQQLKNAMADLGLKTEDVVV
VSGIGCSSRLPYYIATYGVHGIHGRALPVASGLKSARPELSVWVGTGDGD
ALSIGGNHYIHTIRRNLDLNVILFNNEIYGLTKGQYSPTSKVGLKTVTSP
NGVVDYPMNTAALTLGAGGTFFARVTDRDGKLMREIFKRAAKHKGTSIVE
IYQNCPIFNDGAFDVFIDRDRKADTTIYLEQDKPLVFGKEQEKGIRLDGF
TPVVVNLNDGSVSKDDLWIHNDKDFIKANMLARFYDDPDSVENFLPRPFG
IFYAEQRFTYEDALTAQIEQAQANGAGTLEELLAGPSTWTNQ
>Cag_0230 7,8-Dihydro-6-hydroxymethylpterin-pyrophosphokinase, HPPK
MQQHTVYIGIGSNIGDRMSHLREALAMLNNLESTSVTAISAIYMTEPVGE
INQERFYNGVLAVTTSMQPEALRQECKAIERTIGRPATYQRWSPRVIDLD
LLLFDTLVCQTDTLAIPHPELHHRNFVLIPLLDIANPTHPLLGKSIRELL
ALCPDRSVLIKLQENIAL
>Cag_0417 NusB antitermination factor
MKTYRRQIREKILQALYTVELRGITLDEAAGWLLTEEILADPNAMKFFNL
LLSSIKAHREEIDNYIAQQTFNWDMSRIAIIDKNIIRMALTEILYCEDIP
PKVSINEAIEIAKKFNSTDKSSKFVNGILDAIFNKLKTEGKVHKNGRGLI
DQSFSRPQKPESEATEIEE
>Cag_1279 NAD+ synthase
MVQNLDLNYGLVEEMLLSFLRRETGKFGFTSVVLGLSGGIDSAVVCELAV
RALGAPNVLAILMPYETSSSASLEHASLMVQKLGISAETIAITSVAHAFF
ASIPDNQLLRRGNVMARTRMMYLYDISARDGRLVMGTSNKTELMLGYGTL
FGDMASAINPIGDLYKTQIWGLARHLGIPSQIIDKAPSADLWEGQSDEAD
FGFSYEEVDRLLYMMVELRMDKATMLAAGVSEALYERVRRMVVRNQYKRM
MPVIAKISARTPGVDFRYARDWQEVR
>Cag_1568 hypothetical protein
MATYSSFEQLNFYRSMQLTITLPDILPDEISRVIKKVKEIFSQEGIAAEI
TPEPLSTDAWDSLNFDEIAVDTGRVDFAENHDHYLYGIAKRS
>Cag_1242 VCBS
MQIIKVIPMAESVVVGRREVVFVLGNLADVKSLLDGVLLGLEVHLLDPLG
DGLTQMANILAGQSGYDAIHLLSHGSSGELQLGATMLNSSNVNSYAGVLG
QISGALAPSGDLLLYGCNVAASTEGQLLVDTLARLTQADVAASEDLTGSV
AIGGDWVLEYQAGTVESTLPFVDGMVVGYELTLVNSITLTVPSSPTINEE
ASYDFDGFDIANSGSGTAFEAIVTLTTPSNGTLSLSDYSGIEILSRGSLT
EVKTSLNDLVFQDAANYSGTVSITVTVNVYTANYVPSDLLRSDTKTFTIT
VNPVNDAPVFSDSYSPMLTGINEDVQDASNTGTLVSALVVDGSITDVDGT
AFEAIYVTAVDTAHGRWWYKVGTGSWTEFNFSGGNTSKGLLLSATDIVCF
VPAANWFGSPTLTFGAWDTSSGSVGTYATISSTGGETAFSSVTDTASITV
SPVNDDPTTSNAALSVNKNSTLTGINLTSYSHDNDTGSNNSTDAAITGYK
ILTVPTAEHGQLQKSDGTAVTVDMTLTPTEAAALKFVPTTNYTGSATFTF
QAIDAANVVSNTSTATISVQSVNMDPAISVPESLQTTQQIYEESTLTLDN
ISLSDADADSARVKVVVTATHGTVTLKQLTNLYDAATGGNTLSNATGASL
TLYGTLTDITNALDDLTYTPEKDFFGSASVTLTVDDLGNTGGNAKTDSET
INITVYNVPDKPVVSGEVTLEAINEDTLDPSGASIWQIIVTDSEAFSDAD
DNDLDGIAISADASTSAQGVWQYSANDGVTWKNIGTVTESSALMLSYETL
LRFVPVTNWNGEPGSLTFYAVDDSDYRDFTSNAERQSADFSDGNEKDIVG
ESQSLQATAATEQNGTRVEDLFETVFNNAKSEGETFTGIAVTLSGGDSEA
GVWQYSANNGETWSNVGAVAAGSALLLDGDTLLRFTGSTAGSLSCYMLDQ
SGDRTFTTGDERQSIKISDGGDDLAAAGHSLGTTITPVPDPFVIVNDFAL
RLDEGATKSINSNVLKLTSVDGTASEITYTVTALTLVGGEVQYDSDKDGS
FETVVTTGTTFTQDDLDNSRLQYVHNGGEPTSADQSITYTVWVPTSGGTT
LTNRKLTINVSPVNDIPTLYVPGDTPPSSTQLTANVATSGGSLTFSTSNI
QVVDPDNTNKQLVFRLVESLPQHGTLTIDGNEVALGTVFNYANLAKLVYT
HDGTSATTDSFRVTLRDGAGGEVTKVINLTIGTLATQAPSGIGNLTTTIY
EDPLTSTGSNPGVAIKNLTGYTFSDSDVGATVGGIAVVGNPQNSAQGTWE
YSTDGTHWAAVGTVGDNATTQALVLSPDTFVRFVPVTNYNGTPTPLTIRV
LDNTYGGAVSVSKLVDSTITETRVLLDTSTHGGSTAIADTTNTLGISVIA
VDDDPTLVNHSGTLDSSSNNTLTISSYSFYTLPITSSMLRVDDIDSSASQ
RTFTLTSQPTHGIVIKNVNSTWTKVLDNASFTQADIDSRLIRYLYFDHDI
PTTYSDSFTFTVTDGDVRIKPDPQRPGGIYADTSTTTLSTLTFNLTITDT
WSTSEGGGGGGTGGYTFPTLSPTAVPSVTTGTLTLDEGDQDVTVTTAILT
AIDSDTTDATQLVYTLTSLPTNGTLKLSGSTLQINSTFTQADIVAGSLKF
SHDGSEDFSSDFKFYVSDGGNVTTVKTLSIDITPVNDQPVIATATTAKVL
EGNSLVLRGGTLNPTTNVLSGGVVGAYDVDGVDDDKNLALDTLTYTVSTL
PTHGQVRLEVSGKTYSDSDASTYTVVATDTVISLADLNVGKLHYMHDGGE
STSETLTFTVNDNSGATNATASASSSFTIKIISLNDDPTVTVNTGLIGAN
AINEAATQVISKTDLTGYDPDNLTDEIQFRITTNVQYGQLLLDGKLLGAG
SAFTQADIVANKLSYKHDGLESEITAGHFTDFFNFRLSDGGGGNEPSGTF
TIHILPVNDAPTIIAPATRKVAEEQQLAISGVCVADLDSVNRDLSINSSF
GPIIVTLTALHGIIDLTASGSAVLADDGTASVTVTGTLGEVNATLASLVY
AGNANFNGDDSLTIDVSDQGYSGSGGTLTASQTITITVTPINDRPVNTVP
SATQTLDEDATRTFSSENGNALSVTDIYDTAMSGSTDSLRTIVSVEHGTL
RATTGGGAIISFNNSASVRIEGTAAQINAALEGLVYTPNEHYNGDDKLTI
YTNDFGYNGTGNILIDNDTVDITVNPINDAPTRSSATATATLSSIYEDRA
TSSLDANVVEPVGATVSSLFTSAFSDATDSVTDGSNANELAGVVIVGNSA
NAETQGTWQYYNDFSWSAVGTVSTSDGLYVAGGDKLRFVPVAEFNGTPTG
LTVRLADNSTGTLPTTGTRSLDVSDDTTTSGSTTRYSNSSNAVTLNTSVT
AVNDAPTMTIGTATGFTVTEDGSATVNLASANFTLGDIEAARNEGSGGTQ
GKVSITFSNANGVLHVNSGEWSVSGNDSSNLTVTGTIAEVTSALTGLTYK
PGDDPNSTETIAVLFSDLGNNGSSVTEAKTVSGSLEVTVTPVNDAPIATG
DATLADVSEDAGAAAWSSGTPTNPNYGDLAAPTGNTVSNLFGSLFSDVDE
TTSAHTFTGIAITANAEDGTNGHWEYSTDGGTNWTPIPTTGLSDSAALVM
GATANDKIRFNPDADNYNGTLGALTVRLSDGAGFAASSTVSDLKVLATSA
SDGWSNTITLSTAVSQVNDPPAIANLHDDNVAFVEAVGVSVAGTAVYLDN
TTEGFQAATFSDVELTLRKETTFNGATLTVHQQTTIDANDFFMLPTGGSI
SIQGAPVYVNGLTLFANGSSVKYYDGSTTKTVAVLTNNSLDGELRLTFNS
NATQAAVNAILQRLAYSNDNDKLENTNKNIDIIFVDGNGSTSNAQGTGGA
LTTTATVHIALTPSNDAPSFTTGVTLSGTEADAAGSPLLPSSPTTIATLF
DSKFSDPDNVSGNTLAGVAISAFVEAGRGVWQVDVDGGDNSWVALSTLRP
NSADISATNALLLSKDAQIKFVPNADANTAGLITPPRLTLFAVEDSIPTG
ANASADHAPAITFSTSGALQSYNTTTDTVEARVSATSVNVDVSISAVNDA
PVVTLATTSPNTYTEGIDTAENRSVLGEAVVIDGSVQIADVDITLGEDTF
AGTTLTVARNDGSSGFSANADDVFGFATSGSVTTTGTLASGTVSVSTVEV
GSYTYSSGTLTITFGNVTKEQVNAVAQAITYANSSDTPTASVSLRYRFND
DNHSSAQGSGNSLTGEDVITVNITAQNDMPLAVNDTHQITEDATSITGNA
ITGVGSPSTTADSDPESNSLTVSAIRTDTEVSGSGTSGTVGVSLSGSYGT
LVIAADGSYTYTLDNNNAAVNKLKTSEHLTEYFTYTLSDSALTDTAQLTI
TINGRTDGAPTITANDGNSTETGQATVSEVGLTSDGGTAETTTGTVAISA
LDGLTSITVGGTTVSREALASLSGNHVTIDTVEGTLTLTGYTSTSSVGGI
TTAGTLSYSYTLTARLAHSGTTESTDTIALQITDEGGATQNVNSLVICIV
DDVPTASNDAPSITEGTAGSPASNIVGNVVSGTTGGDAADRLGADVPTNP
VTAVIKGAVTPTTAVASGSTSSSNGTVVAGNYGSLTIGADGSYSYDLDDA
NADVNALKTGSSLTDTFTYKITDADGDSSTATLSITINGTTDNAPTITAV
DGNNGDTAGHVTVQEAGLTDVGVTSETTTGTITLTAPDGLLKVAIGGTEF
TVAQLATFTAQAPSTGIDTSEGTLTITGFTNTTGAISAPIEGTVHYSYTL
KAALTHTDATESTDTIAISITDKGNATVNGSDLVIRIVDDVPTANADAHS
VAEGTTTAPTTTSGNVVSGTSNGDVADRVGADTTATPVTAISFSGNAKTV
GTAFDSTYGSLTINSDGSYTYSVDNTNATVTALNASQSLTETFVYTITDA
DGDSQSANLVITITGTNDAPTITAGSTIATGAFTETADTTNSPTADTVSG
SIAFADVETGDTHTTSVTSRNYVWSGGTLSVAQQNALASAFTLGEKTDSN
GSGTQAWQFSAADSIFDFLAAGETVTATYTITVTDNGSPNASCTQNVVVT
ITGTNDAPTITLGTGDSDTAALPETDAKLTASDTLTITDVDVTNTVTASV
QSLETTGDVNYVNPETLLALFTVSPTNIISNTETSKSLTWTFDSGTHFFD
YLEKDETLTLAYTVQVTDSNNPAATADKVVTITITGTNDVPTLTIAAPDS
VEELAGASTQDLSAITGNLAIADKDVSNTLTPTQGTPTVVWSGGASLPAG
YNVSALTAENILTLGAAGTSTGGDVNISWTYNPSAVNLDFLANGETLTVT
YPITINDGKGATDTENLVITITGTNDAPDINVDSGDKAVDTLAETNAGLS
TSGTLTVTDLDYTNTVSAQVFSVSKSGTTVGIVPNDTTLKGYLTLTSPSI
INATNTTGDIAWNFNSGSQAFNYLAAGENLVLTYTIRATDSNTSAATDDQ
TVTITITGTNDAPVITTIAQTNLTEQTTTDALTTTINATFSDVDLTDVGH
TAQITTVSKDGVITGLSLTDEQLKVLITIGTVTKTSGAAAGSVPMTFTAA
STIFDYLAVNEVATLTYTLEVNDGDGGTHTQTFVVQITGSNDLPLLTATN
VTGEITELVTPSNNLTDSGTIAFTDVDLSNTHTVSVSIVASPLGALSAVE
NSDTTNGTGGQLTWSYSVAASAVEYLAEGETKVEQFDVLVNDGTGSSTQR
VSVTITGTNDAPTITGAIADFGFTETTDAAAQDLSRNGTLSFNDIDATNV
IDVTKSLKTAAVWSNGTIDTTLKSALEAGFAISGTDVAAPGSVNWTYNVN
DAALDFLAKDETVTLTYTVTITDNNGLTATDDVTITITGTNDTPDITATD
VNGTVTEDAALSLTDTGSISFTDLDTTDTSDATVALFSTTTTTGQAIPTA
LTDALANANAVVLSGDIVDKHAGAITWDFALDNSLTQYLAAGETVTATYQ
ITVTDDSGVTTASGSNEVNVRTQNVTVTITGTNDAPVLSNTPDLTIEQTE
DDAAPSGAVGTLVSALISGITDADATNPKGIAITATDSNRGTWYYSTTVT
PDWHSFTVSDASQSLLLSADANTRVYFKPNPDWHGEITSGLTICAWDGST
GSNGGTANITATGDITAFSTVTDTVSVTVSALNDQPTISSDVTISAFSED
VTAPTGTVLSGLTFGYSDVTDDQNDNNTATLIGGDTLTPFTFLAVVGSTD
YTAAQGTWQISKTTSPNANTSSDWIDIPTTGLSTTSALIFNADSKVRFVP
AGNYFGTPGTLTVRLADASVTLTASTSATDYKNLNDTANGDLDLTTGAWS
STDQTLGTTVTYVNDAPTISDDTQTHTAVNEDVATDDNSGALVSTLFSAS
YSDATDNQGANTDNAYENPGAAITGGDGEATPFGGIAIVKNDATADQGEW
QYSTNGTDWTPIVTDISNDKALFLPTTAKLRFAPAAHYNGTPGNLTVRLS
DATVSEIAIAQNISGIIGGSNQWSLGTVALSTSVSAVNDAPVLSATTVFT
GTITESSTAGVGTETPQALLTGITVSDIDLSAANLVSDVFGAGIITVSLT
DRTAGDKFTLNNNLSTSDGVALTTDGIADSGNYVINLTSSATLVQVKAIL
EAIRFEHTSDTPPTAARSFTVTLNDGNNDQGTPDAGGPSSLNATTTLTGS
ITITQANDPPAITDGPDSASLNETDAGLISSGTLTVTDVDTADTVTASRT
LAVSGNSKRSDAAAPSNETLLGMLTLSPTPVIDGLNTTGSLAWSFASGSE
TFNYLRKGETLILTYTVTATDNGSGTLTDTETVTITITGTNDTPAITGGS
DSASLTETDTTVTTTGSMTVTDVDTADTIVLTVDAVALSGTFTSSSSTLP
SSLSASSYQALLNMLVLSPNAALVADATSGTDFTWTFTSGASGDRAFDFL
RKGETLILTYTIKATDNSGATGGDQSASTTSTVTITLTGTNDTPAITDGA
DSANLTETDTTLTATGSMTVTDIDLTDTVSVAVTSVVRDGGTFAGTVPSA
LTDSSNAALKAMLSVTPNSELAADPNAGTSFTWNFASGGSGDSAFQFLAK
NETLVLVYTITATDSSGVSSGEVTTTTSTVTVTITGGNDAPTISNVVDLN
FAESAGDSSAQDIGATTGTLTITDQDLGDTLSITVSADATAKYNGGDVPT
EGSVSVATLVAKSAISFAPPVTTNGESQDVVWTYNPAAADLDWLRAGENL
ELTFVATISDDKGGSTTQDLVLTITGSNDVPSVTTTTPSAIVEITGDSSG
QDIGATTGTLTITDQDLGDTLTLSVSNDATAKYNGGTVPTEGSVSVAALI
ASGSISFAAPVVTNGESQDVVWTYNPAAADLDWLRSGENLVLTFVATISD
DNIGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQDIGATTGTL
TITDQDLGDTLTLSVSNDATALYNGGTVPTDDSTVSVATLVAKSAISFAP
PVKTNGESQDVVWTYDPAAADLDWLRSGENLVLTFVATISDDNIGSTTQD
LVFTITGSNDVPSVTTTTPSAIVEITGDSSAQDIGATTGTLTITDQDLGD
TLTLSVSNDATAKYNGGDVPTEGSVSVATLVAKSAISFAPPVTTNGESQD
VVWTYNPAAADLDWLRKDDTLVLTFIATISDDKGGSTTQDLVFTITGSND
VPSVTTTTPSAIVEITGDSSGQDIGATTGTLTITDQDLGDTLTLSVSNDA
TAKYNGGTVPTDDSTVSVATLVAKSAISFAAPVVTNGESQSVVWTYDPAA
ADLDWLRAGENLVLTFVATISDDNIGSTTQDLVFTITGSNDVPSVTTTTP
SAIVEITGDSSGQDIGATTGTLTITDQDLGDTLTLSVSNDATAKYNGGDV
PTEGSVSVATLIAKGAISFATPVLTNGESQNVVWTYNPEAADLDWLKAGE
NLVLTFVATISDDNIGSTTQDLVITITGSNDVPSVTTTTPSAIAEIPSDS
SAQDIGATTGTLTITDQDLGDTLTLSVSNDATAKYNGGTVPTDDSTVSVA
TLVAKSAISFAPPVKTNGESQDVVWTYDPAAADLDWLRSGENLVLTFVAT
ISDDNIGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQDIGATT
GTLTITDQDLGDTLTFSVSNDATAKYNGGPVPTADSTVSVATLVAKSAIS
FAPPVKTNGESQEVVWTYDPAAADLDWLRKDDTLVLTFVATISDDKGGST
TQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQDIGATTGTLTITDQD
LGDTLTLSVSNDATAKYNGGPVPTADSTVSVATLVAKSAISFAPPVKTNG
ESQEVVWTYNPAAADLDWLRAGENLVLTFVATISDDKGGSTTQDLVFTIT
GSNDVPSVTTTTPSVISEVTGDSSAQDIGATTGTLTITDQDLGDTLTFSV
SNDATAKYNGGPVPTDDSTVSVATLVAKSAISFAPPVKTNGESQDVVWTY
DPAAADLDWLRTDDTLDLTFVATISDDKGGSTTQDLVFTITGSNDVPSVT
TTTPSAIVEITGDSSGQDIGATTGTLTITDQDLGDTLTLSVSNDATAKYN
GGPVPTADSTVSVATLVAKSAISFAAPVKTNGESQDVVWTYNPAAADLDW
LRAGENLELTFVATISDDKGGSTTQDLVFTITGSNDVPSVTTTTPSAIVE
ITGDSSGQDIGATTGTLTITDQDLGDTLTLSVSNDATAKYNGGPVPTDDS
TVSVATLVASGAISFAPPVTTNGESQDVVWTYNPAAADLDWLRAGENLEL
TFVATISDDKGGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQD
IGATTGTLTITDQDLGDTLTLSVSNDATAKYNGGTVPTEGSVSVAALIAS
GSISFAAPVVTNGESQSVVWTYDPAAADLDWLRAGENLVLTFVATISDDN
IGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQDIGATTGTLTI
TDQDLGDTLTLSVSNDATAKYNGGPVPTADSTVSVATLVAKSAISFAPPV
TTNGESQSVVWTYDPAAADLDWLRSGENLELTFVATISDDKGGSTTQNLV
LTLTGSNDVPSVTTTTPSAIVEITGDSSAQDIGATTGTLTITDQDLGDTL
TLSVSNDATAKYNGGTVPTDDSTVSVATLVAKSAISFAAPVTTNGESQDV
VWTYNPAAADLDWLRAGENLELTFVATIADNNGGSTTQDLVFTITGSNDV
PSVTTTTPSAIAEIPSDSSAQDIGATTGTLTITDQDLGDTLTLSVSNDAT
AKYNGGTVPTDDSTVSVATLVAKSAISFAAPVTTNGESQSVVWTYNPAAA
DLDWLRSGENLELTFVATISDDKGGSTTQDLVFTITGSNDVPSVTTTTPS
AIVEITGDSSGQDIGATTGTLTITDQDLGDTLTFSVSNDATAKYNGGTVP
TDDSTVSVATLVAKSAISFAPPVKTNGESQEVVWTYDPAAADLDWLRKDD
TLVLTFVATISDDKGGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDS
SGQDIGATTGTLTITDQDLGDTLTLSVSNDATAKYNGGPVPTEGSVSVAT
LVAKSAISFAPPVKTNGESQDVVWTYNPAAADLDWLRAGENLELTFVATI
SDDKGGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQDIGATTG
TLTITDQDLGDTLTLSVSNDATAKYNGGTVPTEGSVSVATLVAKSAISFA
APVTTNGESQDVVWTYNPAAADLDWLRAGENLELTFVATISDDKGGSTTQ
DLVITITGSNDVPSVTTTTPSAIAEIPSDSSAQDIGATTGTLTITDQDLG
DTLTLSVSNDATAKYNGGTVPTDDSTVSVATLVAKSAISFAAPVTTNGES
QSVVWTYNPAAADLDWLRSGENLELTFVATISDDKGGSTTQDLVFTITGS
NDVPSVTTTTPSAIVEITGDSSGQDIGATTGTLTITDQDLGDTLTFSVSN
DATAKYNGGTVPTDDSTVSVATLVAKSAISFAPPVTTNGESQDVVWTYNP
AAADLDWLRAGENLELTFVATISDDKGGSTTQDLVLTITGSNDVPSVTTT
TPSVISEVTGDSSAQDIGATTGTLTITDQDLGDTLTLSVSNDATAKYNGG
DVPTEGSVSVATLVAKSAISFAAPVVTNGESQSVVWTYNPAAADLDWLRS
GENLELTFVATISDDKGGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITG
DSSGQDIGATTGTLTITDQDLGDTLTFSVSNDATAKYNGGTVPTDDSTVS
VATLVAKSAISFAPPVKTNGESQEVVWTYDPAAADLDWLRKDDTLVLTFI
ATISDDKGGSTTQDLVFTITGSNDVPSVTTTTPSAIVEITGDSSGQDIGA
TTGTLTITDQDLGDTLTLSVSNDATAKYNGGPVPTDDSTVSVATLVAKSA
ISFAPPVKTNGESQEVVWTYNPAAADLDWLRAGENLVLTFVATISDDKGG
STTQDLVFTITGSNDVPSVTTTTPSVISEVTGDSSAQDIGATTGTLTITD
QDLGDTLTLSVSNDATAKYNGGDVPTEGSVSVATLVAKSAISFAPPVTTN
GESQSVVWTYNPAAADLDWLRSGENLELTFVATISDDKGGSTTQDLVFTI
TGSNDVPSVTTTTPSAIVEITGDSSGQDIGATTGTLTITDQDLGDTLTFS
VSNDATAKYNGGPVPTADSTVSVATLVAKSAISFAPPVTTNGESQDVVWT
YNPAAADLDWLRAGENLELTFVATISDDKGGSTTQDLVLTITGTNDAPTL
TALDACSYTEGAEATLIDSYVTLADVDASTRMNGGTVTVSITEDGLTTDQ
LSILTQGSEAGEIGVSGSTVSYGGTAIGTIDSTSNGVNGVALLITLNGNA
TPTAVDALIQRLAYRSTSDDPTQASATRTLSITVVDGDGTANGGTDSVMA
TSTLTITPLNDAPTITPTAGAASYTENNGAITVDSAITVTDADDTQIANG
TVTISSNFLAGDLLAINLIKGTGDNEGKFILAGSVQTNISGSYTSGTLTL
SGTDSVANYQAVLQYLTYEHTSDDPTNNTLKPNRTLTYSLTDANSDGAGA
ATGTATRTINVTALQDNPQVTTTTATAQVYTENSDPVIVDSALTLTDADD
TEMSGATVQITENLKAGDLLAVNLTIGTDAHAGKFILAGTVQTNISASYA
DGTLTLSGTDTKEHYKAVLRAVTYVNTSNNPNTNNATDPLARTITFTVTD
ANSDAVGAANGINTRTLDVTAENDKPVINGTVISPTSVETNGEGSGTSVV
KLLSGSTVTDADFFTAGTNFGGGNLTVNFTDAYVVGDVLNVESCTLAVGA
IQRSGNDVQYSSDGTTWITLGTVDNTNSGVGKSFVINLNTNADQTNVAAL
LNAISYQSTSDNPTLNNSDTSRAYSITLNDGNNNNLAGGTDEASQTSIAV
TGTITITPTNDAPIVDLNGATEGAASSVTWAESSNATHQAVTISPSATLA
DVDNLNFTQMQLVISGLHNGNSEVLTIGGTVFPLDKNATNVDVGNFVVSY
DTSLHAFIIIPDESGTIETLTNFQTLLQGITYNNTTDNPTVGDRTVTVSV
TDAGHSDSATVSGAVTSVVATATITVTSVNDQPVITDVTNVSFSENAINA
TASVIDSSITLIDIDSAIYDGGSVTVSGLVAGQDKVALPLAPTDASGNVK
WTGVNGGAVSYYNGTAWIAIGTATGGDGNNFVVSFNSSATPAIAERVIEN
LTFANSSHNPSTERTLTIAVNDADGGTVQTADVAVTIVRDNDAPTISSLD
ATAATTYIQAGTAVALDNNVALSDFDLEAYGNGSGNWSGSTLTIQRQGGA
STDDLFGASGTLSLSEDNVVVGGTTIGTYTNSGGTLSMTFSTSATTALVN
SALQNITYSNADTVSGHLGYNSVVLAYTFNDQNSNATNGTAGTGQDQGVG
GYATASGTITVNINRLPVVVNDTNSVAEGLATTDSTTISGNVLTGVGNTS
NVGADSDADVGLLGRSDALVVVHAKDSSDGSYTAITASTTSANGSTIAGD
YGSLKIGADGSYIYRVNNALNVVQALAINETLTETFAYQVHDGVGGYNAA
ALTITITGTNDAPVTASITQTNLNEQITTSDLTSNITASFSDVDLTDIGH
TAQITAVTVTGVTAGLSLTEAQLKDLISIGTVTKATGSTAGSVPMTFTAA
STVFDYLAVNEVATLSYTLEVDDHDGGKPTKTFVVQITGTNDAPVITATD
VVGTITEGSTLSDSGSISFGDLDLTDRPTAAEATKSVSALQANGTTELVL
TNAQQTAIENAFTITADSGNDNDGTISWNYSISETALDFLAKDETVTVTF
TITVSDGKGGSDSEDVTVTIIGTNDAPVITATDVVGTITEGSTLSDSGSI
SFADLDLTDRPTAAEATKSVSALKANGSTVLVLTSAQQTAIENAFTITAD
SGNDNDGSISWDYSISETSLDFLAKDETVTATFTITVSDGKGGSDSEDVT
ITITGTNDAPLITAPNVAGTITEGSTLSDSGSISFGDLDLTDRPTAAEAT
KSVSALQANGSTALVLTSAQQTAIENAFTITAESGNDNDGTIAWNYNISE
TSLDFLAKDETVTATFTITVTDGKGGNDSEDVTVTIIGTNDAPVITATAE
NIAGTITEGSVSTLSDSGSISFGDLDLTDRPTAAEVTKSVSAVKADGSTA
LVLTSAQQTAIEDAFSITAAAENDNDGTISWDYSISETALDFLAKDETVT
ATFTITVSDGKGGSDSEDVTITITGTNDAPLITAPNVAGTITEGSTLSDS
GSISFGDLDLTDRPTAAEVTKSVSALKANGTTELALTDTQKTAIENAFII
TPNGVSGSNTHDGTISWDYSISESALDFLAKDETVTVTFTITVSDGKGGS
DSEDVTVTITGTNNAPVITATDVAGTITEGATATLSDSGSISFGDLDLTD
CPTATEATKLISALKANGSTALALTNAQQTAIENAFTITPNGVSGSNSND
GTIAWNYNISETSLDFLAKDETVTATFTITVSDGTGGSDSEDVTITITGT
NDAPVITGVDVIGTITEGAASTLSDSGSVSFGDLDLTDRPTAAEATKSVS
ALQANGSTALVLTSAQQTALENAFSITPNGVSGSNTNDGSISWDYSISES
VLDFLAKDETVTATFTITVSDGKGGSDSEDVTVTIMGTNDVPTITNQSTA
LAGTVIEAGNNDDGSEVAGTSTVSGTLSASDVDAGATQTWSIQGTPSTTY
GSIVINPTKGEWTYTLDNTKATTQALKEGQSVTQSYTARITDDKGAYVDQ
TITVTIMGTNDVPTITNQSTALAGTVIEAGNNDDGSEVAGTSTVSGTLSA
SDVDAGATQTWSIQGTPSTTYGSIVINPTKGEWTYTLGNSDSDTQALKEG
ESVTETYTARVTDDKGAYVDQTITVTITGTNDIPTITNATTALAGTVIEA
GNNDDGTAVAGTSTVSGILAASDVDANATKTWSIQGTPSTTYGSIAINAT
TGEWTYTLDNSDSDTQALKESESVTQSYTARVTDDKGAYVDQTVTVTITG
TNDAPVITNATTALAGTVIEAGNNDDGTAVAGTSTVSGMLAASDVDASAT
QTWSIADVSPSTTYGSIAINATTGEWTYTLDNSDSDTQALKEGESVTQSY
TARVTDDKGAYVDQTITVTITGSNDAPTVTNATTALAGTVIEAGNNDGGS
STAGTSTASGIFAASDVDASATKTWSIADVSPSTMYGSIAINSATTGEWT
YTLDNTKATTQALKEGEIVTQSYTARVTDDKGAFVDQTITVTITGTNDQP
IAFDDTKQTNEKSILSSQVPPATDVDGTIASYELVESVSEGSLTFNADGS
YSFNPANAFNDLGVGETRNVSFTYKAVDNNDGRSNAQTITITVTGTNHAP
TSTDDTVAVTEETAKTLTINDFGTFSDADAGDSLSAVVITTLPANGTLTL
NGTPISEGQSITVADINAGKLVYTPASQDDTDESLSFKVKDADGAESSSA
YTLTLDILPVDDPPVSTNDTVAVTEDTAKTLTISDFGTFSDPDSGDSQSA
VVITSLPSNGTLTLNGTPVTANQTISVADINAGKLVYMPALHDDTDESLG
FKVKDADGTSSNNPYTLTLDILPVEDAPSATNDAVTTNEDIFVVLALADF
GTYSDPEGAALASIKITSLPTNGVLHYNSGTVQTPLWVPVPLNHEFTIAS
LEAGMLRFTPDANENGTNYATIGFNVSDGTAYSTSASTLTVNVLPVNDTP
KSSDDLISIQENTPKVLALSDFGTYSDVEGSALSSVIISALPQKGVLEYF
DTTLATPAWRAVTVNQHITKADIDAGRVQFMPASNENGDNYATIGFKVSD
GEAISEGYALRVDVIATNDAPLRTEEDVALALTMYDISRYSNTASTSIAA
VKITTVDGSGELEYFNGTAWAPVTLNQVITKAAIEDGLVRFMPASNENGD
NYATVSFTVSNGTAFSATPSTFTVHVTPVNDAPTSTNSRIATDEDSSILL
SLTSFGTYSDVENTPLTAVQFTTLASNGVLQFNNGSQWGTVRVGQELSAA
AIEAGNLRFVPDSNEFGLAYAIAGFKVSEGGNVWSNSSYTLTVDVASKND
VPSTTDSVIRTNEDTSKVLTVGDFGDYRDAETTSFTVVKITSLPTNGELQ
CNSATLAAPVWRAVTLNQELLRESIEAGRVRFMPDSNESGDNYATLGYQV
FDGEAYSAASNTMTVHVTPTNDAPTSTNDGIVTNEDVAAILSLENFGDYR
DAEGAPLGMVQFTTLPTHGALQYNSGTADAPQWVAVTLNQPITREAIAAG
ALRFMPAANEFGDDYTTMSFKVGEVSANGDVWSEAAYQLTVDVKPLNDLP
TTTNSVVRTNEDTPKTISLSDFGTYQDVESSSFTSVKITSIASNGELQYN
IATVELPAWQAVTKGQTITREDIEVGKLRFMPDSNESGDNYATLGYQVFD
GTDYSVDSYSMRVDVTPVNDPPTATDDRISTNEDSAKVLGITDFGSYSDV
EGAVLGMVQFTALPADGKLQYNSGTADAPQWKAVTLNQPLTRADIEAGKV
RFMPDGNESGEGYASMSFKVGEVSPNGDVWSEAAYQLTVDVLPINDAPLT
TNSSATTNEDTTLPLTLADFGSYSDVEGSPLATVQITALPTNGILQYNNG
TLENPQWVAVTLNKEITREDIEAGKLRFVPDSNEYWTPYTTVGFKVSDGT
TYSFDNYTLTLNVTPVNDLPTSSDDAVTIKENQVALLSVEDFGSYTDEER
TPLASVTITTLPNNGALQFNSGTPTEPVWKAVSNNQTITRADIDAGNLRF
VPDHNGNGEPYTSFQFTVSDGKGSSEAAYTLKVNVTPYNEEPIAAPEDKL
LLVTINDVSTYSGINSSSIVSVTITELPAEGVFEYFNTTLATPAWQDVTV
NQQITKADIDAGNLRFVPDSNEYSDNYAAFGFTVSDGISINPEIYTIPIA
VTPVNDAPQSTGDAITTPEDTTKVLSLTDFGEYSDVENSALAELKITSLA
TNGKLQYNNGTQWIAVSEGQTITRADIDGGKLRFVPDSNEFGEHYAEVGF
AVSDGSDYSLDAYTLVVEVTPINDLPTATNGIFITNEDTADTLSLEDFGV
YNDVEGAALRALKITTLPNHGILQYNSGTADAPQWSAVSEGQSIARANVE
AGMLRFVPAPNEYGDAYTTISFSVFDGSDYSETPSSITVKVLPVNDAPLS
TNDSISTDEDMPVLLTVADFGAYSDIEQTPLAKVKITMLASNGVVQHHNG
TQWVAVTLNQEISRADIEADIEGGKLRFVPDSNESGEPYATVGFTVSDGT
DYSNEFYSLTVAVRPINDPPISTNDSVVTPEDTPRILGVDDFGTYYDAEN
APLAAVTITTLPNKGLLQYHNGTQWVAVTEGQAISREDIDGGMLRFVPNE
HEQGSPYTSLEFTVNDGVVDSAVYTLTVHVAAMNDAPTLRAFEPSVMLVE
KGGIDNAIDGTATATIAIEKRDADGTASYNHTALENAGWSTSNGGATYSK
AGTYGTATLTIATDSVSYQLDDSRTTTQWLQGGQQVKDSFAIYVQDNATP
PANGSGNAIFTITGANDTPVAAPKAFSVTEDAPLVTGTLSSTDADAGDSA
TYTLNAAVAGLTLNADGSYSFNPSDAAYHYLNNGERQTVVANYTVTDAQG
ATDESTLTITINGRNDVFLSVDDITVNETAGSATFTITRSGDTAVATSVH
YATSDGTAKKTFDYSEVNSTVTFAIGQTSKTVVVPILDDAIFEGSEMFNL
VLSSQPTGTTLSKNFAIGTIKDNEAAISTSSSSNDSGGSGTIDPATPLTI
QLSGKGDISESSDAIFTVTLNRATTEDVTEVALTLGVKGDSAIAGNNKDY
STDYSAYYFIGEGAEQQKIDLSIANNKVQLPIGVEEFFIAIPTKSDKEYE
GAERFTLSASLDNGQSATARSTILDDGSGQVYDEHGIVDATQKGDDDWYL
EIPDVTVNEAAPYAVFRVLASSNISFTMQLEDGGVDPDGNPYESDGIATM
GEDYTNALEIYNGEGWIPYTVGSPINVPKGGSVLLVRVPIKNDDSYEGAH
AFTLVATPSGNREVKRPLGIIGDFGTGAIYNDSGAEDRKADKDDDRQLKI
DSPIVNEGSTYSLFTITGKAGAVTLTIKDDESADTDPADKILEIADKNSL
IQLWNGSSWIPYNGTNAALVDADNNSATTETLLVRVNITKEQEQVGAAIE
REGSETFMLNVVQGSGADKAESFGVSTIRDDGTGVIYLFEKSAGVADKDG
TGVYVPPPSIKDLDDDYDQDGITPTTEEALATLAASQGIGYAKQGDMNGD
GKEDATQNALATLAWTTKEKFDEGNDGTLTDSTAIISIGVAAKEDSTTSE
ISDSLQLVAIEVKKYGEIDGATTVTENKNDKGEVESQTITLVNGSEVTTP
WDPIVFKIQGQDSDYDGVVDADKKLEENSVRDISSRQGTQVKVIIDVRAA
GLTSNDVNAYIKYVSSDVLKHLTLYDLHGKQITKAGWYDFTRLDPTTDND
GAHLIFDEEVEGEVKPLLEIELIITDNQFGDNDHVLGKIHDPGDLVKITK
NAADPSTPIYTADQTPNDVDFYGDTSGTSVPLKTWYNPITGDYFYAPATV
APPYNCYIERTDINAGTVLPVNDPARAYNVHLYLNDAGDTQLAGESSALL
NKGYRDLGAIFASAKAPVLDSSAPTVTAFAPTDDAVDVPLYKDVELIFNE
EVTKGLAGSISLHENTAAGTVVQAQVTFDGQKLIINPDYDLLPNTHYVVT
VDNGAVIDLAGNAYNPSTLAYDFTTGTQGADPYADGSDNGFSTGEVLGGI
AALGFITWLVL
>Cag_1926 conserved hypothetical protein
MKRLTIPTALLTVTIAIASPLYAAAPPTTQELIAQAEATRKEAAAIGYEW
RNTAQLIKQANDALTANNEPEAQKLASAALLEAEQAVKQGKWMQANWQTL
IPTL
>Cag_2015 ATP synthase F1, epsilon subunit
MASSDKGFTIEIVTPQSLYFSGEITSVIAPGIEGQFQVLKNHAPLLAALK
TGKVKLSLANRGEQSFTVADGFLEVSNNKAILLTESVS
>Cag_1072 sialic acid synthase
MIVMAEIRIGSRMVGNGHPVFVIGEIGINHNGSLENAFKLIEGAARAGCD
AVKFQKRTPDLCVPKDQRDIERDTPWGRMKYIDYRYKVEFGKEEYAAIDG
CCKEHGIEWFASCWDEEAVDFMEQFTPPCYKAASASLTDLPLLKKTATTG
RSLIISTGMSTMEEVERTVAELGMEKLLIAHTNSTYPSPIDELNLRMITT
LKALYPEVPIGYSGHEVGLATTWAAVALGATFVERHITLDRAMWGSDQAA
SVELSGLAKLVENIRDIEKALGDGVKRLYEGEAAARKKLRRTS
>Cag_0067 H+-transporting two-sector ATPase, delta (OSCP) subunit
MSVVIASRRYANALLSVVEESNTIDKTLDEMNAISEVLHHSRDLVHALKS
PLISYDKKIHIVEEVFKGRVSETVMFFLKLVGKKNRLGHLPHIVDEFKNL
LDEDRGIINVDITSAVELSDEQANELVATIANMSGKQVRATLTVNEELIA
GAAVKIADTIIDGTVRHQLSKLRSSLVAA
>Cag_2017 hypothetical protein
MKLFMKAKIVAQHATKAALTAQAVKPEEATEVATVTDMGDSAGSFRVILF
NDEEHTFEEVIQQLMIALSCTRSKAERLTWVVHTRGRCMVFAGSLEEALQ
VSAVLEVIALRTEIQSVG
>Cag_1530 hypothetical protein
MEKIESIKTLYQQSHNNLETLYNVLSQKAFKGELDLSRVALVVEEKQKNL
>Cag_0514 dTDP-4-dehydrorhamnose 3,5-epimerase related
MHVIPTTIPEVLMLEPKVFGDERGYFFESFRQDVIEEHIGQVHFVQDNES
KSSYGVLRGLHFQKPPYTQSKLVRALFGKVLDVAVDVRHGSPTFGQHITC
LLDSERKNMLWVPKGFAHGFVVLSPEAVFAYKCDNYYTPSHDAGIAWNDP
ALGIDWQLPLADVRLSSKDAAQPSLSSVDCFPYDAYRQAELYPPLIS
>Cag_1288 S-adenosylmethionine:tRNAribosyltransferase-isomerase
MQVADFDYHLPEERIAKYPPLERGSTRLLVLNRQSGALTHSLYAHLDTFL
QAGDLLLLNNTRVVPARLFATRATGATIELMLLERHHHREQLVLYRGQLK
AGEMLQSHGHTLLVEEVLPQGLARLALADGGDLHQFFTQFGSVPIPPYLK
RNAEAVDRERYQTVFAAHSGSVAAPTASLNMTPELLQRLKNNGVEIAHIT
LHVGLGTFLPIRTESLEEHVMHRESYLLPAESVAALQRVKADGGRVVAVG
TTVTRALEHSAPRILQSSAHAEIEGEADIFIYPGYRFQMVDLLLTNFHAP
RSTVLMLTAACAGTDNLRAAYSEAVLQGYNFLSYGDSMLIC
>Cag_1420 Glyceraldehyde-3-phosphate dehydrogenase, type I
MAKVKVGINGFGRIGRLVFRQAMNNPNYEIVAINDLCDTVTLAHLLKYDS
AHKKFNGTVSVDGSNLLVNGQVVTVCAERDPSALPWKALGCDLVVESTGI
FTSREQAAKHIAAGAKKVIISAPAKDKVDATIVLGVNGECITGKEEVVSN
ASCTTNCLAPMVKVLQDNFGIQKGFMTTVHAYTNDQNILDLPHKDLRRAR
AAALSIIPTSTGAAKAIGEVIPELAGKLDGFAMRVPVPDGSVTDLTAILN
REASKAEINAAMKAASEGAMKGFLEYCEDPIVSQDIVGNPHSCIFDSKLT
MSSGTMVKVVGWYDNELGYSTRVTDLLDIYSKFV
>Cag_0238 trigger factor
MQKTITPLSPTEQELEIILSAEEFGPEYNKELDEAKRTVHIKGFRKGHAP
MGLLKKLAGPSIEIAVAEKLGGKYFTDIITAENIKPANRAQIADFAFTDD
TLTLKLVYEIHPEFELQDFSGYTFTKANYVVGDKEVEREIELILKSHGTM
VTSDEPASEKDTVIGDALRFNDEGELDEASKVANHHFSMEYLPAENPFRI
ALLGKKAGDVVEVVNTPIKEDEEGEEATAEDAEAVKPTRFQVTVTEVKRL
ELPELDDELVKEISQERFDNVADFKADVRLQLEQHFTMKSDNDLLEAISG
KMVEENPVPVPNSMVESFQDMLLENAKRQLSGNFPKGFDVTGFRASMRDN
AEKHARWMLITQKVAEMNKLTVTDEDIVAFAEKEAGKNEALDIKQIIETY
KSPDFHDYIADSIMKEKVYNAITEKVTVTEEETPVPEHRM
>Cag_0196 conserved hypothetical protein
MQQPTPQQSLTTMPVEGRLFALALCSLLIVLTTTVPYLTLINVLFFSGIF
WSGFIALHQTILRYQVPLSLRNAFVLGSLAGFVGGLASELLGIILMVLFD
YRPGIESLSLIVEWATQQAMQNPELQEQVNMLQEAEKLAKTPITLGITDV
LFNLAVTGMVYAPIAGLGGMFAVRWLKFQAARK
>Cag_1023 Glycosyltransferase-like
MFQQRKPRLLWANLYCLLDSSSGASISVREMLRQLAYNGYEVEVIGATIF
DAVSGMSALPPQWKKRLETTDILELNDAPLRHKLLMTNSHQRDAVTALEE
AKWYEFYLHTLNTFKPDVVWFYGGRPFDYLISDEAKHRGIPVAAYLVNGN
YTKTRWCRDVDCIITDTQATADYYHRKNGLTLTPVGKFIDPKMVVAAEHL
RRNVLFVNPTFEKGAALVVQIALQLEQLRPDIQLEVVESRGSWRGMVEYV
SARLGKPRTGLSNVQVMPHSRNMRPLYSRARMVLAPSLWWESGSRVLAEA
MLNAIPALVTDNGGNREMVGEGGIAIALPANYHAKPYIELLTSELLEQFV
AQIICCYDDEQFYQTLVAQATLYGCTTHHISTSTQKLLKVFGKLIASSSK
ELSYK
>Cag_1542 conserved hypothetical protein
MEKIDVNVYGNSFPLRSARRELTEKAARDVDGVMRLFAEKAPTFGEAKLA
VLAAIQFAERKIELEEEIGALRQQLGRLNLFIGQHVE
>Cag_0520 conserved hypothetical protein
MAWFLRCLILISMKIFRWNTEKNELLAKDRGITFEEIVEIIESGAKIIEV
DHPNKKKYPNQRILIVDVRGYAYMVPFVKDGNEYFLKTIIPSRKATKKHL
GG
>Cag_0154 Excinuclease ABC, A subunit
MSFSHISIRGARVHNLKNISLDIPRNQFVVITGLSGSGKSSLAFDTIYAE
GQRRFMETLSPYARQYIGNIERPDVDFIEGLSPVIAIDQKSTSRSPRSTV
GTITEIHDFIRLLYAKAGRRYNPETGAMVQAQSADNILATILALPEGSKV
QILSPLVTGRKGHYRELFERLRSKGFLRVRVDGELQEMVPNMQLERYKSH
TIELVVDRLVLAPESEARVREAVMLAISISEHKSSVICTPFEGGFTELAF
TLSKGDNEDALPTSTLAPNHFSFNSPYGACPTCNGLGELMQLSGELMIPD
PSLSLNQGGLDPFGKAGKRNHWQVIRAIAKEFDFTLDTPMSKIPKSALKI
LLNGSGKRTFEVAYTSSGHTSLYPQPFQGAVAYVQEILNNATTSKVREWA
EAYMLHQPCPVCLGARLKPESLQVKIHGLNIAELEALPLPETLAFFNNLP
PNLSQKELIIATPVLHEITKRLQFLLDVGLGYLSLDRSSHTLSGGEAQRI
RLASQLGSQLSGVLYVLDEPSIGLHQRDNHKLITSLKHLRDLGNTVLVVE
HDKDTMLEADTIVDLGPGAGAYGGEIVAFGAARELDPSSLTAGYLNGTNR
VFYASEASSEKTDADADATPLFLTLKGCKGNNLKNIDAQIPLRKLVSITG
VSGSGKSTLINETLYPILARHFYRSKVVTAPFDAIEGIELLDKVVNVDQS
PIGRTPRSNPATYTGAFTFIRDFFTRLPEAQIRGYKAGRFSFNVKGGRCE
VCQGAGTRKIEMNFLPDVYVQCENCKGERYNRETLMVKYRGKSIADVLEM
SITEAAEFFTDFPRIRRILNTMQSVGLGYLKLGQPSPMLSGGEAQRIKLS
AELAKIQTGKTLYILDEPTTGLHFQDTQHLLEVLRKLVEKGNSVIIIEHN
LDIIKNSDWVIDLGAEGGFEGGTIIAEGTPQQIADTPHSHTGRFLKMEMG
G
>Cag_0784 Dehydroquinase, class II
MGASAFSSTAEAPFCVFTPLTTMNNLSLLVMNGPNLSRLGKREPTIYGTR
TLDDINHDLATAFPSIRFDFFQSEYEGALLEKLFQTEDEKSCDGVVLNAG
AFTHYSIALRDAISAITIPVVEVHLSNVHTREEFRHKSVISAVCVGVISG
FGEQSYHLGVHALMALPQLQRRG
>Cag_0070 Arginyl-tRNA synthetase, class Ic
MIDFFRSAIAAALASAQLNTAKPIQLEQPADKKFGDFSTNIAMLAAKECG
KKPRDLAQEIINHLAFPPDTVAKIEIAGAGFINFYLTPRFIMRSVEQVLR
DGEKFGQSTEGNGKKAIVEYVSANPTGPLTIGRGRGGVVGDCIANLLATQ
GYEVTREYYFNDAGRQMQILGASVRFRYLELCGNTIEFPEDHYQGDYIRD
IAATLYEQHGNALEKSEELEPFKKAAEELIFKSIKATLERLNIRHDSFFN
EHKLYLANNNSPSANTAVIQSLSSSNFIDDYDGATWFLTTKLGQEKDKVL
IKSTGEPSYRLPDIAYHVNKFQRGFDLMVNVFGADHIDEYPDVLEALKIL
GYDTTKVHVAINQFVTTTVNGQTVKMSTRKGNADLLDDLIADVGADATRL
FFITRSKDSHLNFDVELAKRQSKENPVFYLQYAHARICSLLRLAKQEVGF
DADTYGFADVLQVLDGAAEVQLGFALLNFPAMIQSSLRLLEPQKMVEYLH
ALAEHFHRFYQESPILKAEPEVRTARLLLSVATRQVLRNGFTILGISAPE
AM
>Cag_0779 methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase
MIVIDGKKVSLELKNELRERVEALNAKCGKVPGLTVIIVGEDPASQVYVR
NKAKSCKEIGMNSTVIELPADTTQEHLLKIINDLNNDDTVHGILVQQPMP
KQIDEFTITLAIDPAKDVDGFHPENLGRLVMGHLDKCFVSCTPYGILELL
GRYNIETKGKHCVVVGRSNIVGKPMANLMLQKLKESNCTVTICHSATPNI
TEFTKQADILIAAIGKANFITADMVKAGAVVIDVGINRVDDASAKNGYRL
VGDVEYAGVSALASAITPVPGGVGPMTIAMLLKNTVQSFQRVNGL
>Cag_1584 heterodisulfide reductase, subunit A
MSVETIVIVGGGISGITTAVEAAEVGYNVILVEKNAYLGGRVAQLNKYFP
KLCPPYCGLEMNFRRIKLNPKITVYTLTEVENVSGKEGDYSIKLKVNPRY
VNEKCTACNACAEVCPAERSNDFNFGMNKSKAIYLPHELAYPTKYVIDRK
ACAQSCDKCVKACVYNAIDLTMKPETVEVKAGSIVYATGWNPYDATKMQN
LGFGRVKNVITNMMMERLAAPNGPTGGKIVRPSDGREVKKVVFVQCAGSR
DQNHLNYCSAICCMASLKQATYIRDRYPDADIMIAYIDLRTPGKYEAFLN
KVENDKRIRLVKGKVAQIEEDRATGNVILTSEDVEGGGKSTYEADMVVLA
TGMAPSVSDHPMLAFEQNGFIQGGKAAGIYSTGVAKRPSDVTTSLQDATG
VALKSIQSLVRS
>Cag_0301 TPR repeat
MNMLQPPVVVLMTDFGITDTFIGQMKGVILSLCPIAQLIDLTHAVLPQNV
VQGAFLLGKSLPFLPDGSVVVAVVDPGVGSTRRIIAVQTSRHTFLAPDNG
LLTPMLASGDVQQCVSVTNERYMLPQRSSTFHGRDIFSPVAAHLAAGVPL
AELGKSMPMAECVQLEVLRANVLDNGNCIESTILYTDHFGNAVTTIEREL
LAEKHDWLIHVNELRLPLSTTYSDVAEHQPIAYIGSSGTLEIAIRNGNAA
AALGLHAGVAVRMERGEWEVESGMWEEVVQAIPDLLKQGVTLHQSGKHNE
AEACYQQILKQQPHHIDALHLLGVLFYHKKEYSKALDLLNQAIALKPTFT
EAYSNRGAVLKELKRFDEALASYNKALELKENYAAAWYNRANLLKEWKQF
SEAIESYNKAIEFQPNYPEAYSNRGVVLKELKQFDAAFASYNQAIALKPT
YVEAYSNKGTVLKELKQLDAAIESFNKAIALKPDYAEVQWNKSLVLLLSG
NFIDGWMLYEWRWKKADFTSPKRNFTQPLWLGKESLEHKTILLHSEQGLG
DTLQFCRYATLVAKRGARVILEVPDILIPLLKQLEGVEQIIAKGKKIPPF
DYHTPLLSLPLAFTTRLENIPSPSKYLFIDNNKIEEWKQRLHTIPHPRIG
LVWSGRAEHKNDHNRSIALADLLRYLPNKYHYVSLQKEVRDSDKKTLDVT
SNMVHFGNELHDFADTAALCELMDLVISVDTSVAHLSASLGKPTWILLPF
IPDWRWLLDRNDTPWYASATLYRQHTRDDWESVLKNIATDLYDYFTTDNK
VAVSHKATKAIQALLKEAIKLHQSGKQNEAAICYKNIIQLQPNHVDALHL
LGVVAFQKEQYNEALNLLNQAIALNTDFASAYFNRGLVFKNLYHFDKALE
DFDRALRLKPNYAEAYHKRGNILKELGLITAALSSYNNALALKADYAGVY
LDKAIILLLLGNFADGWDIYEWRWKCKDLPLVQRNFTQPLWLGQKDIQSK
TILLHSEQGLGDTIQFCRYTQLVAERGALVILEVPASLASLMQSLEGVTE
IVVKGKKLPPFDCHCPLLSLPLACNTTLENIPSPSKYLSSNTKKRNKWKD
RLQAIPQPRIGLVWSGSTQHKNDRNRSIELSELLQYLPDAYHYISLQKEL
RESDKATVEATSNIVHFGDALHDFADTAALCELVDIVISVDTSVAHLSAA
LGKPTWILLPYIPDWRWLLDRNDTPWYASATLYRQAHRDDWLSVFKRLQE
DLQQRCMVESAQQQLPIHTSQNNHIATLLQQGVQLHKSGKQNEAELCYQK
ILQLQPNHADALHLLGVLSFQKENYSQSLELLNQAIAIKSDFASAYFNRG
LVLKNLSQFEKAIEDFNKAIEQKPEYASAYHSRGTVQKELKQFDAALKSY
EKAIALKPDYTEAYCNRGNALQLLKRFNEAIDSYNKAIALKPQYAEAYSN
RGVVFRELKELDTSLDNFNKAIELKADYAEAYSNRGVVFRELKMLDNALA
DFNKAIELKKDYAIAYWNKSLVLLLLGNFAEGWQCYEWRWKKADFTSPKR
NFTQPLWLGEESIADKTILLHSEQGLGDTIQFCRYAPLVAELGARVVLEI
PSSLALLLQPLDGVAEVIVKGKTLPPFDYHCPLLSLPLAFKTTLETIPFP
TKYLSLPSHKIKQWQQRIGNIAKPRIGLVWSGSTKHKNDHNRSIELSKLL
EYLPDHYHYISLQKELRESDKATLEATANMVHFGDELHDFTDTAALCELV
ELVISVDTSVAHLAAALGKPTWILLPFIPDWRWLLDRNDTPWYASATLYR
QHTRDDWTAALERLHEDLRQRFLVSEM
>Cag_1129 hypothetical protein
MSNQEIVEQARTLQPMDRLWIIEQLLQSLDEPDATIAEIWAEEAEKRLEA
YRNGTLEAIPMENIFHD
>Cag_1364 hypothetical protein
MFNTIDIDRNKLTIMGVKFSDLQTLESTANAIGSNMFEGYKPTPKGIEII
RDYVIGKITLTELIKLAKEKAYI
>Cag_0224 DNA polymerase III, delta subunit
MEKLKKAIQSKKIAPIYFFTGSESYLKEEFATLIQGALFASEEDAVANTH
LLHGHDMTLRELLSRASEYPMFTERQLLVVRHFEKIKKPTTKEQQKQQYA
AFGNYLANPATFTVLLLDADELDKSDFEKQPFSLLKSVRHDFPAIKHPDL
FASERAAQAGWEFEPDALKAFAAYIDPSSREICQELDKIILYASERQSAK
RITAADVLDCVGVSRTYNVFELEKALVARNLRLCSGMSLMIMDQEGQKEG
LMAIVRYLTTFYMRLWKISMPEVQRMAQSDIAKVLGMSPRQEFLIKSYLT
YTRQFSLQQTEAALCALRDVDASLKGLRPYSDEKYLLLQLMQRLLG
>Cag_1654 ATPase
MRAVPPEIVPIPTLPGEPEMVLHTGVREKEADLDERKLKVHTKDRDAYRF
PSIDLLEKVHDDDEAIDEQHLAESKRRLLEKLRIYKIEVKRIATTVGPRV
TLFELELEPDVKVSRVTSLENDLAMALAARGIRIIAPIPGKNAVGVEIPN
AKPKTVWLRSVLQVEKFKSNNMILPIVLGKTIANEVYIADLATMPHLLIA
GATGAGKSVCINVIISSLLYACSPDKVKFVMVDPKRVELFQYQHLKNHFL
MRFPGIEEQIITDPQKAVFALRCVVKEMEMRYEALEKGSVRNIGDFNRRY
PDDALPYLVVVIDELADLMITAGREVEEPIIRIAQLARAVGIHLIVATQR
PSVDVITGIIKANFPARIAFQVASRVDSRTILDGSGAEQLLGNGDMLYQP
SNQPKPIRIQSPYVSSSEVEEITSFIGAQHALKNIYVLPMPDVSKNSNMQ
GGSAQDRDGRDAMFEEAARLVVMHQQASVSLLQRRLRLGFSRAGRVMDQL
EANGIVGEADGSKSRDVLIKNEDSLELLLRNLD
>Cag_1116 DEAD/DEAH box helicase-like
MSEQQLPLENNFFSLQLPELLMKALEEVGYESPTPIQAQTIPFLLAGRDV
LGQAQTGTGKTAAFALPILASIDIQQAEPQALVLAPTRELAIQVAEAFQR
YAEYLKGFHVVPIYGGQDYGIQFRMLRRGVQVVVGTPGRVMDHIRRGSLN
LTHLKTLVLDEADEMLRMGFIDDVEWILEQTPAGRQVALFSATMPPPIRR
IAQKYLDQPAEVTIQTKTTTVDTIRQRYWVVGGSHKLDILTRILEVEPFD
GMIIFSRTKTMTIELAEKLQARGYAAAALNGDMPQNQRERTIEQLKNGNI
NIVVATDVAARGLDVERISHVVNYDIPSDTESYVHRIGRTGRAGRAGDAI
LFVAPREKNMLYAIEKATRSRIEQMVLPTTEVINNKRIAKFNQRISDTIA
AEDLGFFTRMIEQYCNEHNVPMLDAAAALASLVQGETPLLLADKPERSRS
SERDSYGSSRDRGFEREGRDSRSGGREGRSDRFERDGRSGRDDRGGRDER
SAPRKRGRSEVYGEEPKDRYRLEVGSTHGVKAGNILGAILNEAGLAPESV
GHISISDTYTTIELPKQMPDTMFHELRKIRVCGRQLRLSRMEEHEGGHST
HSSHGTGAYGGGAKKSFRKPNKSANDEGEFFAGFKKKRKG
>Cag_1587 ATP-sulfurylase
MSLVNPHGKEKVLKPLLLTGEELTAEKARAQSFAQVRLSSRETGDLIMLG
IGGFTPLTGFMGHDDWKGSVQDCRMADGTFWPIPITLSTSKEKADELSIG
QEVALVDDESGELMGSMVIEEKYSIDKAFECQEVFKTTDPEHPGVLMVMN
QGDVNLAGRVKVFSEGTFPTEFAGIYMTPAETRKMFEANGWSTVAAFQTR
NPMHRSHEYLVKIAIEVCDGVLIHQLLGKLKPGDIPADVRKECINALMEK
YFVKGTCIQGGYPLDMRYAGPREALLHALFRQNFGCSHLIVGRDHAGVGD
YYGPFDAHHIFDQIPADALETKPLKIDWTFYCYKCDGMASMKTCPHTAED
RLNLSGTKLRKMLSEGEQVPEHFSRPEVLEILQRYYASLTQKVDIKLHSH
AVGK
>Cag_1721 conserved hypothetical protein
MKPLPVGIQTFSEIIKQDYLYIDKTSLANELIKRYKYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKELLIYKQWNWNVTHPVIKISFSGGIRDKESLRD
NLFYILKDNQERLNINCEEKNNQNLCFAELIKKVYQKYQQKVVILIDEYD
KPILDNIENIPEALIVRDGMRDFYSKIKESDEYLRFVFLTGVTKFSKVSL
FSGLNNLEDISLNPDFGNVCGYTQHDVDTIFAPYFEGVDMEEVKRWYNGY
NFLEDKVYNPFDILLFIKNQRMFKNYWFETGTPRFLIELIKKNNYFIPKL
NKLKVNESLVNSFNLENLNLETILFQAGYLTIKRLLPSGMGVGYELGFPN
KEVQISFNDYILQVMTIVSDKEPIRYELFDIINNGDVANLEPIITRLFAS
IAYNNFTNNYIESYEGFYASILYAYFASLGFDIIAEDLTNNGRIDLTLKN
YEKTYLFEFKVSNQEPLEQIKKMKYYEKYDGERYLIGIVFDPKARNVSQF
VWEKV
>Cag_0909 conserved hypothetical protein
MTITFSQKARLQIEENVRFIAADKPNAARKWAAGVKQAVYKLKEFPYLGR
QVPEYANDTLRELIYGEYRIVYQVNIELSRIEVLSLFHSKQLL
>Cag_1828 Ribosomal protein L36
MKIYSSIKKRCEHCRIVKRKGKRYVICKVNPSHKQRQG
>Cag_0015 conserved hypothetical protein
MQRDEILSILSNHKAEFSERYRFKKLGIFGSVARNSASTTSDIDIVVDME
PNILLRAQFKEELEQLLGSKIDLIRYWAKMNAYLKARIDQEAYYV
>Cag_0459 transcription termination factor Rho
MANNPVVKGLDINLLQRKKVAELHALAKELGVMAAGLRKEELIFKIIEAQ
AQRTNDPDSATVMVNTGVLQVIPEGYGFLRSANYNYLSSPDDIYVSPSQI
KRFNMRTGDTVSGQVRAPKEGERFFALLKINTIDGNDPEITRERPFFENL
TPLFPTQRLRLETRQSEMCGRIMDIYTPVGKGQRGLIVAQPKTGKTMLLQ
MIANAIIQNHPEVYLIVLLIDERPEEVTDMARSVAAEVVSSTFDEDPERH
VLVADMVLEKAKRLVEVGRDVVVLLDSITRLARAHNTIIPHSGKILSGGI
DANALTKPKRFFGAARNIEEGGSLTIIATALVDTGSRMDDVIFEEFKGTG
NMELVLDRRLSERRIFPSIDILRSGTRREELLFSQEELSRTWLLRKYLAD
KNPVECMEFMREKMADTKDNRDFFKYMNA
>Cag_0305 conserved hypothetical protein
MEIIFRPEAEAELFEAQAWYESRSQGLGIEFAQAVTVAVESVLRMPFAYP
RQFAT
>Cag_0608 hypothetical protein
MAYLRSGISLIIAGVSIMHFSHQFWYWIIGIACIPTGIVTGFFGVWRYIT
ISKSITIVRRELPLADQREAEQMK
>Cag_1695 reverse transcriptase family protein
MKRKGKLVEQIADLHNLYEAFYKAQKGKQAKRYVCAYRKQLQENLQLLRH
QILSGAIQTGKYHAFTIYDPKERVICATPFSQRVLHHAIMNVCHPFFEKH
QIAGSFASRKGKGTYAALDKAREYNCCYRWFLKLDVRKYFDSINHTVLQK
QLTRLFKDKTLLLIFEQIIDSYSTADHKGVPIGNLTSQYFANHYLSVADH
YAKEGLRVPAYVRYMDDMVLWHNEKEELLAMGYMFQTFIAKELLLELKPF
CLNATHKGLPFLGYLLFENQARLAPRSKKRFLAKYQRYENNLQSGVWTQQ
EFAKHALPLFAFTEYAQAREFRKKSLHSFCSLEGVFVRSSKGID
>Cag_0263 Alpha amylase, catalytic subdomain
MTPLSTQLLSKEHFYVNRQCRQLFAVTDPAFLSLPAGGQESQQVSERQST
IINSFLARQATSGSPKAVLPAQLNGMKLLHELFHAVMVKSTLLRQPTFFP
NLVEKLPEIFAPDEPETYLTSFVTAFPTQALYKGDITATEWLEGEGRQSN
VLEESFLVWLNNLNPALQPFNALLSDELLRTDHYRRLIAAMRNAMQEMGP
IKPYNLDLAELLFSPIKHAPASILDQLRYIRLNWAVLLEDSPFWDLLDEA
IAFIEDEDKYLFFEQVAREHAHRKDGRFFEKEAHVPHYSIDDANAPANYS
PDSTWMPEVVMLAKSTYVWLDQLSKRYKRHIARLQDIPDEELDAIAERGF
TALWLIGLWQRSFASQKIKQLQGNPEAKSSAYALEKYDIADDLGGYDGYH
NLMERARQRGIRLASDMVPNHTAMDSELVRNNPDWFLSAHTPPYYNYSYN
GPNLSSDSRYGIYLEDGYWNRSDAAVTFKRVDYQTGDTRYIYHGNDGTNM
PWNDTAQLNFLSAEVREGVIQQVLHVARMFPIIRFDAAMVLAKRHIQRLW
FPLHGHSAAVPSRSAYAMSMEEFNAAIPEEFWREVVDRVKREVPDTLLLA
EAFWMLEGYFVRTLGMHRVYNSAFMHMLKKEDNADYRYLIKNTLEFDPQI
LKRYVNFMNNPDEDTAIAQFGRGDKYFGVCMVMITMPGLPMFGHGQVEGF
TEKYGMEYAKAYYDEQPDDHLVWRHYHEIFPIMKMRPLFAEVHNFYLYDV
YSPDGNVNENILAYSNRRGNEGALMVYNNRFEQAAGWVKMSVGYLQNGSI
RQSSLVEGLSLSTDNNRYVIFRSHPDNLEYCRSCRELADNGMFVALDGYR
YNLFLDFREVTPSRLRPYDKLAATLNGSGVHSIEFAALTMSLAPLHQTLT
EALTPAAPYADATSFATTLHTLLESMAQQYSELIEQAIEVPASLAIEYAA
RYEQAITLGQEIVAQSGTRKLCSALGLESLTAYPTLVAQWLMLDAIQTML
QASNVLHTTLFDEWLLEHALQALHPESGMVGLPGNNLPNLLAALLAPQPT
EAAEDATLEEHLLARIATLHNTQRHHIGRLMQYEPRHHKIWFREHRFSTL
LAWLLVQELLTNTNHEPTTWLEALDQLDMRSFLAGYEITALLKG
>Cag_1276 Dihydrolipoamide dehydrogenase
MTLSFDIMVIGGGPGGYIASIRAAQLGMSVACCEYNPYDDPAGEPRLGGT
CLNAGCIPLKALVASSEAYEHATTHFAAHGISVSGLSMDVTKMQKRKESI
VRRMTAGIQFLFKKNNVTLLKGCGSFVGKSEDGYRLLVKGKEGETEVTAR
NVIIATGSTPRHLPNITVDNLTICDNVGALKLSDVPKRLAIIGAGVIGLE
VGSVWRRLGAEVTLLEMLPTLLPFADESIATEAAKLFKKQGLNIVTGVSI
SEIKHEATGVSIAYSDKDGAPQSVECDKLMLSIGRTPNTNGLNLEAIGLQ
PDARGFIPVNEQCATTIHGIYAIGDVVRGPMLAHKAEDEGVMVAEVIAGQ
NAHINYNAIPSVIYTSPEMAWVGQSEQQLRAEGRAYKAGQFPYGANGRAQ
GLGAPEGFIKMIADADSDKILAVHCIGASASELIGEATLAMEMGITAAEL
ARTCHAHPTLSEVMREAALAVHGRALNV
>Cag_1127 Acetyl-CoA carboxylase alpha subunit-like
MNPQPPLEHHFYLPFEKREGFSYSHEAIEQLSEYEKYQLSFHPERPRFLD
YLTVFNNTEACLENNEHGSCIIQTHRATLQGLDGKLVPVMLIGQQSAPTS
NFELLRKLMQNPEAIAEWNQGMPTPAAFEKAIQAIALAEAEGRLIMTVID
TAGADPTEQSEGGGIAWKIGRCMQALAEAKVPTLSVIINRGCSGGAIALT
GCDAVLAMEYATYMVISPEACSSILFRTRDQAGRAAEISQITAREGLKHG
IIDALITEPEGPAHRFREAALQSFKNVVSEWTSQLSRRSAHEIQEKRIER
WAAIGQWDEISEEAIQCFEKKRTGWLPAPNEGLFVARHRRCRDHANKQVI
DPVLYRKLVADNFVCETCGHRYTRLTIHDYLALLVDDGSFSEHPETRYIV
DRDILNFPDYRQKITEAQQKCGIASSVITGNGTVEGNNVVLCATSFAFLG
GSFCMSTAEKIWRASALAIERRVPLIIQATGGGARMHEGCSSMVGLPKVH
VALSRVEQAGLPVITVVTDPTLGGVAIGVGSRGIRLFEHNAGNIGFSGKR
VIEQYTGKPTSKDFQTTWWLQQKGFVQHTARPQEMKTAIGKIIEEVRR
>Cag_0475 Glycosyl transferase, family 19
MPKKLFVLAGEVSGDIHAAGVVAQLLQAHSNVTVFGVGGAHLKKLGATLL
YDTAQMSIMGIVEVVKHAGFLRRVIRELKAAIEREKPFAALLVDYPGMNL
HMAAFLHNLGIPVIYYVAPQAWAWKEGRVKTIRATVDRLLVIFDFEVEFF
RRHGIQTEFVGHPVIEELAGLAVPSRQDILQRHALSPDTRLIGLLPGSRK
QEIAYIFPAMLEAARKVSQTHKVAFLFGRAPNLKADHFRLLEEYGDLTII
ECGAHGVMHASELLLVTSGTATFEALCFGAPMIVLYKTNALNYFIGKRLV
KLHNISLANIVAEGLLSNSRTVPELLQDEATPEAIYQQVSTLLHDGKTLA
AMRAKLLMARAKLASVEPSKRVAAVIAEYL
>Cag_0391 Peptidase M41, FtsH
MAEKPTKKSSPNNSRNPFKPVDDDNGGGMGNSGNGSPLPRFPRMLIIVMI
GMLVLFTGQRFFGTAANPEISYNEYKSLLERSLIAEITISSGEERSTLLN
GRLTAPTKLQLVNQALQQSDRFSVRVPSVSLEQTDALAAKGIRVKVEENS
GGLKTFLILFAPWLIFGLIYFFVMRNMNGANNAQAKNMFNFGKSRAKMAS
EFDVKVTFKDVAGVDEAIEELKETVEFLVNPEKFQKIGGKIPKGVLLLGP
PGTGKTLLAKAIAGEAKVPFFSMSGADFVEMFVGVGASRVRDLFEQAKKN
APCIIFIDEIDAVGRSRGAGLGGGHDEREQTLNQLLVEMDGFGTTDNVIL
IAATNRPDVLDSALLRPGRFDRQITIDKPDIRGREAILAIHTQKTPLDES
VTLTVLAKSTPGFSGADLANLVNEAALLAARQEAERITATHFEQARDRIL
MGPERRSIYISDEQKKLTAYHEAGHVLVALFTPGSDPVHKVTIIPRGRSL
GLTSYLPLEDRYTQNREYLVAMISYALGGRAAEELIFNEVSTGASNDIER
ATDIARRMVRQWGMSEKLGPVNYDSGTHREVFLGKDYSHVREYSETTALH
IDNEVHAIISGCMEQAKTILTTKQELLHRLALQLIEKESLSAAEIAELTG
TELPTSTPTLKQ
>Cag_0433 Pyruvoyl-dependent arginine decarboxylase
MSFVPTRVFFTKGTGRHKEYLSSFELALRDAKIEKCNLVTVSSIFPPHCK
RISVEEGLRSLAPGQITFAVMARNSTNEFNRLIAASVGVAIPADETQYGY
LSEHHPNGESAEQAGEYAEDLAATMLATTLGIEFDPNKDWDEREGIYKMS
HKIINSYSITESAEGENGMWTTVISCAVLLP
>Cag_0714 hypothetical protein
MATETQRAFDSELDSEAAELHLAQLIHEAGEEVRRCWAKRQELHMEKLHA
TVAESQATLNKLLQNDRC
>Cag_1889 sigma-24 (FecI-like)
MYLKVAFSFGRYSMEMDAVMTKKTITTTKTATSSPPKALSAAEQKRQLEF
QREAIVHLNSLYNYALHLTMNADDAEDLVQETYLKAYRFFDSFEKGTNCK
AWLFKILKNNYINRFRKDTREPGFVDYDAIKDFYHTIKDVQSDTSETESD
YFHSLLHEEVYQALKSLPEEFREVIQLCDIEGFTYEEIANMVESPIGTVR
SRLYRGRKLLHEQLKEYAEKHGWHMNDAPNNPA
>Cag_1929 Ferrous iron transport protein B
MTDVKEIRVALAGNPNCGKTSLFNVLTGANQRVGNFAGVTVEKHEGYLEH
KGYKITVVDLPGIYSLTPYSPEEIVTREYLIGETPDVVVNVVEGPNLERN
LLLTTQLMELEVDFLVALNMMDEVEAKGIVIEEKQLQVLLGCHIIPVSAR
KKSGIDSLLDHIVRVYQKNITIQKNKLFFRPELEERVEALATILARQPEL
STYHPRWLATKLLENDRVVYHEVQSYPVWLKVERLLQEAMQATERLFEAD
PELLITEDRHAFIRGAMQECVRYPTVKGKSVTDYLDHIVLNRVLGLPIFL
LVVWAIFQLTFTLGSPLMDGLEALFALLGEAIQPYLTNKMLRSIVLDGIL
AGVGGVLVFLPNIVLLFLGLSFLEASGYMARAAFVIDRVMHRFGLHGKSF
IPLITGFGCSIPAMMATRTLKSHSDRLATLMIIPFMSCGAKLPVYVLLTA
AFFPPEMAANVMFGIYMFGVFIGLLTALVMKSTVLQSDSEPFVMELPPYR
WPTFSSVFFQAKIKAFMYLKKAGTLILGAVVLIWVVSNYPKSAALEQQLA
TESARIEASTIGAEAKATELQTLEAAITSQQLEYSIAGRVGKVIEPLIKP
LGFDWRIGIALVTGIAAKEVVVSTLGTIYSLGDSEDGAAELSAVLARDPA
FNKATALSLMVFVLLYIPCVAAIGVMKKEVGRWKPVVLYGTYSIAVAWVM
AFLTYRIALMVG
>Cag_0395 cytochrome b-c complex, cytochrome b subunit
MAENNQNAAAGAAPAKPKPATPSAAKPAAPSAAKPTAAKPAAAKPAAASA
PAGGGVYKPPVDRGTPNPYKDSRVGALSAWFQERFYAVIPIIDYLKKKEV
PQHRLSFWYYFGGLTLFFFLIQVVTGLLLLQYYKPTETEAFKSFIFIQQE
VPYGWLIRQVHAWSANLMALMAFIHMFSTFFMKSYRKPRELMWVSGFVLL
VLTLGFGFTGYLLPWNELAFFATQIGTEVPKVAPGGAFLVTILRGGEEVS
GETLTRMFSMHVVLLPGILLLALSAHLMLVQVLGTSAPIGYKEAGLIKGY
EKFFPTFLAKDAIGWMIGFFLLIYLSVVFPWEVGVQADPLASAPMGIKPE
WYFWAQFQLLKDFKFEGGELLAIVLFTVGAIVWVLVPFIDSKASREERSP
LFTMLGLLVLAFLVINSYRVFLEYGW
>Cag_1821 mannose-1-phosphate guanylyltransferase, putative
MMSTFRQHVYAVVMGGGFGKKLWPVSKRKRPKQFIDLFNDGTMILKTLQR
IAGLVPEENILVITSALGKQLLLELSPHFQESNIIVEPACRNTAPCIALA
SAHIKKRDPEALTIILPADHLVRDSDAFELIMQAALLQAQHSMGLVTLGI
MPTRPETAYGYIQATESLPMPEGFGVDDRFKLFAVKAFAEKPDYATALNF
LETRDFYWNSGIFVWHIKAIWQEFQRSMPDLYHDFLTIYNHLGTQSEQKI
IEDVYSWIHPCSIDRGIMEKAERVFVLTGEFGWTDLGCWDEVLHVAGDLP
VLVAEQEGGHHMEIACENLFVKTMPDKVIATIGVKDVMIIETDKALLVCH
KGQSHRVREIVDMLRRAGLEDYL
>Cag_0648 Periplasmic protein involved in polysaccharide export-like
MRQHRLFLHLFFVGSLALLLNGCASYRNLPAGSHQHITAKPNSTISREAV
VIADPNSSQDSSYKPSDYRVGPNDVLYVNVSGKPEFIGTAGGSSSFKGYR
VDGRGYIYLPIAGKVSVAGLPMYEVRQKIQEVMRRYFNDPWVVVEIAEYR
TRQIFVFGAVKKPGPQVIPSLGINIAQALASADVQNSGQNYKKIRIIRSL
SPTEGELLVVDFDSVLHGRSLPMQLQEGDIVYVPKSTLASWNTTISELLP
SLQAFSAVLQPFVNIKYLQD
>Cag_1011 L-threonine-O-3-phosphate decarboxylase
MNNLFCHKHGGAPIKSNQPDFSVNISPIMPPIGSLSLSDFALHNYPSIDG
SGIRDFYVARFGLDGDTVLPTNGAIEAIYLVPRALGVRRVLTLTPSFHDY
ERASLVAGATVSHLPLDAETGFALPSLEQFATALQEAEADALFVGNPNNP
TGTAIEPEVIMALASRFPWMWFIVDEAFIQFTANFPHNSLMKQVSALRNI
IVVHSLTKFYALPGLRLGAVVAHPDVINRLLDVKEPWTVNAIAEEVARRL
LHCSDYDAEVRSIIAAERLRMSAQFARSGEIVLVGGAANFFLARWIGGCS
LDVLFARLAERNLYVRDCRNFRGLEANYFRFAIRTPAENERLLEALLPAI
RPTTATMRTTAYQRNFSRTPFC
>Cag_1969 Recombination protein O, RecO
MYRIESRGAVYNTIQQYIKVHSVIVKTRAVVLRETNFRDQSRICSLYSRD
FGRLSVIIKGARNPKNRLCGLFSAGNILDVVIYRKSGRELQLASDATLVA
SPLMAEPDMERFAALYRIIDVVKQATAEHEHNPQLFTLLAATLQSLYQQG
SNNLLLTAWFLLRLVSLLGFQPSLRQCVFSNHQLATEVVAMKLSELLFVM
NPGGLALPAAGGISVGKQWRVPVALALQIAPLAEARTPADISLQVEDAEL
ELLCAILYDYCAIHLEHTPKRRHLAITAQLAEA
>Cag_0038 hypothetical protein
MTTAKLPPNQPDDYDSPWKEAIEHYFPEFMALYFPEAYAAIDWSKGYHFL
DQELRTIVPEAKTRKQVVDKLVQVQLLDGKESWLYIHIEVQGNRESGFPK
RLFIYNYRTYDKYDKAVASFVILADSDPTWRPSSYSYEFVGCKMTFAFET
VKLLDFEPRMEELLASDNVFGLITAAHLLTQKTKNKVKQRYEAKLLLMQL
LLQRQWEQARIDELLRVIDWFLRLPKELRQKLKIEIHKMEEAKKMKYVTS
FERDAKEEGIVIGIEKGMEKGREIGVLEGMEKGKAEGLEEGLMQGRLEVA
RRLVSGGMSKAEAALLAGVSVEML
>Cag_1569 hypothetical protein
MNPRVKSVLVLNDYKLSLVFTNGERGIYDCSEFIEFGVFKEFKKYGYFQL
AKVEHGTVIWPHEQDICPDTLYLDSEKVSAEG
>Cag_0172 conserved hypothetical protein
MRIQRRQFEIIQEQAFRELPYECCGLLVGKQQKDHRGNIENIVYEVAPCQ
NCLYYGRESGFEIAHHEYLAVEAEAKQHGYQIVGSYHSHINSPAVPSLHD
VDFALKGHSLLIIAMIYGQPKEVTAWLRHHSGSGVNQEQIRVIE
>Cag_0270 conserved hypothetical protein
MPTISMFYGIIIRMYFVPTEHPPPHFHVYYAEHTATVDIRICEVIQGHLP
KKQTKLVLAWAELHQEDLMADWELVMNGEEPFKIQPLQ
>Cag_1782 MECDP-synthase
MRIGIGIDVHPFAAGRKLVIGGVYIPGHDGLDGHSDADVLLHAISDALLG
AAALGDIGLHFPNTSAEFKDIDSMILLKHVGKLLAEHGYAPVNVDAMLLL
EAPKIAPYIQQMRQNIADALGLQVGDVAVKATTNEKLGYIGRKEGAEAHA
VCLIQGK
>Cag_0190 cytosine deaminase, putative
MIDFNTFMSAALREAIKAYEQREVPVGAVVLDSNGHIIGRGHNQVETLHD
ATAHAEMIALTAAMATLGNKYLDDCTLAVTMEPCPMCAGAIVNAKVGRVI
FGAYDSKMGACGTVLNITGNRVLNHQPEVFGGIMEHKCQELLQSFFKSLR
QQK
>Cag_0356 Ribosomal protein L7/L12
MSIENLVEEIGKLTLTEAAALVKALEEKFGVSAAPVAVAAGVAAPAGDVA
AVEEQTEFTVVLVAAGESKINVIKVVRSITGLGLKEAKDLVDGAPKNVKE
AISKDEAEKIAKELKDAGASVEVK
>Cag_1905 Acetolactate synthase, large subunit, biosynthetic type
MHSTGPTLTGSEIFFECLRRENVEYIFGYPGGSLLKVYETLHDVKDIKHI
LVRHEQGATHMAEGYARATGKPGVVLVTSGPGATNTVTGITNAYMDSSPM
IVFTGQVPSTLIGNDAFQEADIVGITRPITKHNFLVKDVRELALTIRKAF
YLATTGRPGPVLIDMPKDVLNSSCQFEWPETVDIRGFKPTVKCHTHQVEK
AAQKIAKAQRPLLYIGGGVISAEASEELRTLATTYNIPVTMTLQGLGAFP
ATNSLSMGMLGMHGTYWANQAVNKCDLLIAIGARFDDRVTGKVSTFAPQA
YKIHNDIDPTNVDKNIKVDLPIVGDSKNFLTTLLEVMPQQHENREAWLAE
INEWRKQCPLDYGKDDEQLRTEFVIEEVSKQTAGNAVVVTDVGQHQMWTS
QFYSFVNPRSIITSGGLGTMGYGLPAAIGAAFGIHNKPVVLFCGDGGFMM
NIQELVTAVHHKIPIKIFLINNCYLGMVRQWQELFHEEKYTFTDLAESNP
DFVKVAEAFGCKALRAETPDAARQAISEALAWNEGPVLVECKVVRKDMVF
PMVPAGGSISDMLLARLNPKTMV
>Cag_0316 conserved hypothetical protein
MAMAIEKTVRLAVLIDADNTQANIIDGLLAEVAKYGTASVKRIYGDWTSP
SLRSWKEMLLEHSIQPIQQFGYTKGKNATDSAMIIDAMDLLYTGKFHGFC
IVSSDSDFTKLASRIREAGLVVYGFGEKKTPSAFVSACDKFIYIEVLRAK
INDNEQIARKSMAELRKDARLVRLLRNAVEASSDENGWAHLATTGNNIAK
QSPEFDPRNYGYAKLRELVAATKLFDFDERVIGDGQSKAIYVRDKRMRDK
LKKADEKADSEIPF
>Cag_1251 Nitrogenase cofactor biosynthesis protein NifB
MKQDITKHPCFNDSARHTFGRIHLPVAPKCNIQCNYCSRKFDCMNENRPG
VTSKVLSPQQALYYLDQAMELSPNIAVVGIAGPGDPFANPDETMETLRLV
RAKYPEMLLCVATNGLDLLPYIDELARLQVSHVTITINAIDPEIGQEIYA
WVRYNKKMYRGKDAAKVLINNQLEALKRLKEVGVTAKVNSIIIPGINDAH
VITVASKVAELGADILNCLPYYNTKETVFENIDEPSPELVFEIQKATSEF
LPQMKHCARCRADAVGIIGEINSPEIMEKMAEVAAMAKNPFEQRPYIAVA
SMEGVLVNQHLGEADRLLVYGIDEQGDCVLVDSRQTPPAGGGNERWEALA
NLLSDCRTVLVSGIGNSPKKVLNNNGVEVLVMEGVIAEAVYALFNGHDMR
HLIKTELAHACGTNCSGTGAGCG
>Cag_1928 thiol:disulfide interchange protein, thioredoxin family
MKHSTLFIAALAIVTGGTIAQNNTLNAAPPTAKQSVKGATAPTFTAKSVQ
GHIVSSQQLAGKPYIVNFFGSWCPYCRKEIPDMIALQERYKAQGFTFIGA
AYQDNENAMPDFIWEHNINYPVIMADSKIINSFAPYIAGGIRSVPLLFAI
GRDGKIVAVEPGAQTREDLEKLIQKLLQRPAKR
>Cag_1313 ExsB
MDSLVTTAIAQQAGFELAAMHVNYGQRTMQRELNSFRAICSHYSIQQRLE
INADFLGKIGGSSLTDLSMPVSVANLESHAIPASYVPFRNAGFLSMAVSW
AEVIGAERIFIGAVEEDSSGYPDCRKIFYEAFNRVIELGTKPETHIEVVT
PLIALKKWEIVRKGIELHAPFAFSWSCYKNEGQACGVCDSCALRLRAFEQ
AGMEDPIDYETRPHYIDC
>Cag_0181 conserved hypothetical protein
MVKTIMAMKIDKPKFNPNIHHRRSIRLQGYDYSQSGFYFITIACQDRICR
FGYVENGEMVLNKYGIVAYNEWVRLRTRFPNIELDVFQIMPNHMHGIIVL
NEISVEDVGAGFTPAQNNALSNIRAGASPAPTVSEIVGTYKSLVANGCLK
IYTTKNETMGKLWQRNYYEHIIRNEQSYQSFSEYIINNPAKWEDDTFYVI
>Cag_0435 two component transcriptional regulator, winged helix family
MRGAFSSCSSTFFLQNLMSEPLLLVVEDDQNLAALMEYNLNRAGYRCHIV
GSAEAAMVELGRKSFALVLLDVMLPGMDGFELCRHIRNTHLYRDLPIIMV
TAKGEEIDRILGFELGIDDYVVKPFSFRELNLRIRAILKRERRQSGKAQE
VLGTGGLEVDLVRHCVTLDGQELVLTLMEFKLLVALLKRKGEAQSRETLL
SDVWDIDKSISTRTIDTHITRIREKLGNYGALIKTVRGLGYRLEPQEDEQ
HAS
>Cag_0192 conserved hypothetical protein
MSLRTLQQQIITLYKSRYKATPQSITQLQGDASTRRYFRVEYNSLGTIAC
YDPAFVGADPERYPFLVLQKLLKQHDILVPRTCVYNAALGLLLLEDCGDL
LFQNYVLEVLHTKKYDVLQQIYQNVVELMVAVQSIKGNEHELPFNLSFDR
EKLLFEFDFFLQHALRGYFASEIDAALIPLLRQEFEAITDLLVQPEHFVL
NHRDYHSRNIMVTYDGYFLIDFQDARMGLPQYDAVSMLRDSYVVLPDELV
AAMKEFHYKQLLEHHLTTMTYYEYCYYFDLMGFQRAIKALGTFCYQAVVK
QNRSYEQYIAPTLGYIVNYIAERPQELGKAGGLLQPLLEKALHQ
>Cag_1423 Molecular chaperone GrpE (heat shock protein)-like
MKKRLLRNLFTKQHTMDNIHTVPGYNSDANQPPQEPTAENIPLAEGVSGN
ANMPLESRIAELEAELEQQKEQVAKFREEVLRKAADFENFRRQKEREITL
TASRAFENVIRDLLPLVDDIRRLLHHAPPEGEAAQIARPYIEGVEMVQKN
LEKWLNEKGVVPIESKGMKLDVNFHEAISQMEHPDAEADTVIEEYQTGYL
LGDKVIRHAKVIVAR
>Cag_1847 Ribosomal protein S19
MPRSLKKGPFIDIKLEKRILEMNSKGEKKVVKTWSRSSMISPDFVGHTVA
VHNGKSHVPVYVGDNMVGHKLGEFAPTRNYRGHAGGKSEKGGSAPRKK
>Cag_0481 Phosphoesterase PHP-like
MPYSTSSVGHNGFQKADLHIHTKCSDGLFTPEEIVEKAARIGLNAISITD
HDSVLGIDKAKPLALEKGVELIAGVEMSSTYKGYDIHILGYFFDYQHSEL
KDYLDHCRQLRTDRAERMVSKLAKMGVKIGIEQIIVKAQNGSVGRPHIAA
VLQDGGYVKSFSEAFSKYLGAHSPAYVKSVETHPADIIRLINKASGLSFL
AHPAQNVPDEVLKQLITLGLDGIEIIHPSHDTYRQNYYREIANEYFLLFS
GGSDYHGIRERDEDLFGKVTIPYDWVKKMKSRLMLA
>Cag_0620 Twin-arginine translocation pathway signal
MNHSADFSRRDFIKISSLFLAGSAAMATVPSGVASGLGSKPEAETDGKVI
YSFCEHCFWRCGIAVHVKDGKVTKITGSEHHPLSNGKLCPRGAGGVGLLY
DPDRLAHPLIRERKGGKQYYRKATWDEAVTVVAKKLYETREQYGAGSIAL
LNHGYGVSFFKNMLSAIGVNKVAKPSYDLCCGPARQATVLTYGYPSDTPE
GFDIMNSKYMIFFGTHFGENMHNTAVQEISEGIRRGAKIVVFDPRYSTLA
GKAEHWLPIKPATDIAMMQAMMHVLISENLYDKAFVAEHTVGFGELWESV
RDMTPEKAATITDVPAEAIRTVAREFAFYAPAAFVHTGRRTNWYGDAMQR
IRAIHILNALVGNFKMPGGVVAFEKFPLPQPPTAHEHPAYERESWSKYPF
FSHDEKANTITSHEIIQQAITGEVKALFIYGVNVTETMTYGRQAALDALQ
KVDFSVAVDVLPAEVTGYVDVVLPECTYLERYDSLDNGRAFRTPFVALRQ
PAVKPMYDSKPGSEIARMITDKWGMHDVFAPTVEAGLNKNLAMVGSSLEE
IKKKGVLVMPPTDLYRKPGEALNLNTPSGKVELASAQLKAAGFDAVPVYK
QHPEAPAGFYRMLTGRKPMLTFGRTANNRFLGDLATAQENEVWVNTTIAA
KHQLAHGDYVNLKNQAGIVSDFPVKVKVTERIRPDAVYMVHGFGHTSKQL
RWAYKRGASHNQMISAVDIDMAMGGVGFQNNFVTFVKEASA
>Cag_1783 Dihydroorotate dehydrogenase 1
MTLQTSSPTTATALSLGRGLELTSPVLLASGTVSYGEEISQLCNLSRIGG
IVTKAISLEPRAGNPPQRIAETPSGMINAIGLANVGVERFIADKVPFLRQ
LNTAVIVNIAGRSIDDYCEVVSRLDGVEGLDGYEINLSCPNVKGECMIMG
VSSEMTYEIVSALRSITPRHLMIKLTPNVTSISTIALAAEKAGADSLSLI
NTVVGMAVNYKTRKPLLKNVTGGLSGPAIKPIALAKVWEVYQAVSIPIVG
MGGIASFEDAMEFLLVGARAIQIGTMNFVYPDISEKIAVALEAHFAAPNA
MPLRDYCGSLQVE
>Cag_0992 hypothetical protein
MEISLLQVVIDEHVAVFGVEPVFTGWSAFLSEDEIATNVCAAIDKGEPYV
EEEVPDGVDI
>Cag_1143 Cation efflux protein
MTAAQQNIAFQKVVAVTGVLLFVIKMVAWYLTSSVAILTDALESTVNVVS
GFIGLYSLYLAAQPRDNKHPYGHGKVEFVSAAIEGTLISIAGLLIVYEAL
KNLLGTPRPLGQLDYGIALIGVTAVVNYVVGAFAVRQGTKNNSLALIASG
KHLQSDTYSTIGITLGLLLLFFTKQVWLDSVVALIFAGVILATGYHIIRD
ALSGIMDAADESLLKKMVDLLQAHREPQWVDLHNLRIIKYGATLHLDCHV
TLPWFFTVRQAEEEVQKLGKLVDTTFGDSLDLSVQSDSCNEALCSLCRYD
VCSTRRKPFSSALEWTVENLSSNHHHYLSAE
>Cag_0589 GMP synthase-like
MQSVLVLDFGSQYTQLIARRIRELNIYSEILPYNTPADTIRTHQPKAIIL
SGGPNSVYDQTAFMPDPAIFSLGIPVLGICYGLQAIAKHFGGVVASSNKH
EFGRSKILVEQQGDNPLFQNIPNSDVWMSHGDKVMQLPEGFRATASSDNS
EICAFESTGVNSPAHIYGLQFHPEVQHTLYGKEMLGNFLLNIAAITPDWS
SKSFIEHQIEEIRRKAGNNTVICGISGGVDSTVAAVLVSKAIGKQLHCVF
VDNGLLRKNEAEKVMHFLKPLGLHITLADSSDLFLKRLKGVASPEKKRKI
IGRTFIQVFEEQIHHEKFLVQGTLYPDVIESISVKGPSETIKSHHNVGGL
PKRMKLKLIEPLRELFKDEVRAVGRELGIPEDILMRHPFPGPGLAVRVLG
SLTHERLEILREADEIYIEELQASGLYSKVWQAFAVLLPVQSVGVMGDKR
TYENVLALRAVESSDGMTADWAPLPHDFLARVSNRIINEVRGINRVAYDI
SSKPPATIEWE
>Cag_0204 Nucleoside diphosphate kinase
MERTLTILKPDCVRKQLIGAVINHIERAGFRVVAMKKIHLTKEAAGEFYA
VHRERPFYGELVDFMSSGACVPMILEKENAVADFRTVIGATDPAEAAEGT
VRKLYADSKGENIIHGSDSVENAAIEAAFFFSTEEVVRSN
>Cag_1635 Alanine racemase region
MCEALISLEHLRGNVRALQAHLNGRAHIMGIVKANAYGHNVHFVASTLET
CGIHNFGVANIHEALELKQGGALQKPATIIAFASPLPSHLPYFIEHGITM
TLCDHATFVAARDIAAALERPLQVHVKIDSGMGRLGVAPHEAMALLQAVD
ASPFLELTGVYTHFADSATPNSFTHQQLAIFKTIAAEYEHAAQRTICKHT
ANSGALLSLQASWCDMVRPGILLYGYHPSQECPTQLNVQPVMQVQAKVMF
IKKVAAGTSISYNRTWQAPTERFIATIAAGYADGYHRLLSNNAAVLINGK
RYPQVGTVTMDQIMVDLGSSHSVQVGDSAVLFGWDTLSANELAAQAHTIS
YEMLCAVSARVKRRVV
>Cag_1712 Ribosomal protein L20
MPKANNAVASRARRKRILKKAKGFWGARGNVLTVVKHSVDKAEQYAYRDR
RAKKRTFRSLWIMRINAAARLNGTTYSQLINGMAKNNIQIDRKALAEIAV
KDAAAFSAIVQAANK
>Cag_1719 Naphthoate synthase
MSAIAWVAAGEFTDILYHKAEGIAKITINRPEVRNAFRPQTVDEMIEALQ
DARNDAEIGVVILTGQGDLAFCSGGDQKIRGYAGYADGKGVNRLNVLDFQ
RDIRTCPKPVIAMVAGYAIGGGHVLHMLCDLTIAAENAVFGQTGPRVGSF
DGGWGASYMARLVGQKKAREIWFLCRQYNAAEALQMGLVNTVVPLERLEE
ETVQWCREILANSPLAIRCLKSALNADCDGQAGLQELAGNATMLYYMTEE
GQEGKNAFVEKRKPNFNKFPRRP
>Cag_1999 Ammonium transporter
MKHKHLTALLMLGALLLTGGNAWAADEVVTDSGTTSWMLTSTALVLLMIP
GLAMFYGGLVRTKNVLGTMMHSFGAMVIIGVLWPMVGYGLSFGPGILGGL
AGWDDKFFFLQGIDDSIMSTHIPEYVFAMFQGKFAIITPALIAGAFAERV
NFKGYAVFIALWCILVYSPICHWVWASDGFLFNLGPSGAIDFAGGTVVHI
SSGITALVAALYLGKRRGYPSTAMSPNNLVMTLIGAGLLWVGWFGFNAGS
AIASDLVTARALTVTQVAAASGAFAWMIIEMFHHGKATSLGVASGILAGL
VAITPAAGVVQPIGAFALGALAAAICYSAILLKSKLGYDDSLDAFGVHGV
GGIVGALALVFFIRPSWLADAAAKTANNSWTMWDQLGVQATAVGVTIAYA
AVISFVLLFIVEKTVGLRISEDDEMSGLDHSMHGEQGYGLINLN
>Cag_0463 surface antigen family protein
MKHSYSLLAVALCVATCTTNVQLANAAPKRAATQKTAPAKSAKAAEPLEP
QAALDTNLEAPLSPTSNQAEKVSVIKAIHFSGLQSINESELLNSLPLKVE
QRVALPGTELTNALNYLWKLQFFSDIRVEKEEVGKNGVTLTFHLTELPVL
DTITFRGNKKFDINELKSESNLVSSKKVSEQDVMTAANKLEKLYASKGYV
TARVAYQVEPTANNRVNVLFTVTEGNKVSIDKITFHGNNAFTQKKLRGIL
NETHQNSWWRSIFGSPKFDNEKFAADKELLLDFYRDNGYRDARILRETMS
YTKDNKGLLLDIYVDEGRRYHIGTITWSGNSKDFATTEVLQKTFRIKTGD
VYNAKLIGERLNFSQDNSDVSSLYLDRGYLSFRADLEEKVVHPDKVDLTI
SLREGELFEINTVNIKGNTKTKDHVIRRELYTVPGDMFSRKNVVRSIREI
SMLNYFDPEQIKPDVEPNQQNSTVDITYNVTEKQTDTFNASVGYSGVGFT
GALGVTFNNFSLKDLFNSEAYRPLPHGDGQKLSLQWQFGTSNYRTLSLNF
TEPWAFGTPTSVGFSAFKTHSSYDYTDDDTYNPTVIDQFGTTLSAGRRLS
WPDDYFAINWKLKYLHSKGGFLRFIDFNDPNAPEEADEISITQTISRNSI
DSPIYPRHGSKNSLTAQLAGGVLPGTVDFYKITGLSSWYIPVTKNLVWNI
STQHGVLSTFSETDYIPYTDYFYMGGSGMSSLPTTPLRGYEDRSIGTKLG
ASSTTDTSLYAGKVYSKFSTELRYPLTLSQSVSVYALTFVEGGNLWQKTS
EVNFADLKKSAGFGLRLYLPIIGQIGLDYGYGFDAVESEPEKTKQGWSFL
FSFGNSIE
>Cag_0664 putative transcriptional regulator
MEHEIIDGKQLLIVRVRKGYAKPYATSSGVYLTKSGSDKRKMSREELRRL
FAESGGLSADETVIHGSDIRDINTEILNDFLIKRDREIFEALRAQRVQLV
TICENLDLVAHGQLTLAGNLLFGREPQRFSKSFYVQCVHFDGNDVGCNSF
ISKDIIYGTLQSMYKQTLNFLKSSLRRVQKGKNFNTLGEFEIPEICLTEA
LINALIHRDYFISSSIKVFIFEDRVEIISPGKLPNSLSVEKVRLGISIHR
NPILNSLGQYLLPYSGLGSGIRRIEAHYPLVKFFNDTDKEEFRCVFFRS
>Cag_1481 Glycosyltransferase-like
MQKSINIFTPDKITLPLSWVGHIPFAAWLVEILHPQILVELGTHSGNLYF
AFCQTVKKNQLLTKCYTVDTWKEAQHSGIYDNNVYNEILAYNSTIYGDFS
QLFCMTFDEALTKFENGSIDLLHINGEYTYQAARKDFETWLPKMSNKGII
LFHDIMVKEREFGVYRLWEELSSQYGHFEFTHSHGLGVLLTGKNQHPAIE
SMAQDFQDTQKKKLISGYFEHSGYAIELEYQHQSDATKIHKLSQQLKSQS
LQINSLQKHNQSLQHEIQELNKSLVRITNSSSWRLTKPIRKWSKSLRKRF
RKIRYFLTGETAENVPKRLTTLCNNWFKPTDKTRILIIDSWIPAPDQDSG
SMDTFLTMKALVELGYDITFIPKDLKAKQKYVQLLENEGVRYPDLSKAAI
SIEEFLKVAGHYFDLVMLYRVDTASSFLAMVKHYAPQAKIVFNTVDLHFL
REQRNAELSGSDIMRKNALKTKEHELQLMQQADSTIVLSNVEFDLVKKIK
PEVNLELMPFFRMIPGRSAAFHERKNIVFIGGFKHQPNLDAITYFISEIW
PKVHLKLHDAKLRIIGSNPPKELYRLVDSDNTIELLGYVANLDPEFNTCK
LTVAPLRSGAGIKGKIVTSLSYGVPCVASPIASEGMELIPDKDLLVAKEP
DEFANKIIKLYTDEALWNALSDNALTTVEERYSYKAGKKRIGDFLNKLLG
SSRHSVWGSEEFLQNNLANETDDTDGKKRIIIELPSFDKGGLEKVVLDSI
LAFNKNKFHFLIVTPGKLGELSTVATNAGLSVIQLPDINHEAAYERLVIK
YRPHASMSHFSHLGYPVFINHHIPNITFIHNVYAFLSEKHKKEIMMYDHA
VTRYIAVSPKVACYAEKNLGINQEKITIIPNGLCITEHEERQKRATPALR
DDFGLNKNDFVFLNPASYNLHKGHYIMVDALQIVTKKRKDLKILCVGNIV
HEPHYHELQQYIISCGLSEHMLMLGYISKIENIMPIVDACIMPSFIEGWS
IAMNEAMFYGKPLIMTDTGGASEVIENNDIGILIPNEYGASDLLDRTTLD
KLAYKPHHYKISSMVADAMIAFADNHEYWKKAGEKGRKKIYRHYAFKNVV
AQYEEIMNQVTEPVTYEPQ
>Cag_0412 Putative molybdenum utilization protein ModD
MLYQLPDSAIDAFLDEDMPYGDLTTHLLGIGNERGTISFISREVITVCCT
EEAARLLERCGATVTNMVASGSVVQVGSELLVANGSAAALHAGWKVAMNL
LEYASGIATRTRTIVEHATAVNPNVRLVTTRKSFPGTKKVAIKAIMAGGA
LPHRLGLSESVLIFRQHTEFCGGLERLLATLSSLKMELPEHKIMIEAETA
DEALRIAAAGADVVQLDKLPPAELQPLITKLRVIAPTITISAAGGINAQN
AADYAATGVDILVLSSPYFGKPADIKVIMTSH
>Cag_0348 conserved hypothetical protein
MNVKRTLLQGWEYLRHEPRKIVLFGVLLVSLYMMLFGDFGILKRLQMEAE
YRQLLQEEQRTQAVLHDNALRIKNARNPDSIEKAAREKYNFRKPGETLFL
IVSPSE
>Cag_0292 conserved hypothetical protein
MNTFQLDYRHIQTPKKGFSALFRDYTADAPERETLIAECFHLDYRKSADY
YRQLGLLSARTFQRESLVNMLLRQNSRFGGGERQQQAIEKLRSPRCMAVV
TGQQLGLFTGTLYTIYKALTAIIVAEQQKSLFPDYDFVPIFWLEGEDHDY
DESASTTIFAENQLKHFTHQPYRRLPDQTVANSSFSDDIRTIIDEMVALL
PNTPHRTMVAEMLHECYYPGCTFEISFASTMLRLFRNYPLILLSPQESDF
KKLAMPIFFKEIESAPAVSYQVIAQSTRLESLGYSAQTKPRAVNFFYVNQ
HGQRQKVEQPSSDTFQIVPEKQRLSRHQMLELCQDHPERFSPNVVLRPIV
QDYVLPTFATIVGPGEINYMAQYRPIYEHFGITMPFLVPRGSFTLVEPKI
SRVMDKLIHVTGQPGFSRKNIYNAVFSNLQQVKKNAIGEAEHPQLATMFE
QAKDEMRQALERLNHTLSTIDPTLEPLLAASIVQSAKLVETIEQKTWKAS
RRKHEELLEQIQKAETALFPEGVPQERVVNIFYFIAKYGMGILDDLSNLL
KGYASDAHIIAELQG
>Cag_1353 hypothetical protein
MKSQITEQFQQIGLDDLKIHPELLTDILSNKQDLSLILEKRGDIIRYAYL
RTYDSDSRKILEEAKAEYCLKKEDGYSREEAFQDFEDVQHDIATQLASRV
R
>Cag_0268 conserved hypothetical protein
MNIYTERLPHRYRQAMFRDEMKIEELVEVMLRVAWCYPKVLSATCTKLGV
TGSIGFFKNLAGWAFGK
>Cag_0223 Single-strand binding protein
MPFLQLLSELDAMARSLNKVMLIGHLGTDPELRTTTSGQSVANFTLATNE
NYKDSSGNLQERTEWHRIIAWGKLAEICNQYLKKGRQVYVEGRLQTRSWD
DQKTGEKKYTTEIVCSDMQMLGSPREQMGGESTMQPYDQSTLPSQSSAPS
VMPPATPTVPTMIDTDKDDLPF
>Cag_1605 conserved hypothetical protein
MAQKTINLLFIGDVVGTPGLQMVSRMLKSFITKHRVDFVVCNGENAHQGK
GLSAEALNQLLEAGVDVVTGGNHTWSNFNFFETLKTHPKVLRPQNYPKGT
YGKGYAIYKLPNGLGDIAVVNLQGRTFMYTIDCPFRTADWVLKQIKEQSK
ESVKCIFVDFHAEATAEKVALGWYLDGRVSAVIGTHTHVPTADERLLPKG
TAYCTDAGMSGPHQSVIGMQIKSATDRMLYQTPHKYECAEDDVHFSAVLL
TLDLGSGKALGIERIFYPEFERGTVVR
>Cag_1819 hypothetical protein
MKVIISRKQVNTNKSNHWFRDIAMLEICDAKYVGDYKIYLVFNNGREGIA
NLEKALFNDIRSVFSQFRDKERFANFKVDHGTVIWSDEFDLASEYLFYLA
FQDNPELQTKFKEWGYVA
>Cag_1666 Ribosomal protein S32
MANPTCKMSKSRRDKRRAQFNARTKAAVTVNCPHCGEPTLSHRACRHCGH
YRGRFVLNKAEK
>Cag_0047 methyltransferase
MALHDTYHDPVLAAEVVATLVQRSGIYVDGTLGGGSHSLALLQALQAQGL
LESSLLIGIDQDSDALAMAAERLQAWQPYTRLLKGNFRDMASLVQQLCDA
EGRACAVTGVLLDLGVSSFQLDTAERGFSYMRSGPLDMRMDNTAPLTAAE
LINHADEAELARIFYHYGEEPRSRALARAVVQQREKMGNFTTTEELAALV
RRLTHGGEKAVIKTLSRLFQALRIAVNDELGALHEVLEGALELLDGNGRL
AVMSYHSLEDRVVKHFFTHHAQCDWGPKGVALREPLSQGALTIVTKRPML
ASADEIERNPRARSAKLRVAAKNQPKTI
>Cag_0574 hypothetical protein
MNTTKNEVATLLQTLSDDVSFDEIHYHLYVLEKVNRGIKRAETEGAISHE
DAKKRLSKWLLD
>Cag_0193 mannose-1-phosphate guanylyltransferase, putative
MKAFVLAAGLGTRLRPLTNHSPKPLMPVLNVPSLFYTLFLLKEAGIGEVI
CNIHYHAAQMRSVVEAHNLAGLQITFSEEPEILGTGGGLKKCEALLEGED
FVLVNSDIITSIDFRALIERHRISGNLGTLALYETPDAASIGWVGVEDGL
VKDFRNQRHTNLFSSFIYTGTAVLSPDIFRYMHAEYSSIVDTGFAALLER
NSLGYYEHCGLWMDVGTLPHYWQANLDATGAIRRTFGAMQQSFGVAPHVV
APSATISPAATVENSVVGADCVIPDGCTVRNSVLLSGVVLQPNTPLCNAI
ADSQTVTILEPSSLL
>Cag_0673 putative glycosyltransferase
MEHYVTLFDSLFLPQGMALHISMERHIKDYTLWILCVDDEAYDVLTKLQL
ANVRLLQLSTLETEELLRVKPTRSKGEYCWTLTPFAPRFVFEADATVHRV
TYLDADLWFRKHPKPIFDEFEASGKHVLITDHAYAPEYDQSATSGQYCVQ
FMTFSRHAGEEVRKWWEERCIEWCYARHEDGKFGDQKYLDDWPDRFANSV
HVLANKEYALAPWNATRFPYSAAIFYHFHGLRIFKKRKKYYVFNGTYFIP
KPTYRYIYKLYLNDLQGSIFLFLKMGAILRNQKNMWFGYNFFTFLKVLYS
KLFRLNVYASLCYVYLKPNLKAVNYNN
>Cag_0659 hypothetical protein
MEQSATKSFDGERRIMYAEKLIVETDLSGMLKKVPKLPPNKQLEAIFLVL
SESSAKVAVVRTPHPDIAGKVIIKGDIINCATSSDWDLPQ
>Cag_0119 conserved hypothetical protein
MKAIILYDSKSSGGSTDALINNIGSRLADEGFYVEKAKCRANADYSFVSE
FDVVLLGAPVYNFVVASQLLGALIQSNLKTCLRRKKVALFVMCGSPAPVA
ELLYMPQLKVQLFGNKIIAEQIFAPGALEAGALERFVDDIVEEANRKPKS
KALNAKWSLEAEELFQSVPPFLQTKFKTLAEEYAEEMGYEEITLEVLDEA
RQNMEA
>Cag_0401 2-desacetyl-2-hydroxyethylbacteriochlorophyllide
MKSMKAKAIVFSGVRQIELADVKLKPLSSTDVLVETWWSSISTGTEKMAW
NGLIPSPPFIFPFIPGYETVGKIIAVGAHVNDNLIGRFAYVAGSFGYEGV
NAAFGGASEFIACPVDSLTVLDNIEHPEAGIALPLGATALHIVDLAHVEA
KKVLVLGQGAVGILAAELAKLMGAKLVAVTEPNCNRLKLSAADLKVNPDR
QDVSAALAGHEFDVLIDSTGIMSAIDTGLRFLKFQGTVIFGGYYQRINID
YSQAFQKELSFIAAKQWAKGDLERVRELIASHKLNAERIFTHHHTVGSGN
ITDAYQQAFTDQDCLKMVLHWKQANEEPTTSN
>Cag_0840 TPR repeat
MSIELKAKYLGKTPKDKAKILIAQSSWEKTQLARDENQCYKLSLTEENEK
FIDSILELGNGKEVDLIVFPEFSIPEKYLEKIREWTFHNQIIVIAGSANL
QREEKYYNTSSIFFEGIPYKTEKHDLSPLETSNLLGGYGPSSGTNQFYFT
ETPIGELGVMICADEFDRQTRNEFLKHNIDILCVIAFQQKGKDHHQSINE
IVKESNNGIYVAYANALCNSWTDGHSAFFSNEYREGRVEYVETGLTKDDG
LEMKLVEMPSNAGCLIVECNLKSKKPVIRNLDPNRALVNAELPYVFENGG
LRQFTKEELKKPDEKSNKVKAEYQPSIPPIKAFVADYIGRQTDVEYLSEF
LNNPQKHFCLLYGVGGMGKSHLLYCCMKDYKQKTFFYHVVSPNEEFTLNK
LFEVCLLPKPDAKLSLEEKQNHFVKKFQENNIHLILDDYYEVQLDEVKSI
LPKLTGIGKGKLLLLSRIIPSNISYIKADYLNHKILPLTEPDFKQVIQNF
ILDKNLTLTDEEIHLIYEKAQGYPLGGQLIIDAKPYSKNLLELLTNLGKF
EAEIDPDGKIYSGRILDNIFKKGNNKEIKLLCEFSALFGVSDIETVRQLP
SYNLNLFQGLHSRKSFVDMDVQGKFSSHAMIRDFAYHRLQNKEALHLKLA
KYFENNINGRTDDDWKWLNEAILHYTKSPKAEHFAFINRVERNFESRNIK
EQIDKNSILKTIRNYTTLLNLYPDKPAYYNELGIAYRMNRQQRNAIETFE
RALVIDPKDLPSLNELGITFRENNQKTKAIETFERALVIDAKHLPSLNEL
GITFRENNQKTKAIETFERALVIDAKHLPSLNELGITFRENNQKTKAIET
FERALVIDAKHLPSLNELGITFRENNQKTKAIETFERALVIDPKNLPSLN
ELGITFRENNQKTKAIETFERALVIDAKHLPSLNELGITFRENNQKTKAI
ETFERALVIDAKHLPSLNELGITFRENNQKTKAIETFERALVIDAKHLPS
LNELGITFRENNQKTKAIETFERALVIDPKDLPSLNELGITFRENNQKTK
AIETFERALVIDAKHLPSLNELGITFRENNQKTKAIETFERALVIDAKHL
PSLNELGITFRENNQIEEAIKVCKRALNISKDRQLYLNLLQIYLFFKSDK
QISKEIYDILLMPPRLHAFSASRKKYENIIRDMDYLLSISFDDVKQYESF
LFLAIQYKAYEKVLFILEKLNDQFPDNSKIKSRLGKTLSNQVIGEHEKGG
RFLKQAIGLFKKENNIQQLQGHIIYYFYNLLNQNQIELIEKEMMTYEKDL
IYDANYFRFMANFSFVKNSNINDAISYFEKAIEISEVLMDKKEFAESLLR
FLSEQKSLHYKTYFVKYEKYI
>Cag_0698 conserved hypothetical protein
MKTVSVSEACSTLSTLLKEVELGDEIGISFEHQQHTIAVLVPIAKYKKIK
DRKLGSLAGKVKVEFSNDWQITDEELFNL
>Cag_1920 hypothetical protein
MNTDMLVSSNAQSLLATAKSCCENIACSLVIADAHVDDVTLLTTSLTPFT
DIVLVTHEADALATLQAAFAAGYEHIHFLGHGEQGGITLGGKLWQTNDFV
ALAAEVDSARETSLHFWSCYTGAGDKGLAFVSQLSEVFGDAVTAFSGLVG
AASKGGSWVPDVIVGSVHVPEVPFINALTYAHTLDVTSNVYLTSVGRDEE
GDGDIDGVDVQLWLKAGTTINAVDFTLAYPSVATVNGIITHPAFSSWTWN
INNHSNEGVIIAGLAGDITNSSVYNSFTAPSDMWIGRVSFDYAPTPTVAP
SFVVSLTDVFLNDVELVTTSAEWPILSTDLSTVPVWNTDLMLPPSAPYEY
APGETVSLSFPISATDPDSGDVVSYSAVIGQVVDNVFQPLSGFSPIPLML
SNDVIGGSFTVPSIAPVGSYVVRLLADDHAGDAYLGTAFDVPFSIVMGGG
DLTFNTSGTIDGSPVAGEGYYKFAPGTISGELAIAGDTGFQFVITTDYDE
NPMTFNASWEWLNESDGADYGTVTFFDISSGIAGPETWSATFEDNTIGMV
IADSSTDADTLPDGIIVRDDMDNQVAVPLAWQQRDANGAIATFSATVKNE
DNEDISFSGSLIDSDENGEPDRVVGNWGDEQFNDSFLFADVYGDSQPDEW
MVTSTKIRAGRVQNDANGNPAGLYITWDNQRPVWNVPVQLPPLMFTQGQN
INFSDYALADLYATDPNGDAITYSAVVGYISPVGFVPVQEFSELPVWMEE
AGLQGSFTIPTNAPTGSYVLRLLADDHAGDTYAGTALDVLFTIEGVEPVG
NILLNTTQEVEGNPGTLESYYKFRTGSTGIAAVFGDSGFQVNLFEVASDA
NPYTFNAAWTWFDNGTYSDTNEGVLTFIDTNSNLTDGLEQWQAAINDRTI
GEVIADGNDADTLPDGIVVEDDMDNLVDVDLNWQTRDEVTGAIATFSTTV
KNKDNEDIAFSGSLIDINNDGVADRVVGNWGNDQFNDAFTFADINMDEQP
DEWVATQTKMYSGRVQNDASGTPAGVYMEKEPEPIPDPPYITQAELTYDI
GITGASLAETGAIVVKIASPYMGMSNITIPVTELTLQGSLLTIPLTSFIS
TYPQMGAALLVQIPAGVVVGQNELVNAWQIGEPYSGYYALSSMPVLPNDN
SSDGADWVLGTSNNDSIAAGAGDDVLDWSVGNDTIDAGDGYDHQYLPIPG
MYPHLMPQLDESGVLHLVKYNYEDSLTGGSTTDVYRITRLAPSEYRIDSM
DSIGVTVVQTLHLSNAEVLSAGYHPTYLAVQYNTESHYVSGTAWDDVISV
DLQSFIASPFTSVWGDSGDDMFVLNLPAIYSALELVPEGENMYLLQGIGS
GPLATTTTLGQLQVTSTGYVTLTVGSGDTALSVSLSNIEKYQFVAGSVVE
ELDVAASHENHLPVGTVTISGDPTEGWMLNALLDFTDEDGMSNSIITYQW
YANGVAISGATDSSYELTQTELGKQLSVTVSYVDDYGGHESVNSLATTAI
QNSNDEPEGKPTITGTAAQGKTLTVDVSGITDEDGLENATFSYQWYAGGM
PIDDTTASTFTLQETQVWHQISVAVSYTDDFGQEETVYSDYTDIVENVND
KPTGTVTIIGTVAQNEWLSVDPSAINDPDGLDGLFEYQWKADGAIIEGAN
ESQFLLTADYAAKALSVTVSYYDEHGTYEQVTSTATTPFSRVNNLPDGYV
FIVGNQQENETLTVGYYLYDADYANGEVNPNDISYQWQVWSDSAGTNGDW
VDLQGATSSTLLLDESLSDKWVWLTLSYTDPHNTTESLSSYYSVFIYNLN
DEPTGEITIYGTIKEGETLTVNTSTLADADGLGELYYQWYANGEEIGGAN
YSTYDLTQFDVGKRISVAAGYWDGHGTWESVASELTATVARDTTTNNEPT
GWVTISGTATQNRMLLANFNIVDSDGLSDAVYSYQWKASSDGINWDDIFG
ATQRSYKLTQADVDKHITVEISYTDDANHLNTISSDPTRAVWNVNDAPTG
KPTLSGTLTEDQTLTIVTSAITDADGIAQDTMSYQWQADGVTFAWTTENT
YILTQDEVGKAISVIVSYYDNGGTYESVTSSPTVAVANVNDQPQGEVLVI
GSAVVDETLWVSTGMLTDEDGPDLLYLNGNMSYQWQSSTDDGVNWNDIIG
ATESGYDVTLDESGEKIRVQVTYTDNGGKTEVVYSSATDAVVSNAIDPNG
TIAITGTFKQGETLNATVTDADGMGTVSYQWQSSTNGTTWDPISGATSAS
FILTEAQVVKQIRVIASYTDGGGTIESPSATTTTIENLNDNPIGSVTITG
TAKQGEALTAKNTLADADGMGTVSYQWQSSSDGTNWSAINGATASTYKLT
AAEVGKQISVVANYTDGHNTPESKASVATVAVANTNDAPTGTVKITGSGQ
QGAILAADTSTLADADGLPTTLAYQWYAGGVIITGATNGTYQLTKNEVGK
AITVKVSYTDGGGTPESVTSLATSAISNVNDAPTGGVTIDGLAKQGQRLT
VDTSTLFDDDGIPTNKLGYQWQAGGLNIANATESSYKLTQAEVGKAITVK
VTYTDLQGTTEAVTSDATASVANVNDTPTGTITISKIDDDGNKVDLTAAP
QQNDILVASNTLVDGDGPPALAVTYQWQANGADINGAVGRYFEVTQAEVG
KTMGVVASYTDAFKNPESVSSTATAAVVNVNDAPTGSVTISGNPTQGQEL
TAITSTLADADGFKSTLSYQWQSSSNNIDWSNITGATNRTYTLTNSEADK
VIRVVVSYTDKGNTDESVNSKATRSVTNDNDAPTGTVTITGTIKEGQTLT
ASNSIVDPDGIPAGTITYQWKANDENIYGATYATYTLTQEEVGKHISVVA
SYTDNGGTSESVSSTSTTKAENVDNDPIGTITITGTAKEKSTLTFVNTLQ
DADGMGIVAYQWQSSTDNGSTWSNIAGANASSLTLTELQVGQRISVVATY
TDGYGNPETVRSSNATSKVKNENNNPEGKLTIVGNAKAGKTLYADHTLTD
EDGMGAILYKWQSSTDNGSIWNDIDDATDSFYTLTKDDVGNNIRVVATYT
DGHGTVENVFSEKTATVKKVISGSSHDGYLVNALVWVDEDSDNTLDWTDT
NRNSKWDEGEGESWTLTDNTGQFTGLEGDGTKPLRITANPNGGTIDISTG
NEFDGSFFAPADATVISALTTLIAAAMDSTTNAAAAETKVETALGLDAAT
LGATLSLTSYDPLAEASKTSTTDAAKINAVKVHAATIQLNNIMDVAISVA
DAAGSTLSKAQIVENVSDSLLAQAGTDTVDVTSDAVIEVAIKTGLSTGLT
TKPNFNDVVAAIADALALANREIATIATNATGTNAVASITDIVEAQIVAQ
STIVPDAYAAVVADDSSAITTKADTFSSQLGEAAKEVETIFVNHAPTGSV
VINGVVMPGEILTAATDSIADNQGVGAISYQWLRGGEVISGATNATYQLV
AADIGKAISVKASYTDGAGFSESMNSNATIAVPDAPTSLSDVTVDVANTL
TNDATPTVEVDLTNKALEVGDVIQIIDSNHGNAVLYTETITTTGITLKEI
QLAVALIDDAHALQVRLVDSAGNEGLASNGVTTITVDTTISHLSGAVYNA
SSGTVTALLDMVLDSGDKLYGSVNNGNWEDITAKVNGTSINWDGVGSNAT
KIDLKVQDEAGNTDTEAVTIPVSTGHNLTIHTAYWKDSKAISGVTLENGA
QTDSVGAHLYTAVTDATKTISPELAVATADKAAIGLLDAVGILKSLVGLT
TLNKYQEIAADYDGSGKVGLLDAVGVLKYLVGLPGAAPEWVFAESTASEP
SAIDDMTVSLNDDKTVELIGILRGDVDGSWVNLH
>Cag_0836 conserved hypothetical protein
MKSFESEIIIEATPAEVWQMLTAFAAYGAWNPFLRRVQGSATNGAQLYVE
AKLPALPAIRFTATITTMQLLHRLGWHARFAGGLFRAHHFFRLEPQTSGG
CRLLHGEEFSGLLAAPILLLLGSQFRAGYTAMNEALKRQVETNSDR
>Cag_0544 hypothetical protein
MKKALLTALLVGMSAAPAQQLFAKGFNYNYVQGQYVKPSMDNVDGGSGFA
ITGSVALNDNFALNASYNDASFDNDIDASGYNVGVTYHMPVADSTDILLN
AAFEQAEASAFGISTDDTGYSIGAGIRQKVASAVELEAGIYNVSIFDDSN
IGFGAAALVDVAKNIALGVSYENLDESNTIGVGVRAGF
>Cag_0363 Acetyl-CoA carboxylase, alpha subunit
MSAKVVLDFEKPLFELEAKLHEMRECLRNTTRDQSPADADLLYGDIEALE
RKVELLRRSIYKNLTRWQKVQLARHPERPYSLDYIYMMTTGFVELAGDRH
FSDDKAIIGGFARLEDEAAGFSQSVMMIGHQKGRDTKTNLYRNFGMAQPE
GYRKALRLMKLAEKFRKPIITLIDTPGAFPGIEAEERGQAEAIARNLMEM
AQLTVPVICVIIGEGASGGAIGLGVGNRILMAENSWYSVISPESCSSILW
RSWNFKEQAAEALQPTAEDLLAQGIVDRIVPEPLGGAHTDPDAMGATLKA
MLIEELQQLLPKEPQLLVRERIEKFSAMGVWNEV
>Cag_0051 UDP-N-acetylmuramoylalanyl-D-glutamyl-2, 6-diaminopimelate-D-alanyl-D-alanyl ligase
MKATLQRNDLEAVGELVFHGEAPPSFELAEPHVVIDSREVSEGGLFVALH
GERTDGHRYVNDVFQHGATWAMVNRSWYEAEGHPLPPHHKGFLVVEDTVA
GLQHLAVRYRNTFSIPIVAIGGSNGKTTTKEMVAAVLASDSSAVSMSQGN
RNNHLGVPLTLLQMRHSTERAVVEVGINHPNEMAMLAELVAPTHLLLTNI
GHEHLEFLGDLDGVAKAETQLYDYARQHGATAFINADDERLRAAAEGMPF
RIDYSLHEAVDSLVWAEDVTVERDGRLSFLLVTKGRSEQERLRLHFTGRH
NVLNAVAAATVGLQFGISLHHICEGLAGLQPAPGWKRLEVVEVGGVRLLN
DTYNANSDSMRRAIDALCDMPCNGRRIAVVGDMLELGDAAEVEHQAVAHY
IQRSLVTKLFTFGTQAAAICRHAPELCYGSYSEHSALLDDLLHVLSEGDV
VLVKGSRGMRLELIVDGVVHALQPKS
>Cag_0961 Phosphoesterase, PA-phosphatase related
MLALLIPKTTVAADGVERAGSAIRLLIPATGAAMALVRHDDDGLGQLAAS
GAIAVGVTTGLKYAIPATRPDGDDHSFPSMHTAIAFSSAEFIRARYGWNY
GIPAYVAATFVGYSRVVSDRHYTRDVLAGALIGIGSSALLTTPYKNVQVQ
AELSTHFTGTHIVYAW
>Cag_0472 hypothetical protein
MKQHRHNIEAEVAKTMSLLDKPAAIEVSAPFRARLMQRLEAEKNNGLQGN
HAFHVDYRVAFMALLLVANLASSLLLFRQENRTNSPTQNVAATLNVDALA
EQELLGGDEQGEWYENILP
>Cag_0240 dihydroflavonol 4-reductase family
MSKLSIALTGATGYIGSQVLLELLKRFKGELDCRVLVRGSSNYAWLEALP
VQVIAADVLEPIALHEALRGVDTLFHCAGLVSWTRRFRSQLYEVNVVGTR
NVLHAALYNGVRRVVMTSSIAAVGMSEDGAPANEAALFKEWQRRNGYMEA
KHLAELEALRAVAEGLDVVLLNPGVVIGVDHHNPASLSSSNRTLRQMYDE
KLWVAPAGSTGFVDVRDVAMAHIAAWEKGKSGERYIVVGHNVSFHELLSR
LSALNNGVAAKVLTVPRSVGMVAALGGEAWSLLTGNPSFIAFESIGTSAR
QLAYNNERSLCELGIAYHDLEETFQTILK
>Cag_1619 conserved hypothetical protein
MSEIMEMVKEMAEDFHKADVMDEITMKSITALCLPDKRSFQPTDIKRIRM
NNNVSQPVFAEILGIGKTTVQQWEHGKKKPNGAANRLLDLIDRKGISILA
>Cag_1627 3-oxoadipate enol-lactonase, putative
MLRCITNVTAETAGRYPITVLLLHAFPLSAAMWQPQIEALEKAGYGVIAP
HAYGIEGSPEIAEWNFTDYAVELAQLLESLHIASVTVVGLSMGGYQAFEF
YRLYSNKVKSLVLCDTRAEADAPAARATREEFMKAVASTGSAEAIRRMVP
NYFSPAAYGANSTLVAQVEAIINKQSPEVINAAMRAIMLRADATPLLGSI
SCPTLILNGEEDSMTTKETAATIQAGINGSTLQLIAGAGHIANLEQPELF
NQALLEHLSLLQ
>Cag_1686 hypothetical protein
MKELSLLLRQIHPNFVQDGHLSSQAFRPTPKDEQQLSVYDGDMILPLDAW
EHYNNILGLTSCGVMAVNVAECTVLELPVMSDPQPFPEHVLIDFSAYNKR
EIEKKAKLLKAKAEVRGWLYKKAQL
>Cag_1239 VCBS
MSNTVVFIDSRISDVNALISRFAVGTEYYVLDSERDGILQITEALAGKSG
YSSLQIFSHGTAGSLMLGSTVLNNAALSNYTAQLAAIGGALTASGDILLY
GCNVAAGDVGQQFIAALAEATGADVAASDDLTGSAALGGDWDLEYATGAI
ESGVDTNVVQEYDGVLEPPTTTITITLPTEAEIRNNSNIDQGYITESTLG
SNVTLYLQDILAVDTTNLLSTTAPTSFVISVSYGKISFLKTKVDTIPIVS
GNGSNTVTLTGTITQIRTLLTNNTISYVGNTGFSGIETLSAGISGVLANN
SGNASGSANSELSIVGINDAPTILDAKATLTVSEGSLLNDVFEGSFTLSD
PDIAHYIMQADITVLHGTLKLSGDDLDSVTGNETKSVTITGSRANIELAI
TALQYTPDENYNGSDTLTVTVNDLGTTNVNQGNPDEDKTDIHQVTISVVN
HAPVLENDYTPILSEISEDINSSITETEDGDNAGTLVKDIIPENDPTDSI
TLDAITDEDVTSALQSIYITAVDSTNGKWQFKLDGQTTWSTITLTGDTAL
LLSENNSLRFIPNNNWSGTATFTFGAWDGTGRDINNNNALYEAGDYVVIT
QRGVLNAPFSLDVDTATITVNPVNDAPTFTAFSAPITPAITTNQPSLALE
DGGNTVGEPSTTPITITFDDLATKGNEADANDAAYGGAVTAFVVKRVLSG
TLTIDNTDYTAGTEDLNLTISESQDASWTPELNKNGILNAFTVVAKDNDT
TDSKESTTPSVTVTINLTPVNDAPTVIPATQSATLTAGSDDVTSAYISMA
MSDVDTGDIVKVDGDWLADESAANDGEEHWTSSDGGKTYTFDGSYGTATL
YRVATTDGHSAGEVTYSFDYADTDLDSLAANATATDSFTIVVSDDAKVTA
TADAVFTINGANDDPIITSGTQSGSVIENSSETATGDVNATDIDYGTTLS
YSGDATGDYGSLDVTEIDGTWTYTLTSNALDDGESDIESFTITVSDGDGG
SATQDVTITVWGDNDPPTITESAQSPDYVTEDGGINNGTAGTSSAHIDVT
LSDADGGDTVSYVTTGWSGSGNTYTKTGTYGTATLHPNTHVIDYVLNNSD
TDTQALDTNDVVSDSFSITVTDGTDFTSETIAFTIHGTNDAPTIDVSDTG
ESFTESDDASMQTLSATGSITFDDVDASDSVSLTFAESTDISWSDSNGGG
DYDNELDSDYSSVKAALLNGFSTTATGWSYSTSGGSSLIGSEFGAIPSNY
FLPPPFSFTSLITGVLGDQNDVTEGQFKYDVYTLSNVVSGTLVFAAIESS
AFPEYANVYTSANLQLRQPITPRIISNTTDGNARILAWFYLPGDTLWVGS
DAPQQSGAYNLYLGGTIAESGDGVDLDFLDEDEQLTWSYTVSATDGTAST
TSTESVSFTITGENDAPTISISDDSATITENSGSDVSDLQTIDVSGDIEI
DDPDTHDDTVTVTSSLDSISWSGGDTTDSVFDTLTDGFTADEDGWSYSVT
DGVDLNFIGEDETVVLTYNVTASDGTESDTDTVTITIEGTNDTPTVDVVE
TSITFAEDDTLSDSGTVSFSDLDTNDVIDVTESYNNDIAWSYSGGTLTDE
MIGEDISTLTNGFTAWGEDLSSDETVWNYSATLGTDDLDFLDAGETLTFS
YTVTATDNNGASATDTITVTITGSNDTPHFNQDSVDNGSVTDTSASDTFS
NDIIGTLTADDVDHNDTITYGVVNGTGTYGTLTVDSSTGVYRYDGNDNTI
NALNAGTWSDTFTVTASDGTISESATVVITIHGANDNPTVSINDDSLSFT
EDVSASAQDLTDSGSITFDDVDTSDNVDIIATYKNNISWSGGTLANQMSG
EDSGKLTSGFHASATDSTSDSITWDYSATDVDLDFLAEDETITLGFTITA
RDSHNAFATDDVTITITGTNDDVDITGGTTSGSVTELADKYSENGVLEND
YLHSITGSIDISDPDVNDSHTATFTDNSESEQVTYLGSFDVADNGEDWTF
TVSDEDLDYLDDDDDPLIQTYTITISDGHGSSDTQDVTITIHGSDDNASV
AGRVYYWSNFDDIDDVATEMYAQDSMHTDGEDSGIEFRNIEKHDNGTYTL
DIYKTADTDEADSFLIKLQLAKGSVATWEQSHDLFDGEGNELPDLTEFGF
VTYASSLRTGECNIGGYSASLLSLPNDQEVKLGTLTITAPIFDAKLLSGS
YIGDTAIDAGDIFSEMKLNVTEGEDITYYNGDGLYDYLNQLNSGEDSFYY
YDSVDPDDYNFDAIKEVTTEDANEVTPADALLALKLSMGINTALPDLIAL
FEESDTIPLIPYMYMAADVNKDGEVTIQDALNILKMSVNYACAPEQEWIF
SPLPHNEQNILENMMNGFYVTYVNDLGEDVKVDWLDIKGISGDETGIILS
IGDNAIPIEEWNIGVDWEQASPELHDITVTDNDLQFVDLIGILKGDVNGS
WGDPINFNPPN
>Cag_0772 N-acetyl-gamma-glutamyl-phosphate reductase
MKKYSVSVIGASGYSGAELTRVLLHHPHVELHNLYAFSNAGSNVMDIYPH
LHCNKVYQAYTRDTESDIYFLALPHGEALQLVPALQAAGKMVIDLSGDFR
LHNTVEHQAFYKQEKSAGAVMTYGMPELFREAIVASRTISNPGCYATSII
LALAPLFATNATVPSVKAVNCTATSGISGAGRSSKTELSFSEMSENMRAY
KVGCHQHTPEIMQTLGTSATNPSFDFTFTPMIAPFVRGIYSVLNVQLTAS
VTKEEVETLYLNFYKSAPFVRLRTAMPEIRHVAHTNFCDLHIAHVGNNGS
LVIISAIDNLLKGAAGQAVQNMNMMLGIDEITGLV
>Cag_0973 transcriptional regulator, putative
MAILGKIFGSDSPKSSNTPQAFTIKVDPAKFALSTKPQGSIHIQLEALKQ
KLTKLSSNVENNLMLAIRASTKGNAELAASAFKFDEEYIRKGKFEVEYLV
LAYAHFQNLTGDDLQAIKYAHLILRDLERLAMISLNIADKADYVKFANVQ
ELHKDEYDLKPMGDITAEMIKKAVEAFVSGNSKHASETLVMMNEIQSIYD
KAVAKMTSSVNDGNIRNLTGILSIVEQTKSAADISCSMASNFC
>Cag_1520 delta-aminolevulinic acid dehydratase
MSQLNLLNIIQRPRRLRKSAAIRNLVQEHSLTVNDLVYPLFVCPGTNVVQ
EVGSMPGSFRYSIDNAVKECQELWDLGIQCIDLFGIPEVKNDEGSEAYND
NGIIQNAIRAIKAQVPDMCIMTDVALDPFTNFGHDGLVRDGIILNDETVE
VLCKMAVSHANAGADFVSPSDMMDGRIGAIREALDDEDFIDVGILSYAAK
YASSFYGPFRDALNSAPQFGDKTTYQMNPGNTDEAMKEIQIDIEEGADIV
MVKPGLAYLDIVRRTKERFDVPVAIYHVSGEYAMVKAAAARGWIDEQRVM
MESLLCMKRAGGDLIFTYYAKEAAKILR
>Cag_0242 Dihydrouridine synthase TIM-barrel protein nifR3
MRIGALSIERPIILAPMEDVTDRSFRQLCKRHGADIVYTEFISAEALRRG
AEKSLRKLKVDDMERPYAIQIFGSTVESMVEAAIIAASYQPDYLDINFGC
PVKKVAGKGAGAALLREPEKMAAISAAVVKAVAIPVTAKTRIGWDFDSIN
ILDIIPRLEDAGIQALALHGRTRSQMYKGTADWQWIRAAKEKAHIPLIAN
GDIWNAEDAARMFDVTAADGVMIGRGSIGNPFIFQQAKHFLAHGTLLPPP
DFRQRIAVAIEHFQLSLAYKGEKYGVLEMRRHYSTYLKGLPMVSRVRNKL
VREDNPNNIVELLLAYREECESYAREGRLSEGVEFLNDHSPKLEMREG
>Cag_0126 CBS
MDQLIKLRTLPVSALMQKDFHVIKGSCTVAEALQLMKQTRESGLIVEPRN
EDDCYGMVTEKDILEKVIDPGEDVHRDPWNTPVFQVMSKPVISVNPSLRI
KYALRMMKRTNVRRLTVMEGNKVIGVLNMADVLHAVEELPVHDEHVAL
>Cag_0852 branched-chain amino acid ABC transporter, permease protein
MTILQPLISGVLTGGVYALAGIGMSLVFGVMNISNFAHGDLMMLGMYLAY
FAFTLLGVDPFVSLVMIIPIAFVFGYLLEKGFIHRVMHHPHQNQILLTVG
LGLIMSNTALLAFTSDPKILTTSYSSSAITLNEEISVSVPLLISFLITAL
ITFTLYLFLSKTSTGMALRATSQNREAAQLMGINVAQMSGLAFGIGTALA
ATAGALIAPVYYIHPLAGHSFLLKAFTICVLGGLGSVVGAGVGGVIIGVV
ESLSAAYISSDWKDVVVFVLFLAVLLLRPQGLFGKKGGQ
>Cag_0041 outer membrane protein, putative
MKQNISFHKKISATSLALLLATSSMSYAVEPTSSPSTAFAAPSVTPLTPL
TLAQALQKMQAHYPALHAASEEVMAADARVRQSKSSFLPQVTANAGYLWR
DPVSEMSFGGGTPMQFMPHNNYHATVSAEAILFDFGKRSRELALAQSGTR
TAEEQVALSRREAAWQVVQLFYGILFLQEEQRVQQKEFQALNKALEFTTK
RYQAGTATSFDLATTKARLAALQSRMADSAHALERSEMHFCRLTEMNATQ
PLALQGSLMASVAPSSNQAQLTEQALKNRVETRLAREAEAAAGQRQALAS
KGGAPQLRGNVAYGVANGYQPDIDEIRTTLSAGVTLDVPIFSGFRTTARQ
QESAAALRAATQRRLDAEAQAATEVAELLNALQHNGEKLNATAMQAEQAS
LAASHARARYENGMATTLDLLDTEAALSQAELARLQAAYAVTLNRYALQR
ATGEVFW
>Cag_0101 conserved hypothetical protein
MKTAFVKIWGELVGAVAWDDATGYATFEYDAKFKSKGWELAPLQIPVNAT
KSNFSFPALRKKADPALDTFKGLPGLLADMLPDRYGNELINLWLAQKGRP
LDSMNPVETLCFIGTRGMGALEFEPTTLKESKKAFSLEIDSLVEITQKML
TKKEAFVTNLQENEEKAILEILRIGTSAGGARPKAVIAYNERTGEVRSGQ
TNAPQGFEHWLLKLDGVSEVQLGASHGYGRVEMAYYNMAVACGIQIMPSR
LLEENGRAHFMTKRFDREGGAAKHHIQTFCAMKHFDYNLVTNFSYEQLFQ
TMRELKLSYPDAEQLFRRMVFNVVARNCDDHTKNFAFRLKKDGKWELAPA
YDVCHAYQPKHQWVSQHALSINGKRTNITKDDLLTIGKSIKNKKAAETIE
EISNTISQWKTFADEVKVLPKLRDEIAATLIRL
>Cag_1800 electron transfer flavoprotein-ubiquinone oxidoreductase
MNLEPESVDFDILFIGAGPANVTAAIHLQRLLNRYNETATTALEPSIAII
EKGRYASAHLLSGALLDPKALEEFFPDYRNQGCPIEATVSKESIWYVSAK
RKFPLPFIPEQFSNKKSLMVSLSRLGAWLAEQAEAEGIQLFDSTAAVAPC
IESGRLTGVYTDDKGVDSNGAPKANYEQGLLLKSKVTIVGEGAAGSLTRQ
LAKHFPALHGKTLQRYETGVKETWSIPEGRLQAGEVYHTFGYPLEQEHYG
GGWVYAFSPTLISLGFVSSPNQSNPTCDPHENLQRYKLHPLIQPILAGGK
LLECGARTITSGGLDAMPQIYGDGFLLTGESAGMVDMQRHKGIHLAMKSG
MMAAETLFDCLIADDFSTAQLQRYEERFRASWAYQELYDARNYRKAFDNG
LYAGLLAAGLQVNIPGLSFATKVGKKIKRELPEADVCADGVLTFTKERSL
FNANIQHEENQPCHLHINQADIENICLKQCTSRYGNPCQYFCPAEVYEIV
TEPKFALKLNPSNCLHCKTCDVADPYGIITWTPPEGGGGPGYKVG
>Cag_0670 conserved hypothetical protein
MSSIKTLVKDWVPPIAIRLLQSFRRKGVIFEGDYVTWEEASAQCSGYDAK
NILDKVLDATLKVKRGEAVYERDSVLFDKIEYVWPVLTALMWIAAQSKGK
LDVLDFGGALGSSYFQNRTFLADLKQVRWSVVEQSHYVDAGKTHIADERL
QFYKTIEHAVSAGSPNIVLLSSVLQYLPNPLIILKQLTSLNADCLIIDRT
PFIKDKDLSVIKIQHVPSSIYEASYPCWFFNMDKFLEYIESLGYRKIEIF
QSLDRLSNEAIWQGIIFKKKNEL
>Cag_0770 Exodeoxyribonuclease V, alpha subunit
MITYNERPIDRHFAKMLLQHCGNSKHELLPLLFSMVSNAIGQGSVCLNLA
DIAAQSVTYGNRTVQLPPLAELMRLFSTLPVVSRNGAEFRPLVIDNVGRL
YLYRYWRYEHDLAEALRQKASTKSCTIEKKSEAVQVLLQQLFPEGSDAQQ
KQAAEVALHRRFCIISGGPGTGKTTTVVRIVALLLEQAGGERLRIALAAP
TGKAAARLKQSISTIRGTLSCSQTLQQAIPSEVVTIQRLLGAIPNSTRFR
YHQRNPLPYDVLIVDEASMVSLSLMHALLMALKPECRLILLGDRHQLASV
EEGAMLGDLCSAVGEATPHSPLAGTLVMLEKSYRFQTGGAIAELSRAMNQ
GEGEQALALLQSNQSAALRWQPLPTPDALPSALGRAAVAGYRAYCEATTP
AEAMERFERFRILAALREGIYGVSGLNRFVEQALAREGLLAPTSLWYAYR
PVLITVNDYNVRLFNGDTGLLLPDAENGGVSAWFTTPDGGLRRLPPERLP
AHETAFCMTIHKSQGSEFDNVLLILPPTDTPLLSRELLYTGVTRAKSRVE
VWGDPTFVQAACKRTTIRHSGFREALALE
>Cag_0725 drug resistance protein, putative
MKRSPLFILLLTVLLDLIGFGIVLPLLPTYTKSLGANPFMIGLIAAIFSI
MQFIFSPLWGKLSDKIGRRPVMLSSILLTSLSYLMFAQATTLPLLILARA
LAGVGSANVSAAQAYITDVTDAKGRSGAMGMMGAAFGIGFIVGPLLGGVL
MHNYGIAVVGYVASLLIAIDFILAIFFLPESNKAAIPFTHLLKGNENKGT
RHGKKQSLGTTLATKVQEYNEHFRTTFSSRPLALLMVANFIYTLAIVNMQ
TASILLWKEYFKASDEQIGYLFAYVGIWSVVVQGGLIGKLTKKVGEHNIF
LWGHLFTFFGVFFMPFLPSYSLFSMGLTVLFFFAIGTSLVAPINLSMISL
YSYNQQQGQIMGLSQSVNAFARILGPFSGSILYGMSFHAPYIVAGILTLV
GAVIALRLFRYRIEAHD
>Cag_0295 transcriptional regulator-like
MNPLLHHHFFNQNSNAVIFPCEKAVWYYLPQIRADLAIELVATGMTQSSA
AKKLGVTPAAISQYIHKKRGMQPNKSAEYLAQIKQAVAVICKGTAPADLQ
RLVCSCCHLLQKASDEHAEACGGGQD
>Cag_1553 Sulphate anion transporter
MFKPKLFSLLSELTPDRLMKDVIAGVMVGIVALPLAIAFAIASGVSPDKG
LITAVIGGFLVSFLGGSRVQIGGPTGAFIVILYGIIQQYGVNGLLIATFM
AGLILILMGIAQFGSLIKFIPYPVIVGFTSGIAVIIFSSQVKDFLGLSMD
KVPAEFLDKWIAYGTNLVTLDPASLAIGVLALLIILFWPKVTRKVPGSLI
ALIATTLLAQFFELRVDTIGSRFGDIPSTIPTPSLPHFDIATIRNLIMPA
TTIALLGAIEALLSAVVADGMIGGRHKSNVELIAQGIANLATPLFGGIPV
TGAIARTATNVKNGGRTPIAGIVHALTLLLIMLLFGQWAKLIPMPTLAAI
LIVVSWNMSEIHVFRQLLKSPRSDVMVLLTTFSLTVIFDLTLAIEIGMLL
AVILFMKKMAELSNVGIITKELNDQEEEDDPLAIVKRTVPHGVEVFEISG
PFFFGAASKFREAMRVVEKAPIVRILRMRNVLSIDATGLNMMRELYLECK
KTNTHLVLSGVHAQPLLALQQFGLYDEIGEENIFGHIDDALQRARQLLNT
HKG
>Cag_0950 K+-dependent Na+/Ca+ exchanger related-protein
MEYLLLLLGIACAAFGGELFIRGAVGIAYAARIKPAIIGVTIAAFATSSP
ELSIAINAGASQHSSIALGDTLGSNIVNVSFILAVALLISPISVSKSTTQ
RDFTLALFVPIVSAVLLLDNELSRFDGAILLTLFIGWLIAAIVEAKKDRK
KTLHQENVQTLRKYVLPTLIGLVMLFAAGNFIVTSATTIAKSFGVNDFLI
GATIVALGTSVPELASTIMAKIRGHNDMSLGTILGSNIFNGLLIVATAAL
LNPITITFQEVLPTIAFGIAVLLFSYPYRNNTITRLQGGLLLMLYFAYLG
VMIWTQT
>Cag_0079 8-amino-7-oxononanoate synthase
MSRAALYQRLATELEALQAADRFRRFPPVEARDGNYLITEGKRLLNLSSN
DYLGLADNRSLLEDYFQTIASSVYPMSSSSSRLLTGNHPLYDELEAALQH
IYQREAALVFNSGYHANIGILPALCGRHDLILSDRLNHASIVDGMKLSGA
TFQRYHHANYEHLETLLAQSARRYRHTLIVSESVFSMDGDCADLARLVAL
KELYGALLMVDEAHGAGVFGERGLGLCEAAGVVQEIDIIVGTFGKAFAST
GAYAVMNSLVRDYLINTMRPLIFTTALPPVVLAWSLRTLARQTAMSADRQ
HLLRLAATLREALQEQGATVVGESHIVPVITGSNSRAVVLAAKLRKAGFY
ALPVRPPTVPDNSARIRLSLRATLSWDDVALLPALFREYLGGKE
>Cag_0065 ATP synthase F0, C subunit
MEGLGLGFLGAGIGAGLAVIGAGLGIGNVAASAAEGVARQPEATADIRTT
MIIAAALIEGVALFGEVICVLLALK
>Cag_1591 conserved hypothetical protein
MQNKAIPHSPAEQVALLDSNECSLPRFHERLASEGVSQLQADGIEILQLN
IGYRCNLRCTHCHVNGSPERHELMSREVMEQCLVALDKSNATTVDITGGA
PEMNPHFRWFIGELRATKPDARILVRTNLTLLTDNKTYSDIPELLKAHRI
ALIASLPCTTKKTVDAVRGDGVFDRSIAALKLLNSIGYGTSDSALELNLV
FNPSGAFLPEAQQQLEHHYRTELQNKYGITFSHLFTITNMPVSRFLENLC
TNGTYCDYMKLLVDSFNPTSVKNVMCRTTLSVGWDGTLYDCDFNQMLRLP
VECSVPQHINAFDAEVLSKRHIVTNQHCYGCTAGAGSSCQGCLV
>Cag_1379 Peptidoglycan-binding LysM
MLWLPTALPLEAAEPARNNPNALRRSSISDVLDSLVNATYFKDEYFTAPS
REGGVSFPSTFVPQFSDSVYSSRIAALRRKTPMPLVYNAQVKGYIRMYAV
EKRSYTAKILGLTKIYFPLFEEKFDTYNVPLEMKYLAIVESALNPTAVSP
AGAKGLWQFMYGTGKMYGLESSSFIEDRYDPYKSSVAAARHLRDLYQIYG
DWFLALAAYNAGPGNVNKAIRRAGGVKDYWAIWDYLPAETRGYVPAFIAV
HYIMSYHNEHNIRPLEPAYLYRDIDTLRTSRMVTFEQISETLGISASDLE
FLNPQYKIGVIPASTGNGNVIRLPRRYVAQFQRREQEIYAYRSARTMERE
ALYARLESVRAGAGEQSSESSKGMGNQKIHIVQRGETLGSVARLYRTYIS
QLIAWNNLVDADIMVGQRLVVFGGEDNSPVAAPEPPKSTVPPKAPPIERQ
PTAAPEVRAAAPPKRIAVTRSTQTVTRDELVALTETPTVATDNTSAKAEP
IFHVVEPGQTLFAIATQRKVTVNQLMLWNNLKSVQIKAGQKLIVSSDGQS
GRDNSQ
>Cag_1249 Nitrogenase MoFe cofactor biosynthesis protein NifE
MDSIGILEARQKQVIEKKAGESQPEIACDTTSLSGSVSQRACVFCGSRVV
LYPVADAIHLVHGPIGCAAYTWDIRGAVSSGPELHRLSFSTDLHEMEVIY
GGEKKLYSSLKELIAQYQPKAAFIYSTCIVGLIGDDIDAVCKKVEKETGI
PVLPVHSEGFKGTKKDGYKAACFSLMKLIGTGSTEGISKYSINILGEFNL
AGEAWMIRQYYENMGVEVVATLTGDGRIDAIRRAHGASLNVVQCSGSMTW
LAKEMEAKYGIPFIRVSYFGIEDMSKSLYDVARHFEDRPEIMEATKKIVS
DEVTKLYPSLQKFKKALQGKKAAIYVGGAFKTFSLIKALRSIGMSVVLAG
SQTGNKDDYNRLKEMCDEGTIIVDDSNPVELSKFILEKEADLLIGGVKER
PIAFKLGVAFCDHNHERKIPLAGFEGMYNFALEVYQSVMSPVWQFAPRKG
GAL
>Cag_0413 Molybdenum ABC transporter, periplasmic binding protein
MKKLTKSLALLCFVLVALTKPLMAGELRLSVAASLKEVMNEITAKYSAKH
PSTTFVKNFGGSGTLAGQIEQGAPVDLFIAANTEWMDHVKDKKLVDATTI
KNFAYNELVFAGSTNKKVTSMNDLLSLDKIAIGSPKSVPAGEYAMTAFTN
SGIAEKIASKLVMAKDVRECLMYAEIGEVDGAFVYRTDAMLAQKAKLRLV
VPQKLYPRVVYPMALTMQASKNREAQAFISYLNSNEAKSVLRKYSFLLK
>Cag_1937 hypothetical protein
MQRKAWFVVLVSLLIGLCANKGFAESPTQPSTLVATRFSTNEIAFTTGYG
YSLRRKAFEEHNFSIYPFAVRYGWNLNRPLGLSGASSALYATVEPFVNMI
VGKEQGREVGCGVGVRYRRAVSQHANFFAEGSVAPMELTINTPEQGAAGF
NFLDQFGVGLQHEVGQRTHLFVGYRFRHISHAGLIDRSNGGINSHGVMIG
ISLIQ
>Cag_2025 Protein of unknown function DUF205
MLTLLAILVVSYIVGSIPTSLIAGKMLKSIDIRDFGSGNAGGTNAFRVLG
WKTGLTVTLIDIIKGVVAAVSVVAFFRHHPIDVFPDINEVALRLLAGMSA
VIGHVFTVFAGFRGGKGVSTAAGMLIGIAPVSMLMVIGVFLLTVYISRHV
SVASILAAIAFPLIIAIRKYLFELGAGLDYYIKLFNAKFFFHDSLDYHLI
IFGLIVAIAIIYTHRANIKRLLAGTENRISFGKH
>Cag_0249 acetyltransferase, CysE/LacA/LpxA/NodL family
MLMGKILPYKGIVPQLHESVFLTDGAFVIGDVHIGANSSVWFNAVVRGDV
CPIRIGEKTNVQDNVTLHVTHDTGPLTIGNCVTIGHGAVLHACTVQDHVL
IGMGAVLLDDCVVEPWSVVAAGSLVKQGFRVPSGMLVAGVPAKVMRPITE
AERQTITESPENYVRYVQNYRAEDAQG
>Cag_1463 NusA antitermination factor
MAKKKIKDEGQERRLQIASAFGEIEQSKMFLDKRTESAAVKMDIADLLKD
IIQKQLRKDYDPEVEANIFINPERGDFEVYILRKIVEDVDIPTIEISLED
IRKIDDSLEVGDYYEEGPIKLDDYLSRKSIQIIKQSVQKKVRDLERLIVY
EECLEKVGEVIAGEVYQVRSNEVIFTYNTSKDHRVELVLPKSEMMKKDNP
RRMPRMKLYVKRIEREKNKIRMDDGTVVEREKPDGGMKVIVSRVDDRFLY
KLFEHEVPEILDGLIIIKAVARVPGERAKVAVESTSARIDPVGATVGYRG
KRIQSIVKELNNENIDVIYYADEPQIYISRAMQPAKIDPMTVHVDIKTRK
ARVMLKPDQIKYAIGKNGNNIHLAEKLTGYEIDVYRDVMDKTTEDPTDID
IIEFREEFGDDMIYQLLDAGLDTAKKVLRAGLEEIELALLGPAKVDEPQL
FHKAPKGTIPMVRERKITDEERRYWHKIADNIYRTVREQFSDDDFDELWD
EELLPSEDAPLDSATEVEPQQDGEGED
>Cag_0055 N-acetylglucosaminyltransferase, MurG
MKVLFAGGGTGGHLYPAIAMAAELQRLSPNVSVAFVGTKSGIEATEVPRL
GYKLHLVSVRGLKRGFSPKLLIENLRILFDFARSLGITIQLLRSEAPDVV
VGTGGFVSAPLLFAAQLLGKKTLIQEQNAFPGVTTRLLSLFASEVHVAFN
EARRFLLNKRHVFLTGNPARLFQPMDAAQARARFGLQHNRPTLLVFGGSR
GARSLNNAIASQLDTLLASVNLIWQTGSLDGEKLKAEVKPSPYLWMAPYI
EDMEAAYSAADVVVCRAGASSLAELTNLGKVALLVPYPYATADHQRHNAQ
SLVEHGAALMVADSEAFTKLVPTALELLQNSGKRAAMSVAAAKQAHPDAA
GVLAKRVLGLSR
>Cag_0088 DNA gyrase, subunit A
MQRERIVPISIEEEMRGSYLDYSMSVIVSRALPDVRDGLKPVHRRVLFGM
HELGLQAGKPHKKSARVVGEVLGKFHPHGDTAVYDSLVRLVQDFSLRYPL
IDGQGNFGSVDGDSPAAMRYTEVRMKSIAGEMLKDLEKETVDFALNFDDS
LEEPTVLPSAIPNLLVNGASGIAVGMATNLAPHNLREVVNGIIALIEQPE
IEIQELMKHVIAPDFPTGGIIYGYEGVRQAYLTGRGKVVIRARALVEVTQ
KNGRESIIVTELPYQVNKVRLIEKIVELVHDKKVEGIADIRDESDREGMR
LVIELKRDAVAKVVLNNLYKHTPMQDTFGVINLALVDGVPKILNLKEMMQ
YYVKHRNEIVLRRTRFDLAAAERRAHILEGLKICLDNLDEVISTIRQSPD
TATAQERLIERFGLSEIQAKAILEMRLQRLTGMERQKIDTEYIEVLALIE
ELRFILNSPEKQMEIIREELLKVKDVYGDERRTEIVPQEGDFSIEDMIAQ
EDVVITITHDGFIKRFPVSGYRRQARGGKGVTGAQAKNDDFIEHMFIAST
HNYILFFTTSGRCYWLKVYEIPEAGRAARGRSLANIMELPPGEKIRTYIN
IRNFEEPGFIVMATTHGIVKKTALEEFSHPRRTGIAAITIDEGDELLDAR
LTDGDHQIILAKNSGFVVRFPENEVRPMGRTAMGVKGITLDEDEKCIAMV
TTRRMDTALLAVTDNGFGKRSRVEDYRLTRRGARGVITLKPHEKIGALVG
LLDVNDEDDLILITVNGIVNRQHVSDIRITGRNTSGVRLIRLMQGDSISA
LARVPKSDEEGDGDFPLEDADGQIPLFE
>Cag_1746 MesJ protein
MTHVNAVIVQYFFTGSLLTINLTKQTNNLMTPLEKQFLEQLTKRRLVQRG
DKVLLAVSGGPDSMALLSLFCAVQPFLQCSLALAHCNFQLRDSESDSDEA
FVQEQATARNLPCFVERFETRNVAATWKKSLEESARLLRYAFFERLMQVH
GFHKVATGHHVNDNAETILFNLVRGTSLLGLRGIRAQHGTIIRPMLLLHK
PAITTYLHEQQIPYRVDSSNEGTEYDRNFIRHRLIPLIEERFPRKLLPSL
QRLSEQAGELEEFLELYFEQLTQREPLLQLHNNQLNVKALRQLTTFEQKE
IIKRALWELGAPVDGQTLQRMVELLQTQSGRMVQLGRKWQVEWKGNSLYF
SHQP
>Cag_1574 conserved hypothetical protein
MNRYLYDGTADGLLSAISWILEEEQEPEQVVLAEREDTLFEEGIFLNTDV
ARSEALFSRFRKQLPDVAQTLYFFMLAESNGMATNLLHYMALALQYGDRV
NGYLTHPAVKAIVHLSRKASRELHRMKGLLRFEQLCDGAYLAQMSPDHNI
LHPLSHHFRHRLKAEHWFIVDRKRHTAAHWHQGSLEFGTIEQFNVPALSE
QEQKVQTMWRTFFATIAIQERKNPALQRSNMPMKYWKYLTEKQ
>Cag_1837 30S ribosomal protein S8
MPVTDSIADYITRIRNAGRAKNSTTDIPYSKLKENISQLLLEKGYIKNFT
VITTEKFPFLRIDLKYMQSGEPAIKELTRVSKPGRRVYDGKDIKKYLGGL
GLYVLSSSKGVITDKEARAQGVGGEILFRIY
>Cag_0392 Glutamyl-tRNA synthetase
MVGQRVRTRFAPSPTGYLHVGGLRTALYNFLFAKKMKGDFIIRIEDTDQS
RKVEGAQQNLIKTLEWAGIIADESPQHGGSCGPYIQSERLDIYAKHCKQL
LEDHHAYYCFATPEELEENRQLQLKQGLQPKYNRKWLPEEMGGTMPESLI
RQKLEAGEPYVIRMKVPDYISVWFEDIIRGPIEFDSATIDDQVLMKSDGF
PTYHFASVIDDHLMEISHIIRGEEWLSSMPKHLLLYEFFGWEAPKVAHLP
LLLNPDRSKLSKRQGDVAVEDYIRKGYSSEAIVNFVALLGWNEGEGCEQE
VYSLQELTDKFSLERVGKAGSIFTLDKLKWLEKQYIKNRSVEKLLSVVKP
LLLEELEKKPSIMAREQITSDSYLSSVIELMRERVGFEHEFVTFSTYFYF
EPESYDEESVKKRWQPNTNELLADFIPQLEALSDFTAENIEAALKAFVEP
KGMKNAVLIHPLRIVTSGVGFGPSLYHMLEVLGKETVLRRIRKGLECITM
PA
>Cag_1476 glycosyl transferase
MVHAAKGLADKGHTVVLASKKHSKIIDYAHSKGVKTTVFEIRGDVSPITT
LKIAHFLKKHAIDVLICNLNKDVRVAGLAARTVNTPVVLARHGMLLCGNK
WKHKVTLTQLVDGIITNSKTIKEAYQNYGWFDENFVKVIYNGLSIPENIQ
THDFSKQFPNKKIIYSAGRLAEQKGFTYLIEVAAQLQKERNDLIFVVSGE
GKLEETLKQEVNNAGLSDSFYFLGFTADIYPYLKGCDLFVLASLFEGMPN
VVMEAMAMKKPVIATDVNGARELMIDGETGIIVPPREPKNMADAIRKIID
NSDALIEMGQKGYERVTSTFTTQAMADALEHHLLEKLAEKKSYKTT
>Cag_0324 Chlorophyllide reductase subunit Y
MQKKECELLHPQSMCPAFGGLRVLTRIDGVQVCLVADQGCLYGLTFVTHF
YAARKSLVSPELMNVQISGGTMIDDVRQTITEIASDPSVRFIPVVSTCVA
ETAGIAEELLPRRVGNAEVLLVRLPAFQIRTHPEAKDVAVATLLQRFGAF
GEQRKPKSLLVLGEIFPVDAMAIGAILQKIGVEAVIVLPGTALEDYIEAG
QVEACAVLHPFYERSVALLEKHGVKIITGNPIGANATAQWIRRIGEVLEL
DSATVNAIADEESQKAREVLGRFSHLEGKVMVAGYEGNELPVVRLLLEAG
LDVPYASTSIARTALGEEDHNVLSMLGTEIRYRKYLEEDMEAVRHYQPDL
VIGTTSLDSFAKEQGIPAIYYTNNISARPLFFATGAATVLGMISGLLARK
DAFRAMKEYFEG
>Cag_1923 Twin-arginine translocation pathway signal
MGMTRRQFFSTIAGAAASAALVAALPERLLAAWSDKAFTASTLANAIAGK
YGNLPIEDSTAIQVKAPEIAENGAVVPITVATNIAGATNISIFTEANFAP
MVASFDLLPRSLPEVSLRMRMAKTATVVVLVQAGNKLYRATREVKVTIGG
CGG
>Cag_1433 possible NtrR protein
MGMQNNRYMLDTNIASHIIKGDIPVVRERLIALPMEAIMVSSVTKGELMY
GLAKRDYPKVLTQKVNEFLLRIQVLAWDQDVAVVYGKFRSACETIGVTLS
PLDLMIAAHAHASNAILVTADKAFSRVPNLVIENWASEPS
>Cag_1964 molybdenum enzyme related to thiosulfate reductase and polysulfide reductase, large subunit
MSYTHKPTVIEAIAEKLHLIPNLHEGDGAAPLPRIAAEGSEVSCPPPDQW
DNWVEYDAKSWPERKSNEYMLIPTACFNCEAACGLLAYVDKQTMQVRKLV
GNPYHPASRGRNCAKGPATLNQLEDTDRILYPMKRVGKRGEGKWERVSWD
SVLDDIAARMRKALLEGRNNEISYHVGRPGHTGFMDWVLPAWNVDGHNSH
TNVCSSGARFGYAIWEGYDRPSPDYTNTKFMLLVSAHLESGHYFNPHAQR
IMEAKMKGAKLAVLDPRLSNTASMADYWLTTFPGSEAAVLLALAKVLIDE
GLYNRDYLENWVNWQEYLQKEYPKEPVTFERFIEALKAEYRDYTPEFAEQ
ESGVKAATIVEIARNIGEAGTQFATHVWRSACAGNLGGWAVSRTLHFLNV
LTGSVGTPGGTSPSSWNKFHVHVHAEPKPQTFWNPLHMPNEYPLAHFEVS
MLLPHFLKEGRGKLDVYFTRVFNPVWTYPDGFSWIEALEDESKIGLHAAL
TPTWSETAYFADYVLPMGHSTERHDIISYETHAAQWIGFRQPILRVASEK
MGKPVTFTYEANPGEVWEEDEFWIELSWRIDPDGSLGIRQHFMSPYRPGE
KITISEYYRYIFEHTPGLPEKAAEEGLTALEYMQKYGAFEVETNIYNYNE
RPLSPDDRQGATVDAESGLISKNGKAVGVKINGMECAGFPTPSRKQEFYS
QTMVDWKWPEYRKPTYVKSHVHHEQMNLSNGEFALVPTFRLPVLIHSRSG
NAKWLAEIAHRNPLWMNAADARMLGFEDGDLVRVNTDIGYSVNRIWATEG
IRNGVVACSHHIGRWRRSQDPEANRWATNRVAIKREGGSTWRMRVEESIE
PYESSDPDSSRIFWSDGGVHQNITFPVHPDPISGMHCWHQKVRVEKAHDG
DQYGDIVVDTNRSHEIYREWLAMTRPAPGPNGLRRPLWLGRPYRPDEKTY
YI
>Cag_0595 conserved hypothetical protein
MLKVRRSKLYTLWFGWYSHRQFRRFFNSVQVFMPSEVRNMDKRIPVIFYA
NHAYWWDGFWSQLCTEEFFNQRLHIIIELQQLQKHQFFTRIGAFSLDQSH
PRTWGETIRYAAEVLTEPHHQQNALWIFPQGKIEHVDKRPLHFFSGTSAI
ARQVLEKSEQLYLVSIVSRIEYLGEQKPELFLSFKHHHLLAPHNFSGTHE
LTAYMQTVTEQHCDELRERIMQRQLEGAITLIQGSASINRKVEEFQRFFC
FGKEKK
>Cag_1009 CRISPR-associated protein, CT1134
MKNSIEFKVYGREAMFTDPVSRIGGEKCSYHIPTYEALKGIVKSIYWKPT
IIWVVDKVRVMKPIRTKTKCLKPLKYHEKGNSLAIYNYLCDVEYQVLAHF
EWNLHRPELAQDCINGKHYSVALRALDKGGRQDIFLGTRECQGYVEPCKF
GEGKGEYDNISELAFGFTFHGFDYPDETGQKMFRSRFWNPKMVKGVIDFI
RPEECLPEHCKPIHEMSMKPFGIDTNQRSVIKEEEVWQ
>Cag_1050 hypothetical protein
MTIAELQEQPLAERLMLMEELWETLCNEKHHIQSPAWHQEILEERINLIN
SGEAEYLSIEELYVLPKPREIRKGFPSLNYSRATFLSFPRRRESRIV
>Cag_1301 Excinuclease ABC, A subunit
MNAHGQLTDTSLPDIVLKGINTHNLRNISVRIPRNKFIVITGVSGSGKSS
LAFDTLYAEGHRRYVESLSAYVRQFLERMPRPDIEHVEGIAPAIAIEQKA
LPKNPRSTVGTVSEIYDYLRLLYARIGKIYSRDTNELVLKHTPDDVSLQA
GFIEDGKKFYVGFFFPHHHTAQQLDCSPEEEIANLLKKGFFRLLAGDELL
DLNQEADYQKVLDMPAKVRAELLVVVDRFVARNNDKLFSRISQAAESSFM
ESGGHAVLKVVDGKTYRFSDRLELHDIEYQEPSPQLFAFNSPIGACTTCQ
GFGRIMGIDEDAVIPDKSLSIEEGAIACWNSEKYRWNLLELMHYAPKFGV
PLREPYEKLTFEQKEIIWKGTPDGSFNGIRAFFAEIEKDAGYKMHYRVFL
SRYRGYAICPDCEGSRLNPDALQVKISGRHIGEVTRMSIGEVAEFFRNLN
ISPFDRSVAEVILQEINRRLGYLLDVGLDYLTLDRLTHTLSGGEFQRINL
STSLGSPLVGTMYILDEPSIGLHQSDSARLIALLRKLRDLGNTVVVVEHD
REIIEAADEVIDLGPFAGRLGGEVVFQGSMEAMRSSGTSLTAQYMNGEQQ
IEVPQQRRTVDFSACITISGAMQNNLKNIDVQIPLKVMTCITGVSGSGKS
TLINDILCKGILREKHGSRGTVGTHRSLTGAWLIDRIEHVDQSPIGKSSR
SNPVTYMKIFDDIRTLFANTPDARKKKVKAGYFSFNIPGGRCEVCSGEGS
VHIEMQFLADIEAVCEACNGLRYQPEALAIKFNGKSIAEVLDMTVSEALS
FFKGEKNIVKKLSVLDQVGLGYIRLGQSSSTFSGGEAQRLKLATFIAHAD
TTHTLFVFDEPTTGLHFEDIKKLILCFEKLLEQNNSLIIIEHNLDIIKQA
DWVIDLGPGAGDKGGHLVEQGTPEEVAQCTESLTGQYLRGVV
>Cag_1087 SpoU rRNA methylase family protein
MMSREVEKKFRKLHGSEMARLLPEEYAASPCHPIVLMLHNIRSMWNVGSM
FRTADSAGIERIILSGYTATPPRKEISKTALGADESVPWSYTEDAYKTVR
KLQAEGVKVCALEIAEGSRAHSSICMADFPICLLVGNEVDGLDDELLQAC
DAVLEIAQYGTKHSLNVAVAAGVALFELVRVLRA
>Cag_0955 Rhodanese-like
MKKRVASLALAVALTALPSPSYAIDANKLPIGKRTSSGKYLSAKEAYKMK
IKDPYSVLFIDVRTPAENEFVGIADEVDKVIPLKLNNHAVWDHKKERYGQ
YLNPNFVSAIDELVLLKEKDKATPIILMCRSGDRSAEGANLLAKNGYTNV
YSIYDGFEGDLETEGSKKGRREVNGWKNSKLPWGYSLKQDRAYLVVPPAA
PTTLPAAPASVVAPTVQPPAAPAVPTTVAPAPQVVQPPAPVAAPVSKSQT
TVPTPTATPAPAPVTAAAPVQ
>Cag_2021 ribosomal protein L11 methyltransferase, putative
MPHIGKQPYIEIAFTIEAEHHEMCIALLASEGVESFLEEDQRLLAYVPQS
AWSEEKKQAITIALDNWLGQVPPMHVNVIADRNWNEEWEATLEPIEISDR
LLIVQNNKQITPKEGQMVISINPKMSFGTGYHATTRLMLRQMEELALADK
RIMDIGTGTGVLAIAARKMGNQYPIFAFDNNTWSVENAVENCEVNQTADI
AIHLLDAEEELVSELRNGYNLILANINKNVLDRILPVLRSAAPTATVLLS
GVLTYDEAWLKKLCKKLGYDIAKTLYEDEWLSALLVGK
>Cag_0509 conserved hypothetical protein
MVIKRAQKALLTERLENEPRNFIQVLYGPRQVGKTTIAQQFMQTTSLPVH
FVSADYVAVEQSHWISQQWETARMKLRQSEQQQAVLIIDEIQKINNWSEV
VKKEWDSDTANQLSLKVVLLGSSRLLLQQGLTESLAGRFETLYVGHWSYS
EMREAFNVTPEEFVWFGGYPGAASLIYDEERWQRYITDSLIETSISKDIL
MLTRVDKPALMKRLFELGCSYSGQILSYTKILGQLQDAGNTTTLAHYLRL
LDSAGLLGALEKYSIETVRRRASIPKFQVHNSALLSAQQPLAFRDVVSNP
ALWGRWVESAIGSHLLNYTRTHNLELYYWREGNHEVDFVLVHKGRAIGLE
IKSEHSQQTAGMGAFAKQCKPYKVLLVGDSGIAWQEFLTLNPLELF
>Cag_0908 Prevent-host-death protein
MAVEQVTSLTDFRNNPDYYFEELSKNQQPLLLTRRNKSSAVLLDATLYQT
LLEQIAFMKSVADGLEDVRHNRLCSMDEVFDSVERIIVEAEKQ
>Cag_0145 DNA mismatch repair protein
MPIITRLPDSVANKISAGEVVQRPASVVKELLENAIDAGATKISVTIKDA
GKELIRIADNGVGMNRDDALLCVERFATSKIKSADDLDALHTLGFRGEAL
ASICSVSHFELKTRQADATLGLLFRYDGGSLVEELEVQAEQGTSFSVRNL
FYNVPARRKFLKSNATEYHHLFEIVKSFTLAYPEIEWRMVNDDEELFNFK
NNDVLERLNFYYGDDFASSLIEVAEQNDYLPIHGYLGKPALQKKRKLEQY
FFINRRLVQNRMLLQAVQQAYGDLLVERQTPFVLLFLTIDPSRIDVNVHP
AKLEIRFDDERQVRSMFYPVIKRAVQLHDFSTNISVIEPFASASEPFVGS
SSQPIFSSTSSQAPRMGGGSRRFDLSDAPERAITKNELYRNYREGAFSSP
SVASYDAPSPLQQGGLFALASAEESLFGAQAVHEASENIEAFQLSPLDNI
VEHKEVEPKIWQLHNKYLICQIKTGLMIIDQHVAHERVLYERALEVMQQN
VPNAQQLLFPQKVEFRAWEYEVFEEIRDDLYRLGFNVRLFGNRTVMIEGV
PQDVKSGSEVTILQDMITQYQENATKLKLERRDNLAKSYSCRNAIMTGQK
LSMEEMRSLIDNLFATREPYTCPHGRPIIIKLSLDQLDKMFGRK
>Cag_1250 nitrogenase iron-molybdenum cofactor biosynthesis protein NifN, putative
MKHEHAKSVTQNACKLCNPLGACLAFRGIEQCVPFLHGSQGCATYIRRYL
ISHYKEPIDIASSNFNEETAVFGGSHNLKVGLKNVSQQYKPQVIGIATTC
LSETIGDDVPRILREYQKEFKNGTPMPLLIHASTPSYQGSHIDGFHAAVH
AAIKTLATKGQKQEQINLFPNMVSPADLRHLKEIFADFEIPLMMLPDYSQ
TMDGGPWAEYHRIPPGGTPATAIADSANSRASIEFGSTIEANKSAAHYLD
VMFGIPAYRMALPIGIKASDRFFSLLETLSEKGRPEKYDDERRRLVDAYA
DGHKYVFEKKVILYGEEDLVVAITAFLREIGMIPVLCASGGKSGMLKERI
AEIVPDMEELGIKVRDGVDFVDIEDEAKVLHPDLLMGNSKGFTMSRKNEI
PLLRLGFPIHDRFGGQRMHHLGYRGTLELFDRIVNMIIETRQNASPIGYT
YM
>Cag_1831 SecY protein
MKLTDSIKNINKIPELRQRILYTLLLLFIYRIGSHITIPGVDALAVSTAS
QSHANDLFGLFDLFVGGAFARASIFSLGIMPYISASIIVQLLGAVTPYFQ
KLQKEGEDGRQKINQLTRMGTVLIAILQAWGVSVSLASPASFGKVIVPDP
GFLFIVTTILILTASTMFVMWLGERITERGIGNGISLIIMIGILARFPQA
VVAEFQSVSLGSKNWIIEIIILALMGAIVASVIFLTVGTRRIPVQHAKRV
VGRKVYGGNTQYIPMRINTAGVMPIIFAQSIMFLPATFLSFFPENEMMQS
IAGAFAYDSWWYALIFGAMIVFFTYFYTALAFNPKDVADTMRRQGGFIPG
VRPGKSTAEFIENILTRITLPGAISLALIAVLPTFLTKFANVTPGFAQFF
GGTSLLIIVGVGLDTLQQVESHLLMRHYDGFMKSGKTRGRQGR
>Cag_1101 putative transcriptional regulator
MDSRLRGNDRNIKHTTMNEEFIKELISKGETSSTQFKLNISNELSIAQEM
VAFANTKGGRILIGVDDKTWEVIGLTDNDIRRLTNLLVNASSEHIKPPLF
IETETFIIDDKKIIVVVVPEGSDKPYKDKDGIIFLKNGANKRKVTNNEEI
LRLLSKGKHLFADELPVNQATIEDINKDKFDKFFLREFNAEYEALGLSYQ
EALKAKRVLKEGKITLAGFLFFGKMPQNIKPAFCIKCVAFYGNSLGGTEY
RDSRDINGTIPILFDEGMAFFKRNLLHTQQGQNFNSQGILEISIIALQEL
LENALIHRDYIKNSPIRLLVFDNRIEIISPGCLPNSLVVEELRYGNPVVR
NNLMVSYALHTMPYRGLGSGLKRAFEQQPNIELINDTEGEQFKVIIPRPE
KR
>Cag_0934 ATPase
MSYQVIARKYRPAKFSDITAQEHVTRTIQNALRSGRIGHGYIFSGLRGVG
KTTAARIFARALNCQKLIDDADYLQQVTEPCGECESCRDFDAGTSMNISE
FDAASNNGVDDIRTLRENVRYGPQKGRYRVYIIDEVHMLSIAAFNAFLKT
LEEPPPHAIFIFATTELHKIPPTISSRCQRFNFKRIPLEAIQQQLQQICE
AEHIQVEADALQLVARKAQGSMRDAQSILDQVIAFSSENALEGSITYRGV
ADLLNYIDDDTMFAVTDAVLANNPVAMLEVAHFVLKNGYDEQDFLEKLLE
HLRNLLVVLNLSSTRLVERPDAVRERYQRDAAKFSPHTIMQMAELLLQTQ
KELKFLFEYQFRFELALLKLLEIAHPPASAAALTIAPEKKKPLSNQ
>Cag_0421 6-pyruvoyl tetrahydrobiopterin synthase, putative
MRISRKIEIDYGHTLPNSFSFCNQLHGHRGVVVATVEGNIISQEQHSQQG
MVMDFKFLKDIMVEQIHDKLDHGFAVWKHDTEDLEFITRRNTRVLITDEP
PTAECLARWAFHQMQPHLPAGVTLQRIRWYETPNNWADYEGE
>Cag_1536 hypothetical protein
MLRGVELVQFLWIHEFFVMSASYNESVMPSQGINGLAMLTPIIGFPIFFH
ALSGMVVAGIGVTAYNNVVAPLAGKLVEFTQDTLPQLLPPLTPSILSIIP
AQEVPVVIPITIKAKETSLLEA
>Cag_1147 hypothetical protein
MPKEEKEQQSPIEQPKNPPINLPIEPLIAAPAVLGAPAMLGVPLVIHALA
GAAIGAFTFVTGSLLLKMTKDKALAAKPDPNEVIPPPMSVPNFPRHYSRN
DPHIDSPLLSSLRK
>Cag_0379 conserved hypothetical protein
MHNILTKQYGSFLQEIKDKIRNAQYEAAKAVNTTIITLYWEIGCQLSEKL
KTGWGKAVIQTLADDIQKEFPGIKGFSTTNLWYMKQFFEEYSASEFLQPL
VGEISWTKHLLIMSKCKDLQERQFYIVATKKYGWTKNVLNHKIELKSFEK
FALGQSNFEQTLPSSIKNQAILALKDEYNWNFSELDDEHSERELEEAIIK
NIRAFLMDFGPDFSFIGNQYRIQLEEKEYFIDLMLYHRAMQCLIAIELKT
GEFLPEYKGKMEFYLNVLNNTVKLPHENPTIGIIICKSKSRTIVEYALKE
CAKPIGVATYTLTDTLPENYRKQLPSAEELAIKLEAFIEMTKKK
>Cag_0712 Queuine tRNA-ribosyltransferase
MKFTLVAEDARSAARCGIVHTAHGDIPTPVFMPVGTRASVKSVEPHELDE
LQVPIILGNTYHLYLKPGNAIMHKAGGLHRFMNWQKPILTDSGGYQVYSL
SELRKISEEGVIFKSHLDGSRLQFTPENVVDTERIIGSDIMMPLDECPPA
TAERSYVEASGELTIRWAERARQHFSKTAPLYGHEQYLFGITQGGIHADL
RTRSTKALVELDFDGYAVGGMAVGEEASEMYRILELSHPLLPIHKPRYLM
GVGTPANILNAIERGIDMFDCVIPTREGRNGRVYTRNGTINLRSARYADD
FSSIDEGFENEVCRNYSRAYIRHLLNVGEILGLKLCTLQNISFFTWLTKT
AQQEIQNGSFVDWKEDFLSRFTHGEKR
>Cag_0178 Membrane proteins related to metalloendopeptidases-like
MALCSQQDFPIFAKIKSIAYHYLFSLSMNKRVVIPAKAGIHTCNKCNGLP
RRLRLLNIPLHWRGGRRSLTGWLTKSAMDCFTSFAMTVWGTSVVTERRRL
LSIMVALGSIVPIHKAATLYGAEKQNLQAALLPEQAGQMIEELVFALEPE
EGAATEKGELTDNVAPTSSLFASIPNIKPVYGTLSSLFGMRMHPIYNMPL
FHSGIDIAAPIGTKVHATGDGIVAFVGNSKGYGQKITINHGYGYKTIYAH
LSKMVVQQGDNVRRGDTIGLSGNSGTSTGAHLHYEVLRYNQRLDPSAFYF
EEHGARKFTAIQSKQSPKDNS
>Cag_1485 conserved hypothetical protein
MQVTYIVLDDHNPLHRELSIYRTGIIQRICMDDAAYKTYGSLEVDGHNYA
ACFHYGLVESLNRLPFLSESGSGLESGEEALLHRSRLAEFLCIVKEALAT
LDNTHRETILVGWQQEPVAIAYLRALDAERFATFLISLLHFVEESELQQY
DLEFLW
>Cag_1997 TPR repeat
MWGTFLFCAVMNFSLSHLARTIALLLLTASPAFAESADELFNRGFALHMQ
GKLQEAVSCYSDAIDEVPTFAMAFQMRALAYQQLKKFPKAVNDYSSAIEQ
GDASFKVVGYYNRGVVKNIMGDFVGAVDDFSQAIVLNKKMATAFFHRGIA
RHQLGDNDGRFEDFRQAALLGDRTAEQWLNTYHPNWKPVPPSPPSIPPSI
PSSLPATAPPIQPSSPPTAPTDAPKPASVQPAPSEPNDSTRTSATTPA
>Cag_0027 SecF protein
MRIFHKTNFNFLAARKVAYIISLVLLLVGIGSLALRGLNYGIDFRGGSEV
VIRFEKDIDVSHIRSVLDAAGVSGTLKQYGMDRSFLFSTVFQGDSGELKT
LLENALNDRITSNKHEIVRIDAVGPSIATDLKWSALKALAGALFAILLYV
GFRFEVKFAAAGVVAIFHDIIVVLGLFSLLGGVFPFMPLEMDQSIIAAFL
TIAGYSITDTVVVYDRIRERIRNQKPSEYERIFNESMNQTLSRTVITSGT
VLIAVLVLFLFAGPAIRGFAFAVFSGILIGTYSSIFVAAPLVFDWLKRTN
STVQLRGSQK
>Cag_1417 Protein of unknown function UPF0047
MKILTHTLTIPTCKPIELIEVTDQIKDLLMASELQQGQVTIISRHTTAFV
NINEYEERLLEDMEIFLKRLVPKDGNYLHNISPLDGRHNAHSHLMGLFMN
SSESIPFADGKLLLGQWQSIFFVELDGPRPKRELLMQIAGV
>Cag_0100 tRNA modification GTPase TrmE
MRNSLSFQDEPIVALATPLGVGALAVVRMSGQGVFDIARKVFHKQGAPDF
HLASSKGFQAHFGTIHDAQGVVDEVIALVFRSPRSFTMEDMVEFSCHGGP
VVVQHLLKALIDAGCRLAEPGEFTRRAFLNGRIDLLQAEAIGEMIHARSE
SAFRTAVTQMQGRLSRQLEEMREKLLHSCALLELELDFSEEDVEFQNREE
LREDVQRLQGEINRLLDSYQHGRLLKEGVATVLVGSPNAGKSTLLNALLG
EERSIVSHQPGTTRDYIEEPLLLGSTLFRLIDTAGLREGEEEVEHEGIRR
SYRKIAEADVVLYLLDVSHPDYCNELSDITSLLEQASPNVQLLLVANKCD
AITNPTERLAQLQAAMPQATVCGIAAKEGDGLEALKQQMSNMVAGLDKLH
EASVLITSMRHYEALRRASDALENGACLVAEHAETELVAFELRSALEAVG
EITGKVVNDEILSLIFERFCIGK
>Cag_1477 glycosyl transferase
MLKILCLHIAHKQASYRYRVEQFLPYWKQYGIEFEPVCIVGKNYFEKLQL
ALSSNKYDYVWLQRKLLSPFFINLITKRSKLIYDYDDALYSIESQRNNKP
KPTHPGSKQSIERLNYILKRASLVFAGSEALYNYSARYNASATFLIPTAF
PAQSNISLPSKINNNSVTIGWIGSIQNLFFLSIIDDVTAAIQQRYPDVRF
SVMSGKPPEGLKTHWDFVAWSKEGEDAWLRSIDVGIMPLVDDEWSRGKCA
FKLLQYMAYGKPVIASAVGANYAAVLHGESGFLAKTLDEWRSAFEIMITN
RALSFSMGQASLNHFLLHYELRHVQNKIVSLLQ
>Cag_1974 2-dehydro-3-deoxyphosphooctonate aldolase
MQQKFSIGSITVPDCELPLLIAGPCVIESRAMAFEIADELQRISQAEGVR
FIFKGSYRKANRTSAASFTGIGDEDALTILADIRQKYGMPVLTDVHESAE
VALASRYVDVLQIPAFLCRQTELLVAAGESGLAVNIKKGQFMAPDDMRLA
AAKVARTGNNRILLTERGSSFGYHNLVVDFRGIAKMAESGYPVLYDATHS
LQLPGAGQGMSGGEREYMLPLARAAVATGVDGLFCEIHPNPEKALSDAAT
QIPLAEFGVIIHQLLHLYRCVQPLLSH
>Cag_0127 Succinate dehydrogenase or fumarate reductase, flavoprotein subunit
MKPFDIVIVGGGGAGLYAAMEAMKVNPSLNIAVLSKVYPNRSHTSAAQGG
ANAALANKAKDDTVEMHIFDTIKGSDYLADQDAVELLCSEAPKLIRELDN
IGTPWSRMDDKTIAQRPFGGAGRPRCCYCSDKTGHTILQTLYEQCLKKGV
IFFNEYFALSLSVSGSRTKGLLAMNIKTGHIEAFAARTVIFATGGYAKMY
WNRSSNAAGNTGDGQAIALRAGIPLKDMEFVQFHPTGLRKSGLLVTEGAR
GEGGYLINKDGERFMARYAKEKMELGPRDLVSRSLETEILEGRGFDSPAG
PYLHLDLTHLGADLIKSRLPQIREMSMHFEGVDPIDAPIPVRPTAHYSMG
GIDTDNFCRTVMKGVYATGECGCVSVHGANRLGGNSLLDILVFGRIAGHT
AAQEARQFEPGTISDAEVKEYYDNLRSTMQPSGHYERYGIIREELGHTLA
ANVGIFREASLLKKGVEDVAALKERFQKVRVFDSSDVFNTNLVQVLELRN
MLDLAQVVAAGAQVREESRGSHTRVDFPTRDDAKWHKHTLATLENGKVVL
DYKPVTMGRYELQERTY
>Cag_0601 hypothetical protein
MVAITTTELRKNFKKYFDIAHSERVIVHYGKNKSYEIIPTQKECENDAYF
SNPKLLAALKEAEEDIAAGRFTEIKDPKNLWDSIK
>Cag_1134 conserved hypothetical protein
MPQKASSNPAWIGYAGLALGLLLIGLLFSQLDLQRSFALIADTGWYALLI
LLPFGALHLLETFAWQHLFPQTSGRVPFVRLLKIQIIAETVSMTFPAGVA
VGEPLRPFLCHRLLGIPVPLGVASIAVRKLLLSVAQGVYTLVGSLFGFAL
LQQLSPTILRFNGLGYIMVTMGLSVLLLFLFVLILLLNGNVAEKVHSLLM
RIPFEKVRQKLLAHEAGFLATDKALQAFRGNHHPRLWLVLLLYVTAWCML
ALESYLILQALGIQIPFMQVLTIDVALTMLRTIFFFIPSGLGVQDVGYLL
FFQALGIPEAVVVGGAFVLFRRLKELLWYALGYVLMFFSGIHLGDAASLQ
GEAE
>Cag_1981 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_1170 conserved hypothetical protein
MVKKRGEKFTIKIVYRNFDGEKAAFYSGNAMRQRNKKHPYLEHLLKSAHP
ITLAANSKVLILSDLHMGNGGRRDEFKRNADLVRTMLESYYLTGGYSLVL
NGDVEELFKFKLQHIMQEWDSLYDLFLRFQRNGFFWKTYGNHDAALVDEK
EYKLASAMVHSLKFYYGNETMLLFHGHQASELLWETYPLISSTAVFVIKY
IAKPMGIRNKTSAYNSPRRFALEKSIYDFSNQAKILSIIGHTHRPLFESL
SKVDFLNFRIEELCRSFPAADDERRTEIRQNISTLKNELDACYQQGKKIG
LRSGRYNNITIPSVFNSGCVIGKRGITALEIEGNKIRLVYWFNGKQSRKF
KSDRNNRPIELSTTGYYRVVLNEDSLDYVFSRIHLLA
>Cag_0556 Hydrogenase expression/formation protein HypE
MQCPTPITQHETVQMAHGAGGRLSAALMQRVFMPNFHNALLDQLDDQAKL
DVPIGRLAFTTDTYVVSPLFFSGGNIGELAVNGTVNDLAVGGALPLYLSA
GFVLEEGLPLTELEAVVRSMAEAARRAGVTIVTGDTKVVGRGQCDKLFIN
TSGIGVVREGVEVSCRNLQVGDSIILSGSVGDHGMAIMSTREGLSFQSRI
ASDTAALNGMIAKVLDAAPKAIHAMRDPTRGGVAATLNELAGSSNVGIEI
HETAIPIKPDVRGACELLGIDPLHVANEGKIIVVVAADQAEQVLEVMRSC
KHGSDAALIGSVVADHPKMVVMRIAFGSRRILELPVGDQLPRIC
>Cag_0406 Alkaline phosphatase
MATLRFDGSQSPLLNASVDGLATRLTLYFSAPLDSSKLPSLSQFAVTNDG
NAVTASSMQVVDNQLHLYFATPLNSTKSIAISYTDATAGNDVLALQDVAG
NDAQSFEHVAVTTALSLTANESESLAFASSLLLAGAEISAYDTASQRLFV
TSSSGLQVVSVGNNLAMSLLGTITTLGTNDITSVAVKNGIVAASVVATDK
TQAGTVYFLDADGDVASPSMVLGSVAVGALPDMLTFTADGKKLLVANEGE
QDTLGNNPEGSVSIIDLTNGVASATVTTASFTSFNSQVDALKAEGVRLFA
GETGFETITVAQDLEPEYISISPDGATAFVTLQENNALGVLDIASATFTD
IVPLGLKSFYGIPFDGSDKDGVSGAVAINLQTDQPVYGIRMPDAISSFRG
ADGKTYYVIANEGDDRDDFIAPDETARLSTLNLDDTLFPNEATLKTNSEI
GRLTVSNAPGNNGDIDGDGDIDQILAYGARSFTILDEAGRVVFDSGSHLE
EFTAVAGLFTDANGAGLFDDTRSDNKAAEAEGITIGHVGDKVLAFVGLER
GGGGVMVYDVTNPQEVSFVEYLRNVGDVSPEGLTFVPNDISPTNQGLLFV
TNEVSNTVSLFTISNKNDAPTVAEPLSDITVAEDSALYQRIAANSFADLD
AGDSLTWTATRADGSALPEWLKFNASHHDLESMEDYFLSVASGSGSYSTP
ATDSASAGTAIATGEKTVDGNIAWERTDSAGGALTTIAETLRAEKGFAIG
VASTVPFSHATPATFVSHDVSRNNYWDIAHEILFEVQPEVVIGGGMENSN
FAKAVTSAGNVDRDIDNNGYNDDYDAFLNDTDGTEYVFVKRTSGTDGGTS
LMNAAATVSLADGDKLFGLFGTSGGNFEYYELSDTPGSVTITRSTTDATP
TVDEDPTFAEVTNVTLSVLNQDSDGFFVMLEQGDIDWTNHANDYENMVGG
VYDLETAVTEAENFIEEGVNGINWSNTLVIVTSDHSNSYLRAQEELGLGD
LPTQNGSSYPDGDVTYGTGSHTNELVSVSARGAGASYFGELAGQIYAGTE
IIDNTQIYDAMMHAATEAGAEHIVLFIGDGMNIEHEIAGSRYLYGTDFGL
TWHDWSTLENGWNGYATTWDVTAYNKYATAASATAYNETSVDPLIGYNPA
LGGNTPYPVAMTFSGTPTNENVGAIDIKVTATDESGASVSDTFKLTVTNT
NDAPTVAEPLSDITVAEDSALYQRIAANSFADLDAGDSLTWTATRADGSG
LPEWLTFNASHHDLETMEDYFLSVASGSGSYSTPATDSASAGTAIATGFK
TVDGNIAWERTDSAGGALTTIAETLREDLGYAIGVASTVPFSHATPATFV
SHNVSRNNYWDIAHEILFEVQPEVVIGGGMENSNFAKAVKSAGNIDRDLD
DNGYNDDYDAFVNDTDGTEYVFVKREAGTDGGTALSAAAATVSLADGEKL
FGLYGTSGGNFEYYEVADTPGTVTTITRSTTDATPTVDEDPTFAEVTNVT
LSVLNQDSDGFFVMLEQGDIDWTNHANDYENMVGGVYDLETAVTEAEYFI
ASGANGINWSNTLVIVTSDHSNSYLRAQEELGLGELPTQNGSNYPDGDVT
YGTGGHTNELVSVSARGAGASYFGELAGQIYAGTEIIDNTQIYDAMMHAA
TTAGAEHIVLFIGDGMNIEHEIAGSRYLYGTDFGLTWHDWSTLEDGWNGY
ATTWDVTAYNKYATAAGATAYSETSVDPLIGYNPTLGGDTPYPVAMTFSG
TPTNENVGAIDIKVTATDESGASVSDTFKLTVTNTNDAPTGSILITGIAG
EGNTLQAEAMLADDDGLGTLHYQWKRDNQAITNATGSNYTLTAADKNQEI
TVIVSYTDARGNAEQVTSAYGVRYNTLTSNSNSTPITLVQGSGSGSDPLL
SVVMPTGFEMVTQQVTGNNLQSQLEAAAPTFAQQTDIQTAIEEYLANVGS
QAVTVRTIAFPTSMANATVNDAAPIIINGSDTAGTQEALVINTQNLPSGT
LIQLNDVEFATVIGAARITGGEGNNTVFADGSAQYIVLGTGDDTLHGGAG
NDTIGSLDGNDYLYGEAGNDTLSGGDDNDHLSGGTGNDLLDGGAGSDIAM
FSGALNDYTITHNAITDTYTVADKVSGRDGVDTLTNMEYCQFADTLYDLE
AANPYHHDTDFGHDTTAIGIAGLGLVGLLLFL
>Cag_0726 ABC transporter, periplasmic substrate-binding protein
MPKPFTHNIRMLSLVATWLVLMLSVVGCQPKATDENTLASSAQSSAKPRI
VSLAPSVTEMLYAIGAGEQLVGRTSACDFPAEAEKVPVAGAFGKPSLEML
ASMHPDMVIAVDLEDKRNSDKIRELGIRMEHITCTTPDEIPEALRTLGRL
TKNERQADSLASVISKGLAEFRKQAPPTAQRPTIYLELWDDPFWTGGKTS
YTSALIATAGGRNIGDVVEKDYFEISQEWVIQQNPDVIACMYMARNSSAT
DKVKERAGWNTIKAVQQKRVYDRFDNSLFLRPGPRVLEGIAEMHKTLYPV
NVGK
>Cag_1253 TonB-dependent receptor, putative
MKSIYQRTLTLTALAVMLAHSAHAAENQESTQPSYMAPELIISGQKGDVL
QHVVGKESVVLNPSQMSVYKAINLMPSLSQQSVDPYGLADIVNYHESFRF
RGIEATSGGVPATTVNVEGLPVTGRPGGGATIYDLENFSSLNINTGVMPA
HVGLGLADVGGKISMEIRKPEEQFGVQLKQGIGSDSFKRSYLRIDSGDLG
DGLKSFASVSNSNADKWKGEGDSERNNVMMGLAKPLGKNVTLETYLTWSK
GNIHTYKPLSYTQLSDLDGSYKSDYGTNPNSYDYFDYNRNEFEDWMLMAN
LSIKTGEQSTLTIKPYYWSDKGYYLETITLANGQNRIRRWDIDHDLSGIL
SEFSTNVADVDLRLGYLYHTQERPGPPNSWKTYKVVNGALVFDTWSILSN
PSRHELHSPYMDATWRFGSYQLEAGAKFINYTLPSIVTYQTTGIGDVAHA
DALALQPTLDTFASASDTKSFSRVFPNLTLTHFVSDNLTLHAAYGENYVT
HVDIYPYFISQRALFAAKNISFQKLWNAREMETSRNVEFGMHVNGSNWSM
TPTLYYAKHINKQAVLYDPALNATYPMNNAEATGYGFELEADYKPSNAFK
WYGSFAWNRFYYAQDIYSEGGSLIDVKGEQVPDAPEFLAKSMVSWKVGDV
TLSPIVRYSSTRYGDVLHKEKIDANTLCDFDITWSKALAGLKQVDCSLSF
MNIFDERYVSMISTSDYKTLKTSYQPGAPFTVVASVALHY
>Cag_0074 Transaldolase C
MQFFIDTANLDEIRAAAELGVLDGVTTNPSLIAKVAGSSKPFTWQAFKDH
IAAICEIVDGPVSAEVTALDANGMIDQGEELADIDGKVVIKCPVTLEGLK
AISYFDENDIMTNATLVFSPNQALLAAKAGATYVSPFVGRLDDVSTNGME
LIRQIVTIYRNYDFITEVIVASVRHSQHVVEAAMIGADIATIPFNVIKQL
ISHPLTEAGLKKFTEDAAIIQM
>Cag_0868 RND efflux system, outer membrane lipoprotein, NodT
MMLFPRKRESRKILMVGLSTTKRVRTKIIMKKIIWLALPTAIVLAGCSSS
HTLQSPTIALNDRYQQNSAHPQLTEAEGQQQQLVVESVAARWWEAFGSPK
LNRLIEQSLKQNPTLAAAEATLRQAEALANAKYNSTLYPRLDAVGSAQRL
QLNNSRNGVEGGEKRFNLYNGSLSSSYNFDLSGANNRQLDALQAKANYQH
YQLAGARLRLATEVAVTAIRQAQLGAQMEALERLIALGNEQLTINRERLR
LGAIASHELLEVERMVAEQRAALPAMRHAYQQSRHALALLEGSTPDNATL
PTFTLAEFQLPATLSMRIPSQFVRYRPDIQAAEALMMAANAEYGAAAAKA
YPQLTLSASLGSQALTTAALFGSGTAVWSVAGQLVQPLFNPSLGDEKKAA
NAAFEAATAHYRQSILAGLRDVADLLSALYNNAIALAALASGAAFADEQV
ALTEQRYKLGAASYLEVVQAQSEATQLQLELLAARAQRLSNSAVLYQAMG
GGEMLSPSGRE
>Cag_1373 conserved hypothetical protein
METVFIETTIPSYYVARRPRDIIQAARQELTIEWWDKHSSRYELLSSQIV
IDELARGEEIMAAKRIELLANIPLLLINEPVIKIAEELLRDRVVPQKAAD
DAFHIACAGVHQVDFLLTWNCTHIANPHNRHRIERCFAKHGIIIPIICTP
QEFIGDDYAN
>Cag_1631 3-phosphoshikimate 1-carboxyvinyltransferase
MRVFRGEVTTLPPDKSISHRAAMIAALSEGTTEITNFSAGFDNQSTLGVL
RNAGIVVQQEEVQGAHGRTMRRVVITSNGLWSFTPPTAPLQCNNSGSTMR
MMSGILAAQPFQCTLIGDASLMKRPMKRVADPLRQMGATIELSPDGTAPI
HISGTKELRSLEYRLPVASAQVKSLIAFAALHAEGETRIIEPLQSRDHTE
VMLGLETIVKKDERIIVVPGQQRIAAKPFFIPADPSAACFIIALGLLARG
SEIVIRDVCLNPTRAAYLDLLAEAKAGIGVDNQRVIGGEIIGDILIDNLN
GIEPLHISDPQLVAFIIDEIPMLAVLSAFATGSFELHNAAELRTKESDRI
EALVVNLQRLGFACEQYPDGFVVKKRTEVPQGKVTIESFDDHRIAMSFAI
AAQATGEALDISDIEVVGVSFPNFFSLLEELSSEA
>Cag_1483 acyltransferase, HtrB/MsbB family
MVRRLAWMKLNKHKSKELAQKIVYYLIIFFGFIFRKLNYATTVQLAAWLG
DLFFNVVKIRRQLVLQNLAMVFPARREKEIRQLARQVYRNQAENLLDMLR
LSSMRTSEDASKLLTIDTTEFLAKTTHVKKGAVLVSAHFSNWELLARSIG
LLVTPLHVVVKRLRNSFVDQKINQWRAECGNHVIYKNAAFREGIRMLQKG
GVLSFLGDQSDPKGTFFMDFLGRRTSVFVGPAYIALKAEVPLFVVMGYRL
GNGGYKAELQEIDMNGLQATKQDAEELVRRYTAVVERYIYQYPAEWFWFH
DRWKRVG
>Cag_1388 hypothetical protein
MLRNNNKRILWTFLNYLFMTVNELLPSVTTLSHVDKIRLVQIMLEQLAND
AVNSAQQKSLSSETFNPRYFFGADHQSKQIIDDYIASSREEWH
>Cag_1095 SMR drug efflux transporter
MHWLYLTLAIFLEVAGTTSMKLSEGFSKWLPTLFVFVFYGLSFTFVAQAL
KILPVSITYAIWSAVGTAAIAIIGIVWFGEQMSALKLGSLLLIIIGVMGL
HFSQEPLH
>Cag_1652 ATP-binding protein, Mrp/Nbp35 family
MSTSHDASSCSHHHEHHANGEHHTCSKHGGHGEHGGSCQQPEQPLQHIKH
KIAIASGKGGVGKSTFAVNLAVSLAQSGAKVGLIDADLYGPSIPTMFGLV
NEKPEVFEQKLQPLEKYGVKLMSVGFLIDSETPVIWRGPMASSAIKQFIT
DVAWPELDYLLFDLPPGTGDIQITLAQTLPMTGAVIVTTPQDVAISDVAK
AVSMFRKVNVPLLGLAENMSYYQLPDGTKDFIFGTKGGEKFAKIQGVPFL
GELPIERAVREGGDSGVPCVIEHPESATAKAFAQIAREVIRNVTIFEAAK
GSACC
>Cag_0546 hypothetical protein
MKKALLTALLFGMAAVPAQQLHANGFNYNYVEGQYVKSSMNNVDGSGYAI
TGSVALHDNVALNAGYSNDSYDYDIDTNGYNVGLTYHVPVADSTDILFNA
SLEQAEYSQPLIGSDDDTGYSIGVGIRHKVASAVELNASVYNVSIGEDSA
FGVDAAVLVEVSKNFYLGVEYGTSEDIDAIGFGVRAGF
>Cag_1001 conserved hypothetical protein
MRPFSSSFNPRAREGRDGTHVRNLQGDSCFNPRAREGRDVPLLFIKRIHR
VSIHAPAKGATVSARDFGYTVLVSIHAPAKGATNVGTACRKHRS
>Cag_0159 conserved hypothetical protein
MQKITPLQQKLLKAIHLLASGLWLSSVVVLLFLPIVAVRITSGDEIYMYN
VIYHFIDMYLLTPAAILTFITGLIYSIFTKWGFFKHGWLVYKWIVTLLII
VVGTFYLGPLTSDLLQIADAERLAALQNPSYIFGQTVTMWAAVINTLLLS
VALFFSVYKPWKR
>Cag_0217 Anion-transporting ATPase
MRIILYLGKGGVGKTTVSASSATAIARKGQRVLIMSTDVAHSLADALNTE
LSSTPVEVEKNLFAMEVNVLAEIRENWNELYSYFSSILMNDGATEVVAEE
LAIVPGMEEMISLRYIWKSAKSGKYDVIVVDAAPTGETMRLLGMPESYGW
YADKIGGWHSKAIGFAAPLLNKFMPKKNIFKLMPEVNDHMKELHTMLQDK
SITTFRVVLNPENMVIKEALRVQTYLNLFGYKLDAAIVNKILPQNSEDTY
LQSLIDIQTKYLKVIDNCFYPVPIFKAKHSTSEIINTDRLYELSQQMFGD
KNPSDILYTSDKTQVLEKINGKYVLSLHLPNVEVKKLNVNIKGDELLVDI
NNFRKSIILPNVLIGRKTEGADFVEGNLNITFANN
>Cag_1120 type I restriction-modification system specificity subunit
MAKQKATLRQTQGKQEEPLEKQLWKTADKLRKNIDAAEYKHIVLGLIFLK
YISDSFEELYAKLQAEEANGADPEDKDEYKAENVFFVPQDARWNYLQSKA
KQPEIGKFVDDAMDVIEKENASLKGVLPKVFARQNLDPTSLGELIDLVGN
IALGDAKARSADVLGHVFEYFLGEFALAEGKKGGQFYTPRSVVELLVEML
EPYKGRVFDPCCGSGGMFVHSETFVTEHQGKVNDISIYGQESNQTTWRLC
KMNLAIRGIDSSQVKWNNEGSFLNDAHKDLKADYIIANPPFNVSDWGGDL
MRSDGRWQYGTPPTGNANFAWMQHFIYHLAPNGQAGVVLAKGALTSKTSG
EGDIRKALVENGLIDCIVNLPAKLFLNTQIPAALWFLRRDAKFFVSTNGK
FRDRSNEILFIDTRNLGHLINRRTRELSKEDIYKIASTYHAWRTLPEALN
GSAYADILGFCASVAISKVAELDYVLTPGRYVGLPDDEDDFDFAERFTAL
KAELEMQLQEEAQLNAVISANLLKIKY
>Cag_0786 hypothetical protein
MGSLTVKDYVMLKMAFSSKQHAFLAGFGSLFDFTGHKLNAKHFGNQLTDR
SALQADWYAISNDIHKASNAVVTEMANTKATRNASK
>Cag_1659 Ribonuclease III
MPPETLAYLQTLIGSLDGNARLYQTALTHRSVIGNTALSHHNESNQRLEF
LGDAVLGLLISHFLFQNFPASAEGDLSKTRAKIVNSKSLACFARSIGLGE
HLLLGESAAHYNIRDSESALADAFESLIGAIYLDKGLDAAYGFVDKHIIH
HQSFSAIVASEQNYKSCLIEYSQAHHVAAPVYTVIAEHGAEHDKEFTVEV
SCNNIKGCGTARRKKDAEQLAAKEAMERIIALQPLPHEPKDEENSTTT
>Cag_1865 hypothetical protein
MASYRTKLDSTYFSDAAHALRWQKTLAFLRESEVVGTNCSLGLDLGDRTP
LTTALEELFACTFHNSTIDLDVGSLFGSYNVVTAFEVLEHLYNPLHLLLQ
VRNVLRGNDARLFVSMPLWKPHILASPDHFHEMTRRAALSLFERAGFAVV
RRAEFRIREPLFYVTGIKPLLRAWYEKIQIYELAMQVETVPCNEAALVF
>Cag_0767 possible abortive infection phage resistance protein
MNINASIIDQRITGIVDEHPEWMAESNDRNKKKSVAFVLLSIAMCLDIPL
DEAAELITDGGNDAGVDGLHIGEVEDGEFMVTIFQGKYKVELSGEANFPE
NGVQKAVDTVQVLFDPYRNVALNKKIAPKIEEIRSLIRDAYIPNVRVILC
NNGAKWTRQAENWIDNAKKDYGDKVDFIHFNHDSIVSILQRSKKVDTTVT
LSGNAIIEDMNYMRVLVGRVSVQEIHRLFNEHGDKLLERNIRRYLGLHTN
RVNTAIHQTLCDPQKSDKFYFYNNGITVVCDKFDYNAFQKADYKVQLKNM
QVINGGQTCKTIQETLNSDVSNMIGESAYVMIRIYQLAETHQNFVQEITY
ATNSQNPVDLRDLRSNDDIQKQLEIGISDFGYVYKRQREEGGGGSHVVTS
SIVAESVLAIWRQRPHQAKFRRKEHFGKLYENIFKDLSAAQALLAVLIFR
AVENERKRPTSLTPPDFLPYASHYIAMVVGRTLLQDMNISLANVSHQNFN
EILKKFEANEAAYYAHAVSDVKEALTACYGEREVSLQQLSATFRRGDLLE
MLGAVGLSGDYCFVSQS
>Cag_0456 Ribosomal protein S20p
MPLHKSAEKRLRQSEKRNARNRARKKELKVLVKTMQKLIEASAAKPEVET
AYRSVVQKLDRLGVKRYIHANKASRKKSQITRDFNTYMQSAQQ
>Cag_0026 SecD export membrane protein
MFASNNFFYLMKNKRFNLLLIALITLLSLWSLWPTWRDYSISQELQNART
PKDSAAVAVKHRAELEEVRQKSLKLGLDLKGGMHLVMEVDQVDLFEQKAW
NKDATFTAIMQSVRAQALAQSDARVIDLLVQEFNKRNIRLSRYFYDIRNS
DKEIIGKLEKESEEALSRAKEIIRNRIDQYGVAEPMITTQGSRKLVIELP
GVSDEGRVRNLLKGTAKLEFKLLREPELLVRALDRINSGLASGTMATSLA
PTVAPDSAASAQSVSVSASATPSLPAKANVAPTSAVAPAPRSLSNLIVLM
QNGMAYTEERNRAEVKALLERADVQALLPPDSELLLAAKPEVDAEGKKFY
PLYLIKKTPELTGGVITEAKATFGSQGIQPEVTMAMNTEGTSRWARITGA
NIGKRIAIVLDGAVYSAPVVQSKIPNGNSVINGIESLEEAKDLEIVLKAG
ALPAPVRITEERSVGPSLGADYIRAGMLSLVWAFVAVSFFMLVYYRQAGI
AANIALILNILIVLSVLAGFNASLSLPGIAGIVLTIGMAVDANVLIYERV
REELAEGKSIAAAVAQGYDRAFSSILDSHVTTLAAGFLLYTYGIGPIQGF
AVTLMIGTAASLFTAIVVTREIFNFMLFKEKLSTKSFG
>Cag_0946 Peptide chain release factor 3
MDLQKETARRRTFAIISHPDAGKTTLTEKFLLFGGAIQTAGAVKSNKIRK
TATSDFMEIEKQRGISVATSVMGFEYGGKRVNILDTPGHKDFAEDTYRTL
TAVDSVILVVDCVKGVEEQTERLMEVCRMRNTPVIIFINKLDREGRNPFE
LLDELEHKLQIHVRPLTWPISQGQTFKGVYNLYKQELNLFEANATRITES
YVKVNGLDDPALDQWVTSSFAAKLREDIELIEGVYEPFSEEYYREGELAP
VFFGSAINNFGIRELLETFLTIAPYPHEREATERVIVATEPTMTGFVFKI
HANLDPNHRDRTAFFRICSGTFERNKFYHHPRLQKKLRFSSPTQFMANEK
NVIDEAFPGDVIGLYDNGSLKIGDSLTEGEELHFHGIPSFSPEIFKSLEN
RDPLKAKQLDKGIRQLTEEGVAQLFIQYGSRKIIGTVGELQFDVIKFRLE
HEYGAQCDFTPLRYHKAYWVTSSNKEMLEEFLRRKSNAMAFDKEEHPVFL
AESEWMIKVAREDYPDLEFHTTSEFKTMQG
>Cag_1985 Cold shock protein
MVNNRDLSLTRIGVFYDGNYFLHISNYYNYFHERKARISISGLHHFVRNY
IAQQEGSDEQLCQIVDAHYFRGRLNAYEAAQEGNALFYDRLFDDILSSEG
VTTHYLPVKTSQTGVRYEKGIDVWLALEAFEQAFYKRFDVLVLIASDGDY
VPLIRKLNTLGTRVMVLSWDFEYTSDNGKQMTTRTSQDLLEEVTYPVAMH
EIIDNRIKKNEPLINNLFVPPSKKRIFEKPLDREYDESEQQTGTILALKD
GYGFIKFPPNNLFFHYSNLNGVDFNDLKVDDAVQFVIGKNDRGDEVAKDI
VLVAEESAIS
>Cag_0538 hydrogenase, putative
MSKHLRKQLCPFHIRKKTTTPMYKIVSAQLLAENIKKIEIEAPRIAKKRK
AGQFVMLRVEPNGERIPLTIAGSDPEKGTVMLIVQGMGKSTRALNLKEVG
DTIADIVGPLGTPSHIENFGTAVSIGGGVGTAIAYPTAVALKDAGNYVIT
INGARSKELVILENEMKAVSDEAYITTDDGSYGFHGFVTQKLQELIDSDK
KIDFVLAIGPIPMMKAVADVTRPYNIKTVVSLNPIMVDGTGMCGGCRVTV
NNEVKFACVDGPEFDAHLVDFKNLSDRNKHYLPQEKSSLEAFAEHKCHLQ
QSL
>Cag_2032 ribonuclease P protein component, putative
MFVVSKKNVRHAVKRNRIKRLMREAYRLEKVQLVDAVSQQEMVPIAVAIA
YTGKASQIPPLALFRRELRELFTALSASNVVGEVS
>Cag_1747 Phosphatidylserine decarboxylase-related protein
MLTSYGTSTIVKTTLLCLLMCVVALFIPFIAQVWLIGFAVVFLLFTLYFF
RDPERKTPNETAIIVSPADGKVMQIAPCTLPDSGLPATRVSIFMSPFNVH
VNRVPISGKVTMVRYVPGKFLMAFDHASMEHNERMEIALDNGQFEVRFSQ
VSGFIARRIVCTLQPNDVVTIGRRFGMIKFGSRVDMVLPATVRCCVQQGD
NVHAGETIIGRY
>Cag_0093 heptosyltransferase
MQLPIQSILIIRLSSIGDIVLTTPLIRLLAAAYPHATLDYCTKLPFIPLL
AHNPRLSHLFTPELLPTTAYDLVVDLQNNRRSQLLVNSLKSRYHVAYHKE
NWKKWLYVHLKLNLYGNSWLHVVDRYRQAMDSFQPLADDSAGCELYPASE
ELEFAASVTAHAGKVLAICCGANHFTKRYPALHFATVVRLLMERHPSLTV
LLLGGKEDVPQVNAILEAVPSALRSQMVAFAGECSLMQSAALLARSDAVL
CNDTGLMHIASAFGKQLFVLFGSSVREFGFLPYHTPYELFEVSKLRCRPC
SHIGRSRCKKKHFRCMHALSPESIAQRIHLYFEKR
>Cag_1362 hypothetical protein
MYMTSTFMKGIHICRKIYFIIMEINNGHRIAIITGIFSIITAIISGIFLQ
TKENISDKGTTINGNQNAIINGNNNIISNTRENSSKNVIVVATTKNVIEK
NIIGKIKPYITKNYLLTCLGVSQEQEIKADCTYGGKDIVLYFYSFDNLYI
TALISNDIVIGLNFELKNRTTEFPINTIQGKTWILGKISFGDVLYDDDKI
LYTQGGNKFTSTLTCGTSYPQYIGAGPYYYIWNSNDFKNISNKFENEIHR
YDKKHGTEFFDEKKITIDVVKKCRITSLTILGYEYKELSAYLNSPMSQFH
ADIEFSTEVPR
>Cag_1759 conserved hypothetical protein
MKLEILESFENKYPNRDYTIEIVNPEFTSVCPITGLPDFGTITIRYVPNQ
RCVELKSLKYYFFEFRNAGIFYENITNKVLDDMVALLEPRSISVITEWKA
RGGITETVSVHYTSQS
>Cag_1350 Protein of unknown function UPF0001
MESIASNLAAIHAQVAAACKQAGRKPESVRLIAVSKTKSAELVREAFDAG
QLEFGESYMQEFLEKYESRALQGCPIQWHFIGHLQSNKVRSLVGKVSLIH
GIDKLSTAEELSKRAVQQNLTVDYLLEVNTSGEASKYGMAPHTVLSEASA
FFALPNVRLRGLMTIATYEREAARREFQELRELLEQLQAIAPDPTLVTEL
SMGMSGDFEEAIQEGATMIRVGSAIFGWR
>Cag_0296 hypothetical protein
MIFQAKSIASWFIIVLMLLQFVPLPRRNPNERQPLRAPVAVVTVLRNSCY
QCHSHETRWEFPLGTVAPFAWWMSQRVEHGRRALNFSTTASLSVYEKERV
TVMLRNPKEHQPLYYVLHSNVLPDSVGVATVQLWATHR
>Cag_0754 FecCD transport family protein
MATKASPICADNCYTSPKQGSATLRSLLFFVALLLLLALSLLLGPSTLGI
PDVATPAGKAILALRFSRLLMGMMTGAALSASGVVFQALLRNPLAEPYVL
GVSGGAGLGATLTILATSGLFATLSLPVVAFLSAVVTLFLVYGIASQGMG
GQPSIYSLVLSGVIVSSICSSVIMFLVSTASVEGMHNVMWWMLGSLQPVS
AEQQLFSALLIVLALLAIWLLSPRLNVLALGREMAHYQGLHANRVIVVGL
LFATLLAAIAVSLSGMIGFVGLIVPHVMRALFGPDHRRLVPFAALGGGVF
LVVCDALARTLLAPVEIPVGVVTALAGGPFFLMILQRRMKQAWMG
>Cag_1802 Methylpurine-DNA glycosylase (MPG)
MEPLPKQFYQCSTIELTEKLLGKCFVRILPNGTRLAGRIVETEAYLGEGD
EACHAWRSRTPRNEIMFREAGTLYVYFTYGAHYMLNIVSEPEERAGAVLI
RAMEPLEGIEFMQQQRNTTKFPNLMSGPGKLTQALAIERSCNGRTLFDGE
FFVADAPAIPSHQIGTSGRIGISRSTELPWRKFIMGNAHVSGGKVGGVVS
SLQ
>Cag_0049 Penicillin-binding protein 3
MTNPEFRNRLGIVIGGFAVFIIIIIAMLLNIQVINVEKYKKKAERQYVKQ
VTEYARRGAILDRRSRVLAESVESITFYASPKQISKSLLFDEEGEAVINK
RNNKQQTFDNTQGVATLFAKHLGANRQIYLKALKKRKGVAVLAKKVPIEK
ALPLITEIKSRKMHGIWHEKEQQRSYLNVAAQLIGMTGNEKSSVDGGSGL
ELQLNKELKGVNGKRYYQRHATGELYTAPDVAQKAPKSGNSVQLTIDSDI
QSIVEDELSKAVAEFQADAATGVVMDVRTGEILAMASSPTFDLNRRSTWT
QDNSRNRAVTEMFEPGSTFKLVMAAAATEVLHRKSSDYVYAHNGVMPYYK
LKIRDHEPYGTITFKEALMHSSNIVAATTAMKLGRETFYAYTKNFGFGQK
TGVGLVGEARGIVRPLEKWDSTTLPWMGYGYQVMVTPLQILQAYATVAND
GELMRPYIIKKVVSPEGKVIRETAPEVVRRVLKPETARYVSREYFKAIVD
SGTAKNPILQSLHAAGKTGTARRASAGSYAVRSYVSSFVGYFPVSSPRYA
MIILVETPRTSYYAAAVAVPVFARIASRMVACSQEMQKMLAIRSPEQELI
DSLATVTVPDLRGLKGREAQRMLSWLGLTMEHSGDFNGVVVSQSVSSGTQ
VAKAKTVVVRLSK
>Cag_1725 Kup system potassium uptake protein
MKRLGGLALAALGVVFGDIGTSPLYAIRECFHGDYGIAINEANVFGVLSL
LVWSLLLIVSLKYLTFIMKADNDGEGGILALTALLIRHCKKNGGSERFFL
IAIGLFGAALLYGDGMITPAISVLSALEGVQMVAPAFHDLIIPATVTVLV
ILFLFQHHGTARVGTLFGPVILLWFVVLGVLGLVEILRYPQILQAFFPWH
GIMFLLNNQLHGFMVLGAVFLSVTGAEALYADMGHFGKRPIRLTWAFLVL
PALLLNYFGQGALLLDSPADAHHPFYGLVPSWGLIPMVILSTSATIIASQ
ALITGVFSLTQQAIQLGYLPRLTVTHTSAKHMGQIYVPAANWSLMVATIS
LVIGFGSSSKLAAAYGVAVTATMLISTILFYYVARDLWRWNKSVLNVMIV
VFLLVDLAFFGASASKLFHGAWFPLVIAAVMFTVMMTWKQGRGLLLKQLQ
DRTLTVEEFMSSLALQPPYRTNGQAVYLTANPDLVPLAMLHNLRHNKVLH
SEVALFHFSTERVPRVPNNRKVEVVKLGDGFYKVVARYGFLEYPTIRQVL
ALANHQGLHFKPEAISFFLSREKIVAGVKSKMTIWRKKLFAVMSRNAISA
TSYYDLPPGQVIEIGLMVQI
>Cag_0936 MFS transporter family protein
MQRAFRAFTLRRKRLQAQRNYQILFWLLFDFANTAFSVMMVTFVFPLYFK
NVICSAQPYGDALWGLSISVSMLLVALVSPFLGAAADVLGRRKHFLLMFT
LAAVVGTALLSLTGAGMATVAVGLFIMANMGFEGGIVFYDAYLKELASER
SVGRLSGYGFAMGYLGALSILLLVSPLLADGINVANAAKVQQSFLVAATF
FALFAAPLFFVIRDRRSLSTSPHPSTKKILSTATSVKNVVVATHHVERKI
FGQGAWRNLLQTVQHIRRYPDLARFLLACFFYNDAILTVIAFASLYAEQT
LGFSSRELMHFFMVVQIAAMVGALFIGFIADTIGAKRALVVTLILWIGVI
AAALFAESKELFFYTGMLAGISMGSSQAASRSMMTRLTPQEHVTEFFGFY
DGTFGKASAIVGPFLFGVISSQAGSQKVALSSLLIFFAIGLVLLTKVKSS
STNVPSLQ
>Cag_1748 Twin-arginine translocation protein TatA/E
MFGLGGQELVLILLIVLLLFGAQKLPELAKGLGKGMKEFKRAQNEIEEEF
NKSMDDNPKKEKATTASKS
>Cag_1203 Hydrophobe/amphiphile efflux-1 HAE1
MFERFISRPVLATVISILLVILGVVGLSQLPVTRFPDIAPPSVSVSATYP
GASAETVARSVAPPLEEVINGVENMTYMTSTSSNDGSLNISVFFKQGTNP
DQAAVNVQNRVSQATSRLPAEVNQIGVSTVKRQNSQIMLINLASTNPEYD
VVFLQNYAKINLVDDLSRVPGVGQVSVYGNLDYSMRIWLKPQVMAAYSIT
PQEVLSAIQSQNFEAAPGSFGENSNEAMQYVMRYKGKNRYPVEYEQMVIR
AGEDGTLLRLGDVARVEFGAYRYGVNTKANGLPAVVLAIFQAPGSNANAV
ESSLQKVLQKVSPTFPKGITYSIPYSSKKVVDESITQVLHTLIEAFLLVF
LVVFIFLQDLRSTLIPAIAVPVALVGTLFFMKLFGFSINVLTLFGLVLAI
GIVVDDAIVVVEAVHAKMEQKGYGAKVATVSAMRDITKAIVTITLVMSSV
FLPAGFLEGSTGVFYRQFAFTLAIAILLSALNALTLSPALCALFLEKEHR
HSSIGKRFFTSFNTAFEAMKRKYLGALLYLLRHKKVAYSGLALLTALSFW
MFKATPTAFIPDEDNGFVIVSATMPPGASFARTKVVMDDAARTLQAMPTV
KKVIEVAGINILTRTSSPSSGLLFVQLQDHNVRGKQGDIKNVIGAMSKKL
ANREASFFVLAQPTVPGFSTVGGLELVLQDRNGVELSRFNEIAQNFLNGL
RQHPAIGVAFTNFKANNPQYELEVDPMQASQLGVSTRDVMSVLQAYYGSV
QVSDFNRFGKYYRVIMQAEPSERTDESSINSMFIRNVRGDMVPLSSVVRL
KRVYGVEAVDHFNLFNAISVNAVAKPGFSTGQAIQAVDDVARKTLPTGFT
YDWKGQSREEISASGGLLLIFLLSIIFVYFLLAALYESYLLPLAVMLSIP
TGLLGVFIGIKLAGIANNIYVQVAIIMLIGLLAKNAILIVEFALQRRIAG
RPLAVAAIEGARARLRPILMTSFAFMAGLLPLLFVSGPAAQGNHSIGAAA
LGGMFAGLVFGILVVPLLFVTFQYLQERITGVAKPIVEAGDLLDAVVMKE
KGTHV
>Cag_1077 conserved hypothetical protein
MKALDTNILVRFLVRDNQEQAERVYRLFKAAEADKTLFFISIPVLLELIW
VLDSVYGIARYDILNAIEELLLLPILSFDGQPAVRSFIAEARNNNLDLSD
LLIACDAALSGCEQMLTFDKKAAKSELFILLEI
>Cag_1798 transcriptional regulator, XRE family
MTTTSLLKRTVDKISPVTKKFVKRQGEFAVRVSEIMKSTGMTQRQLAEKL
GKKESYVSRILAGWANPTLKTITEFEVAIGHDIIDISVQKKFTYPVVIKN
PSWSTDGSQKTIATSSHVCAGNANYDINNDYALVG
>Cag_1460 tRNA pseudouridine synthase B
MCNVHAQQGSAQEAPPPGELLLIDKPLDWTSFDVVAKVRNTYRKCGSKRK
VGHCGTLDPKATGLLIVATGRKTKEISQLEQLDKVYDGVIKLGAITASHD
TETPEEQLCDVAHLSEAELHAVAATFIGNRLQQPPMHSAAWHNGRRLYEH
ARKGEVIRERKAREIVVHRFTITNVALPFVSFELHVSKGAYIRVIADEFG
AALGVGGYLAALRRTAIGEWQLTNALSVNDTIEQIHNNASKGALCGTAST
IHPAT
>Cag_1465 hypothetical protein
MLYPISSIPNVLTQERMLKFLLLLVVTFLAIRLVFRLLRNGIFLFKSQNS
VNPYPKSSPFQRGQRVEEADFEVIETQLGESEKRRDVA
>Cag_0923 amino acid permease
MAGKLRKKPLSLLLREVNNEHRLHRILGPVALTSLGIGAIIGTGIFVLIG
VAAHDKAGPAVALSFAIAGMACIFAALCYAEFASMVPIAGSAYTYAYATL
GEVMAWIIGWDLILEYGVASATVAHGWSKYFQDFIGIFGLGVPHIFSNAP
FDFDPTSGLLVLTGAWFDLPAVLITFLVTIVLVKGIRESANFNAGMVMVK
VAIVLLVIGLGAFYVKPENWTPFAPFGYSGLSIFGHTLMGQTGPNGAPVG
VLAGAAMIFFAYIGFDSISTHAEEARNPQKDLPIALIGALVICTILYIAV
AAVITGMVPYHLINIDAPVSNAFLQVGIGWAQFVVSLGAITGITSVLLVM
MLSQPRVLLAMSRDGLLPQSFFAAIHDKFKTPWKSTILTGVFVAVLGGML
PLRLLAELVNIGTLFAFVVVCGAVLIMRKIHPEAHRPFKAPLVPFVPIAG
ILTCLMLMFSLPVENWIRLFVWLAIGMVIYYFYGRKHSIVKDIEE
>Cag_0665 probable polysaccharide biosynthesis protein
MRYLKNTSWLFGEKVLKLFVGLFVGVWVAKYLGPKQFGLFSYAQSFVGLF
STIATLGLDGIVVRELVKDEKRRDELIGTAFWLKVIGAIAVLLVLAIAIN
FTSNDSYTNILVFIIASATVFQSFNVVDFFFQAKVMGKYITYTNTITLFT
SSIVKITLILSNAPLIAFAWTVLFDSIVLALGFIYFYLQQTKYSKPHFTF
RKEIAVSLLKDSWPLILSGFVISIYMKIDQVMIQEMIGSEAVGQYAAAVR
ISEAWYFIPMVVSSSLFPAIINAKNQSDDLYYARLQKLYNLMVWMAIAVA
LPMTFLSDWIVNLLYGELYNEAGDVLMIHIWAGLFVSLGVARGSWIMAEN
LQLFSTYFIGIAGAINVIGNYMFIPTYGINGAAFTTLVSYAISVIVAPYF
FRATRHSVYMLMKAMLMFNLFRSLDE
>Cag_1714 Initiation factor 3
MKKPKVTTQKPKLTYRVNEQIRVPEVRVIFTDGTQKVMPTLEARRMAEER
NQDLIEVQPNAAPPVCKFDNLGKLLYKMDKRDKDLKKKQKATTLKELRFH
PNTDKHDFDFKTAHLEEFLRKGNRVRATIVFLGRSIIYKDKGLELAERLT
ERLSIVSNRDGDPKFEGKKLFVYFDPDKKKIDTYERIKAKTSQPFVPLAP
LSPEDLIEEPELESESDSDAEPESDN
>Cag_1667 conserved hypothetical protein
MALPSNLSFSIFFFVHWFSFFKRLHMRKNNERVGLRIVSLIQGSNRLELC
CQAADFGERETHLLEAGFDGNIAVSIMAEKSDNKIVVTLSAQTTAHCTCD
RCLALLSLPIHGTATVIFTCETVVDEAAITLDDYRSYNRQSEYLDLADDV
CDALLLALPMKITCTNNPYCRVFQAGESDATHNDASHPINSEWQEALDKL
KQKYSS
>Cag_0533 Hydrogenase maturation protein HypF
MQRQRLLINGIVQGVGFRPFVYRLAIEHELTGFIRNTALGVLIEVQGSAD
RLDSFCTALHHQPPPLARIASFEMHFIPCVFEEATFVITDSHSGGDVETL
IPPDIALCSDCRRELYDPTNRRYRYPFINCTNCGPRYTIVAHLPYDRPTT
SMQSFAICPECEQEYHNPLDRRFHAEPTCCPICGPSLSLLDADGNAIADD
PLEETASLLHHGAIVAIKGIGGFHVAVDALNDDAVLRLRKSKGREAKPFA
VMMRDVAVVERYCMVNDDERQALLSAEAPIVLLKKLANTQLAASVAPDND
RLGVMLPYTPLHLLLFDRVPEVLVMTSANMSEEPMVHENSEALQKLQGIA
DAFLMHNRPIYLKCDDSVTIHLAGQLRQLRRSRGYVPAPLLLRHSGATLL
ATGGELKNSVTLLKAHHAIMSQHIGDVKNFDAYCHFEQVVAHLQHLFQAT
PELIVHDLHPAYLTTQWAEKQAIPTLGVQHHHAHLAACLAENQVDEPAIG
VILDGTGYGTDGTVWGGEVLIGDVANFERFASLELMPLAGGEAAIRQPWR
AAVGYIYKSCGSLPDLPFLQNRDIAPIMELLERQLHLVETSSCGRLFDVV
AALCNLRGTITYEGEAAIALMHAANGTVGRQPFPYDLCYQLNRWIIGIAP
MIRSIVNSLLHGASAQEVSQRFHGTLVHCFCEVVQKASEATGLKIVALSG
GVFQNELLFTALVHALEQAGFTVLTHSRVPTNDGGLSLGQAIIGRRFLA
>Cag_1034 hypothetical protein
MTSNQLPTTQQPDDYDSPWKEAIEHYFPEFMAFYFPNAYTAIDWSTPYHF
LDQELRTIVPQSVQGKRVVDKLVKVQLLDGKERWLYIHIEVQGRREANFP
RRVFICNYRIFDQYGVPVASFVILTDTDYNWRPTSYSYEFAGCKHTLEFP
IVKLLDYEPRMEELLASDNAFGLITAAHLLTQKTSDNAFHRLDAKKQLIL
LLYEREWERDRIKELFRVLDWFLELPKELNQQLQTEIQQIEEGQKMKYIS
TFERYAKEEGLLEGIEKGKEIGVLEGIEMGKAEGLEEGLMQGRLEVAQRL
VASGMGKAEAALLAGISVDLL
>Cag_0422 Para-aminobenzoate synthase, component I
MLFHNPHEVLMLNAFDGVEDFFKKIEERVAAGFFVAGWLSYEAAYGMDSA
LAEMATAQTWQAPLAWFGVYKAPQRFTADEVAQLFPPSLTTAITAPHCST
TEIDHAEQVAAIREEIAAGKVYQVNLTARYHFSMAGEAPALFAALRQQQP
ASYTAFLNCGERTILSFSPELFFRTDGCAIETRPMKGTAPRGSSAEEDAH
LRLQLQQCEKNCAENLMIVDLLRNDLGRICTPATIKATKLFATESWPTLH
QMISTISGELRNNVSLYELFQALYPCGSITGAPKISAMQLIQQLEQSPRG
IYTGAIGYITPPSAQVSAQTMRFSVAIRTLELQGQHGIYGSGGGIVWDSV
AADEYCECQLKTKILESIAAPPFELFETMLWHDGCYLWLNEHLNRLANSA
KALGFAFERQATLQQLLAFEVELQQSPKKRCKVKLTLFRNGEVQLDAEAV
SPDLSGRLMLVTLAEKPVSSNEEAWLQHKTTLRHSYDSAFAAARAAGYDE
VIFCNQRGEITEGAISSIMVRHGSQLLTPSLACGLLNSISRRYLLATRPN
LREATLYPNDLVTADMLYIANSVRGIRPAVMEQEMKRIEK
>Cag_0890 Methylated-DNA-(protein)-cysteine S-methyltransferase
MPTTQPPPHKSRLLVQPTAIGRIAIAERNGNIVQLLFEGERVPFVYEEGE
SALLLEAFQQLDEYLLGKRTNFTLSLAPMGTPFMQAVWKALTTIPYGTTL
SYGALAVQLGSPKAARAVGMANHRNPLPIFLPCHRVVGSNGRLVGYRGGM
ALKQQLLELERRVVGNTALHL
>Cag_1690 hypothetical protein
MGRGLVAKFVITILFSQRDVVLRFSLGSSIISCCKSVYKYSALNGNQQFF
ESSLSVPDSGKAIFSRILNDNKINMKTIGAC
>Cag_0960 Methionine sulfoxide reductase A
MQYNHLTPDEERVMLHKGTELPFSGIYYDHHDSGTYHCRRCNTALFHSDN
KFNSGTGWPSFDDAIEGAVRQIPDADGRRTEIVCANCGAHLGHVFFREGF
TTKNVRHCVNSISLNFQPQATQATIPTTTQTAVFAGGCFWGVEYHFSKLK
GVLSVTSGYTGGAIENPTYQQVCSGKTGHAEAVEIVFDAAQVSYETLAKL
FFEIHDPTQVNRQGPDVGTQYRSALFYANEEQRRIAEKLIGELQAKGYRV
ATSVEPASQFWAAEAYHQEYYARHGHEPYCHVYTKRF
>Cag_0942 carbon-nitrogen hydrolase family protein
MSKESVSIAVVQSECKGDAVANRAEATAKIREAAALGAQIICLQELFVTR
YFCQTEAYEPFGEAEAIPDGATTRLMQELAAELGVVIIASLFERRARGLH
HNTAVVIDADGSYLGMYRKMHIPDDPGFYEKFYFTPSDLGYKVFKTRYAT
IGVLICWDQWYPEAARLTALKGAEILFYPTAIGWATDEDSAEVRHAQQNA
WITMQRSHAIANGVFVAAANRVGTEENLEFWGNSFISDPFGQMVAEAPHQ
HETILLAQCDLSRINFYRSHWPFLRDRRIETYGGLQQRFLDNNQ
>Cag_1522 Histidinol-phosphate phosphatase, putative, inositol monophosphatase
MKNPFDNPHRLFAFTLINQASQIALTYYGNQSLKVDTKRDASPVTIADRK
AEAFIRKELELHYPDDGILGEEFGEKLSQNGRRWVIDPIDGTKAFIHHVP
LWGMMLALEVNGEPHLGIIAFPALGTIYHAVQGEGAYEKETPISVSSVTS
VADATIVFTEKEYLLDSPSNHPVDMLRNSGGLVRGWGDCYGHMLVASGNA
EVAVDKIVSPWDCAAVIPIVTEAGGCCFDYKGNKSSSGEYGLVSTNRQLG
EQLLQEIAGKG
>Cag_1537 Elongator protein 3/MiaB/NifB
MLVNNHNLDDNTQPRLKELISTSSKSRKKWLLIQPKSSTSMMVDSGSVSM
PLNLIMVATLAGKLFDVTLLDERTGDVVPEDLSAYDVVAITSRTLNANQA
YRIADKALAQKRIVLLGGVHPTMQIDEALTHCTSVIYGEIESIWDELASD
ILNGSMKQIYRATQLHSMTTMQRPDFSYALNSQNAKRYNSRIPLLATKGC
PVGCNFCTTPTIYGKSYRYREIDIVIDEMKYHQDRVGKQNITFSFMDDNI
SFRQNYFMELLNEMSQLGVNWNANISMNFLYNPEVAELAARSGCDLMSIG
FESLNPDTLKSMNKGSNRLQNYSTVVSNLHKHGIAIQAYFMFGFDDDTEK
SFDAVYEFIMENRIEFPVFSLVTPFPGTPYFDEMKPRMFHFDWDKYDTYH
FMFEPKKIAGQQFLSSFIKLQKEVYKKKAIMKRMQGKPLNWIWLANFAMH
NFTSKLKPEAYL
>Cag_1062 hypothetical protein
MMQNYGMYKIYQWEAIPYSSHNDKWNCDQIDLINRPIESVYFHRNQDLSI
SMIAYIKANALNPKATYQLGELHSNFESIKIHHTSGAQGTATITHIPKNT
TKFDNNGNPINESIINTLELEIKYNNLPISYTIEWITNLKSEGVWHWPNV
VNTKLQGKFNMDFSSKSHTISIEKDNFLEFNNSLSCLQFSFEDELIFIGE
TKTDEIEKKYHPGFILYQGNPSQEKMERIRLALSFLLGRFLPSLGYTAFD
EKWDIVSYKCITPYDLDGSVYNSSTMPPIKLKTINFFINTNIVTRSVNAF
YENFLKFDLKYISYLYWNAINSPAHIKASQFGATLEAIQRNYRQVHANDF
QTALFKHEEWKIIQKALLDSLDKVLNSNSDNKEIDEKKIIKNKIYSLNQT
PQSKLTNRFFDLIDISLSEIEENAFKQRNYSAHGIKTTQEDFSVIKNNYI
LMTLIHRILIKILNISQNYIDYYAIGHPSRNIKEPIGG
>Cag_1609 hypothetical protein
MGKLKAHQKLPAEWNEREILRDKGQFWTPSWVAEAMVAYVTENTDLVFDP
ATGRGAFYEGLLKLNKQNISFLGTDIDPDVLSDEIYNKENCFVENRDFIK
DPPNRKFKAIVANPPYIRHHRIDEATKILLKKIAISITGNSIDGRAGYHI
YFLIQALNLLEKDGKLAFIMPADTCEGKFAKNLWEWISEKFCIECVVTFD
ERATPFPNVDTNAIIFLIKNTKPQQTLQWIRANQAYSDDLLQFVTSNFKL
IEFDSLEITTRQLKEGLTTGLSRPEQNHNGFKFHLNDFANVMRGIATGSN
EFFFLTSEQVKELNIPKDFLKRAVGRTKDASESVLSLKNIEDLDRENRPT
YLLSINGQESFPKPISDYLKVGEEMGLPTRSLIQQRKPWYKMEQRKVPQI
LFAYLGRRNTRFIKNEAGVLPLTGFLCVYPIYDDQEYIDNLWQALNHPDT
LENLKLVGKSYGSGAIKVEPGNLNKLPIPEHIVANFNLKRPYKNAYEQLE
IFREPKTKYGLKKRKTAGNKC
>Cag_1088 hydrolase, alpha/beta hydrolase fold family
MLHYKTYLHHNPKAAWVVFVHGAGGSSSIWYLQIKEFMQHCNVLMVDLRG
HGRSKDMGIPEGMRRYDFNCVTRDIIEVLDFLRIEKAHFIGISLGTILIR
NICELAPERVASMIMGGAIIRLNLRSTVLVTLGNTFKHVMPYMWLYRFFA
WIIMPRARHKKARNLFVNEAKKVEQKEFLRWFSLTYDLTPLLRYFEEKDA
ATPTLYLMGDEDHMFLPFAKRIVTRHTYATLEVIANSGHVCNVDQAREFN
QRAIRFIKLHS
>Cag_0073 deoxyhypusine synthase
MSSGFKKEELLSAPVRHIDMKSLNIVPLVDQMADTAFQARNLARAASIVD
LMQQDKECAVILTLAGSLISAGLKQVIIDMLEHNMVDAIVSTGANIVDQD
FFEALGFLHYKGTQFMDDAILRDLHIDRIYDTYIDEDDLRVCDDTMAIIA
NSLEPKAYSSREFIIAMGKYIADNNLDKNSIVYKAYEKGVPIFCPAFSDC
SAGFGLVHHQWHNPEKHVAIDSVKDFRELTRIKIANDKTGIFMIGGGVPK
NFTQDIVVAAEVLGYEDVSMHTYAVQITVADERDGALSGSTLKEASSWGK
VDTVYEQMVFAEATIAMPLIAGYAYHKRNWEGRPARNFNAMLDSQQASA
>Cag_1963 conserved hypothetical protein
MPLMALLLLAFSLPAFAADAAPLAVQSDWWIWVLGLFTFSFFLGIIAVIA
GVGGGVLFVPIVSSFFPFHIDFVRGAGLLVALAGALSAGAPLLRKGLANL
KLALSMALIGSISSIAGAMVGLALPENIVQLSLGATILFISVIMLLSKNS
AYPDISKPDSLSQALHIHGIYYDEQLKKDVSWQIHRTPIGLLLFIVIGFM
AGMFGLGAGWANVPVFNLVLGAPLRVSVATSVFVLSINDTAAAWVYLHQG
AVLSLIAVPSVAGMMLGTKIGAKLLTKVHTSVVRWIVIALLAGAGLKAFL
KGLGI
>Cag_0210 conserved hypothetical protein
MAATFFISDLHLALQASNVEAQKLDKLQVLFERIGNEGGALYLLGDILDY
WLEFHHVIPKEFSRFFCMLSSLVQQGVQVFYVAGNHDFELGSYFMQSLGI
TTAYGINEVTIEGKRFFVAHGDGLGKGDTGYKIFARLIRTRFSRFVLQWL
HPDLTIGFMKWVSQLSREHKPTNSSFEVDRLLHFAHSLRAEQEFDYFVCG
HNHVQGVHELAEGSRYLNLGTWINGNYSYGVFKEGKFDFFDL
>Cag_1370 Cytochrome oxidase maturation protein cbb3-type
MNSTFFLIGIGFFIGVAAWLLFIWAVKSGQFDDPEAPKYRMLQDDDNPKP
PRKP
>Cag_1126 conserved hypothetical protein
MITHIDHIAIAVQNLDNALDTFCTILGVDRQSVHIEEVASEKVRVGCIPL
GASTIELLEPLEAGSPIEKFLATNGDGLHHIALATDNIETETARLVESGM
KPLSPPKEGAGGKRIVFLHPKQTNRVLLECTQKNG
>Cag_1340 hypothetical protein
MSAIKKLFGILWALMGVGIIPLAIQQAMKEIAEKPSEENWIFWSIVMVVL
MPIIAFSLITFGIFALKGEYDTIE
>Cag_1064 Molybdopterin binding domain
MITVEAARTLVQQSIQPLGTEQLPIAEAFGRITAEAIHAPFALPRFTNAA
MDGFAVRWDDIAQASDATPITLTVQEMIAAGSEPTVAISQGCCSAIMTGA
PMPQGADTVVPFEQTSGFGSNSVTIFKAPKRQANVRYAGEEVAANELLVE
NGVALNPAALSVLASFGVAQLKVRRQPRIAIITVGDEVQLPGKPLIGAQI
YNCNRFMLDAACRSLGIIPTFIHHAPDNREVLRHSLGMALTMCDMLLTAG
GISTGEFDFVQSELTALGINKHFWSIAQKPGKPLYFGTSHEGKAVFALPG
NPISAIVCFAAYVVDALALMQGKTLSTSRFTATLAEPFPTDKKRYRFLPG
MVWVDRGQLFCKAASKIESHMITSLSGANCLLEAEAAQYDRPAGELITCT
MLPWGKVC
>Cag_0689 PucC protein
MKKFNLVRLSLFQMGFGIMLGFLLDTLNRVMTTELRISATIVFGLISLKE
LLAIFGVKVWAGNLSDRSQIFGLRRTPYILLGLVSCVFSFIMAPTAAYEV
TVGGVGFAEMIPAMLQDVGLLKLALIFLLFGFGLQVATTAYYALLADTVD
EANLGKITGASWTLMVLTTIVSTRIVGSYLDHFTPERLLFVAEVGGFIAL
ALGLIAVLGVEQRNGEIKEGKEKHALSFAQSLQLLTSSPKTLFFASYIFI
SIFALFANEIVMDPFGAHVFDMSVGTTTKLFRPTMGGMQLLFMLIVGFLL
GRIGQKRGALIGNIMCMIGFGLLIGAAFSRDEQFLRIALVVTGIGLGASS
VSNISMMMAMTAGRSGVYIGLWGTAQSLAIFIGHLGAGMIIDVVYHFTGQ
YVWAYAAIFAMEIVAFAIATLMISHVSKEEFEAESKAKLAELALSAKG
>Cag_0121 conserved hypothetical protein
MVLMEQTTSQAQDVIVRLKKVSGQVDGLIKMLEREDECMRIITQFQAVKA
ALDSTFSLILHRNLRECVSRDNTESMERILKLISKQ
>Cag_1128 alcohol dehydrogenase, iron-containing
MNFFLSTDLVIGVNEALKLNEHLSTMPIAKPAIIYDANVASSLYFQEVLK
NLQAAFPNAVTYCNDFAGEPTYAHLENVATMFRSKMPDAVVAIGGGSTLD
LGKGIALLMTNDVPALSLKGFPTGVNDPLPLVTVPTLLGSGAEVSYNAVF
IDEAEGRKLGINSRKNFPKKSVIDPQLSMTAPIESVIASAMDSLVHCVDS
FGSIKHTPLSRIFSIEGFQRTFYALQQDQLDRAESRLDLAIGSICGVVAL
MNSGDGPTNGFAYYLGVKHRVPHGLAGAIFLKEVMRYNHKNGYEKYALLN
PMRSTPLAKEVTAELLEEMDALYKQLRIPNLVPYGYGKGNVDEFAKNASE
ALKGSFSGNPVPFTPESAREVIYQLT
>Cag_0138 Threonine synthase
MIYFSTNKSSVPVSLKKATLEGLAPDGGLYIPSEIPHFSEAEIKQLKGAS
FTNVAFAIAQKFAGSDIAAERLRAIIEECYNFPTPLVPLGSNTFIEELFC
GPTLAFKDYGARFLARMTGYFAEEEHKLITVLVATSGDTGSAVAYGFQGI
KNTRVVLLYPSGKVSPLQEQQLTTAGDNVFALEVQGDFDDCQRMVKEAFV
DPTLHEALTLTSANSINISRLIPQSFYYAWASLSIDEAFGLQNPIFSVPS
GNYGNVTAGILAKKMGFPIGHFIAASNANDSVTRYLVHGRYEPHPTIATL
STAMDVGNPSNFARLRYFYNNDFAAMADDISGIAVSDEETRATISKVHKQ
FGGYIMDPHTAVGYRALEKYRESTNNANTPAIVLSTAHPAKFLEAIEATL
QIPVPIPDHLQALMAKEKRSTIIGAHYQELATFLKRLDR
>Cag_0201 conserved hypothetical protein
MELFYTPTEQIHPATNQVSIEGEEFHHLVRVLRKREGELILVTDGNGLRC
EVRIASISKHDVQGEIISTTTIEPPSTRVTVALSLLKNPQRFEWFLEKAT
ELGISTIIPMVTARTVVQPSNDRVHNKLKRWRTIVLSAARQSKRYYLPQV
VEPQRFSDVLNRSGYDERCMPFEAATSSPTLHCAGKNILFLIGGEGGFSA
EEVQQAEATGCRTMSLGTTILRAETAALFAVAYVRSQLLTEAPSAWL
>Cag_0944 thioredoxin reductase, putative
MTLNHFYCSIEEEIRDLTIIGGGPTGIFAAFQCGMNNISCRIIESMPQLG
GQLAALYPEKHIYDVAGFPEVPAAGLVDTLWKQAERHHPEIILNETVMRY
HKRDNGMFEVSVSSGHTYVSRAVLIAAGLGAFTPRKLPQLENIEALEGKS
IFYAVKNVADFTGKHVVIVGGGDSALDWSVGLLKSAASVTLVHRMHEFQG
HGKTAHEVMEARDAGRLNVMLDTEVMGIDVENEELKAVHVQSKNGKTRTF
PADRLLLLIGFKSNIGPLAEWGLEIVDNALVVDSHMKTSVDGLYAAGDIA
YYTGKLKIIQTGLSDATMAVRHSLHYIKPGEKIKHTFSSVKMAKEKKKGM
QNG
>Cag_1935 Succinate dehydrogenase, cytochrome b558 subunit
MRSNLFLSSLTAKVLMALAGLFLLTFLFLHLTLNLLLLLPDGGQAFSTAA
AFMGANPLIKIAEVFLIASFVLHGVLGFMLSAKSRAARPVAYATANRSDT
ALFSRYMLHSGIIILLFLLWHSVDFYFIKLGIVLPPAGIEPHDFYQRALL
LFTHPGISLFYLLSFIALGVHVNHAMQSAFQTLGLHHSRYVDGVKLASSL
IAIVIATGFSIVPIVLCFFRP
>Cag_0396 conserved hypothetical protein
MKPDSKVVLINAVVIISFGLLSAWHNSRDTTPSVTPVTQEVSVEPTETSP
SLTPNVPVTTTPEPAAPAEVVPAKPSVSVTPTKPKSSHVLKPRVLIAKKP
RPAASAEVVLAEPSVSVTPAKPTASPVPEPQVSVPAKPEPAAPAVAVPAE
PSVSVTPPTPTTANVSEPQVPAPTTSQPAQ
>Cag_1472 glycosyl transferase
MNILFMNSARTWGGTEKWTHMAAESLAHEHKVALVYRKNVVGDRFTTSKF
QLPCLSHVDVYTLYQLTRIIRQEKIEVIIPTKRKDYLLAGIASRICGISN
ILRLGIVRPLKLPIIHKLMYHSLVDAVIVNAQQIKTTLLQSPFMVADKIY
VIRNGLDTTQLNKKSQPIAPKIFDFQISTVGILTKRKGHDFLLRGFAQFI
NQEPNANAGIVIIGDGVLKNELQELVEKLHLTQHVHFTGFVENPYPLMAA
SDVIAMLSTNEGISNALLEGMYLENVPISTFVGGTTEFIQDGKNGFLIDY
GNENKLATTLLTIYNNNILKENISLAAKSTILTQFSLTRMTQTLTQLCKT
TIKNKAAQHAAYN
>Cag_0602 fructose-1,6-bisphosphatase
MSNLITIERHILEQQKFFPEAHGELTDLLTDVAFAAKLVRREVVRAGLVD
ILGLAGSTNVQGEEVKKLDLFANEQIISAIGAHGRFAVMGSEENEEIIIP
TNNESGNYVLLFDPLDGSSNIDVNVSVGTIFSIYKLKTSDPAKASLADCL
QAGSEQVAAGYVIYGSSVVMVYTTGHGVHGFTYDPTIGEFLLSDENITTP
KRGKYYSMNEGSYAQFNEGTKRYLDYIKTEDKATNRPYSTRYIGSLVADF
HRNLLTGGIFIYPPTGKHPNGKLRLMYEANPLAFICEQAGGRATNGKERI
LDIKPTELHQRTPLYIGSTDDVMVAEEFEQGKR
>Cag_1653 NifU protein, putative
MSKDYLPSSDSLYDRVIAALETVRPYLQADGGDCQLVGISKDMVVDVKLL
GACGSCPMSTLTLRAGVEQAIKKAIPEIVRVESV
>Cag_1543 conserved hypothetical protein
MQDSDGQNIRVELSLLEKNIEQLVLQLTDCRKENEALRSELASLQNILRS
CKLPGSGSAQSPTDGSMSEGALSGSEKMQFKQRLVLLLQKIEMELRNSQP
L
>Cag_1106 glycosyl transferase
MIFSYQPTVSIILATFNRAHYVAHAVQSVVEQTIDDWELLIVDDGSDDAT
FDVVAPFLAQHSNIRYMKHKNRNAALSRNAGIQASFGQYITFLDSDDRYA
PNHLASRLKIMAENPAVDLLSGGFWCEEPTLVKDRDNPKRLINIRECIVC
GTLFGKRELFFDLEGFRNVAYAEDTDLWERASLRFVTKKIAAPESYLYQR
ALDSITLTYQATMNG
>Cag_1432 possible virulence-associated protein
MRLPKEFRVSTKDLFIRQDEVSGDIILSQRPHSWNGLFELDKLEKSPIDF
MNNNDRNLALHNRDPFNGYAE
>Cag_1994 conserved hypothetical protein
MPNTLTLNHLSREEKLQMMDLLWDDLSFNQEALDSPNWHREALQETEARV
NAGAEQLMEWSAVKKILRNECK
>Cag_1681 cytochrome c biogenesis protein
MTSNFTIAATIAIACWAIGILARLFSRRIPALRLPAQLFPLIGSAVMVGL
ISWYWIALERPPLRTLGETRLWYAMLIPLAGSLVEYRWKIGWLNYYCMAL
ATFFLAINLLHPEVFDKSLMPALQSPWFVPHVVVYLVGYVLLAAASATAL
HNVVLHLQAKENLNGTMVAHYLALLGFVFLTFGLIFGALWAKEAWGHYWT
WDPKETWALLSWLAYLGYLHAWHYNIESRKLQWYLALSFVVLLICWFGVN
YLPSAQNSVHTYSQS
>Cag_0953 polysulfide reductase, subunit B, putative
MARYGMVMDMRSCVGCQACLAACATENHTPFWSGKFRTHVEDKEKGSYPN
VRRILLPRLCMHCENTPCMSACPTGATWKNKDGVILVNYDRCIGCYACCI
ACPYDARYAYNNHDVEEAEKLYGKLSSHTMPHVDKCTFCDHRIAAGREPA
CASTCPTHSRIFGDLDNPASEVHQLAVSGKATALNAGLGTSPKVFYIPS
>Cag_1405 conserved hypothetical protein
MAFKIRDIEVALEKKGFKRVESDHSYFIYFTIENKKSRVRTKTSHGHKGQ
ALSDNLFSVMAKQCKLNKNQFSELIQCPLSRNDYEKILDMQGMVK
>Cag_0294 conserved hypothetical protein
MNSEILIYQNQTGDITIDVRLEEETVWLTQAQLCQLFQKSKATISEHIKN
VFEEGELDEKVVVRKFRITTQHGAMVGKTQEMEVNGYNLDVIISVGYRVK
SQQGTQFRIWATKRLKEYIIKGFVLNDERFKSGNAMNYFTELQERIREIR
LSERFFYQKIKDIYTTSIDYEPRAEKTIEFFKVVQNKLLWAISKQTAAEL
VYRRANAELPLMGMQSYDKKGAASIKKAEVSVAKNYLNEDEIKLLGLLVE
QYLAFAETMAQQRTPMYMKDWIDRLDTILQLNGRELLTHAGNISHDMALK
KSEVEFEKYRLSLKAVEKEESLRELEEDLKQLTKSAS
>Cag_0227 Magnesium-chelatase, subunit H
MRDKIRIAAIVGMEQCNQRVWREVTAKIADHAELTQWTDQDLEHQNPETA
EAIRNADCIFTTLIQFKGQADWLQEQIEQSNVQLVFAYESMPEVMQLTKV
GNYVVSGDGGGMPDIVKKVAKMLVHGRDEDALYGYMKLLKVMRTMLPLIP
KKAKDFKNWMQVYTYWMNPTGDNLASMFNFIMAEYFSVGVKVEKVQEVPT
MGFYHPDAPEYQKDLHHYEKWLHKHDRHATSKRNVGLLFFRKHLLQEKEY
IDNTIRAIEAKGLNPLPIFVMGVEGHVAAREWFTHNNIDMLINMMGFGFV
GGPAGATTPGASAAAREEILGKMNVPYVVSQPLFIQDITSWTSQGVVPLQ
SAMTYSLPEMDGAVCPVVLGAIKDGRLQTVPDRLERLAGIAKKFSELRHT
DNSHKKVAMVVYDYPPGMGKKASAALLDVPKSIYNILLSLRAEGYNVGEL
PESPEALLAMLDKATDYEIQAHEPDCFAINREQFNSITTVRERERIENRW
NGFPGEIAPIKPDNLFIGGITLGNIFIGVQPRLGIQGDPMRLLFDKENTP
HHQYIAFYRWISRIFGAHAMMHVGMHGTVEWMPGLQLGVTGDCWSDALLG
EVPHFYIYPINNPSEANIAKRRGYATMISHNIPPLARAGLYKELPNFKNM
LNDYRERGLQSQADAETEEVILTKAQQLNLTDDCPRTEGEDFQDYISRLY
TYMMELEGRLISNSLHVFGETPKADTQLTTVTEYLKVRGNEKSLPSIILY
AIGEGEKWGDYATLATHARQGESEAMRVREIVDEHTRTFIDETIFGRSNP
STVFNTITGGSRVSQEMAEALNSALQDGLALKRALEDNSNEMKGLLRALC
GEYIPSGAGGDLVRDGAGILPTGRNIHAIDPWRIPSELAFKRGKQIAESI
LQRHVEENGEYPETIAQVLWGLDTIKSKGEAVAVIVHLMGAEPAYDAQNK
ISHYRLVPLEKLGRPRVDVLIQISSIFRDTFGVLVDHLDKLVKDAARADE
PHEMNHIKKHVDAALAEGKDFESATSRLFTQAPGAYGSQVEELVEDSAWE
SEEDLDNMFIKRTGFAYGGNRYGDQQTDILQGLLSTVDRVVQQVDSAEFG
ISDIDRYFSSSGALQLSARRRNPKGDSVKLNYVETFTADVKIDDADKALK
VEFRTKLLNPKWFETMLEQGHSGAAEISNRFTYMLGWDAVTKGVDDWVYK
EAAETYAMDPKMRERLMKVNPKAFKNIVGRMLEASGRGMWSADPDMIEKL
QEIYSDLEDRLEGIEV
>Cag_1691 serine esterase
MLTRHHHDLTYLEYATPTLGNNAPLLVMLHGYGSNEKDLISLAPMLPDGL
RIVSVRAPLTLAPEMYAWFSLEFLADGIRVDEAEARAACERFVLFLRDLI
TRYQPAGSKVFLMGFSQGSVMSYLTAFLAPELLHGVIACSGQLPEKNMPS
ESAFALLRTIPFVVLHGIYDDILPIEKGRHAHAWLQQQVDDLTYREYPIA
HQIADDGIALISSWLTERLEKVGNSRL
>Cag_1592 conserved hypothetical protein
MPCVTQRIMAPSECLLAILTRNPELGQVKTRLAKAIGKEAALHIYELLRH
RTAEVAQALASERMVFYSNYLPTSDCFSPTHFHYSLQAGADLGERMHHAL
ASGLTAGFRSVVLIGTDCYDITPEILQAAFVALERYEVVIGPATDGGFYL
IGMKQPMPHLFFQRKWSTSSVLKESCIRLQQAGTKYALLKELSDIDTLED
LQQSSLWLTPELDALRSLFEEKAAQPTRQQP
>Cag_1122 Restriction endonuclease S subunits-like
MATMSEWKEYKLKDLGLLQRGRSRHRPRYAFHLYGGKYPFIQTGEIREAS
KYITKFEKTYSEEGLKQSKLWPKGTLCITIAANIAELAILNFDACFPDSV
LGFIPNDKIANADFIYYILTHFQKELKHIGEGSVQDNINLGTFEDLLFPI
PPLPEQRAIASVLSSLDDKIELLHRQNATLEKMAETLFRQWFIERKSLNY
DSYDLLDEHDLKNQKNHNNQKNHSSDNGEEAIEEWKIGKVSDYALHLKDS
IQPQKNQSTFYFHYSIPSFDNDKNPIKELGKEIQSNKYKAPRYCILFSKL
NPHKDKRVWLLQNEVEKNAICSTEFQVVLPIKRQYLYFLYGWLTLNDNYN
EIASGVGGTSGSHQRIDPNTIYDFQCPLVTESVIEKFNIQIKPLFKKQVI
NQTQIRTLTALRDMLLPKLMSGEVKVDY
>Cag_1496 Nuclear protein SET
MPPLFSAECLTPDSFNCLLIIAAVVLLLGIVLGFVVALKAPWIHRKTVIA
KASTVSGRGVFALVNFREGDIIERCPALEVRDRDVDGELLNYVFYGSTEQ
HRLVAMGNGMLFNHDNNPNVAYYREDTPLGAELVLYALRNIRKGEELFYS
YGEAWWATRQNG
>Cag_1135 long-chain fatty-acid-CoA ligase
MALINPAFTTLPELFASVFAHYRSNTKAFPVSRKINGVYTPVSYQALYAD
EQKLQAFLKQRGVTANDRVAILSENRPAWYLADMAILNLGAITVPLYPSL
PANQIEYILNNCSAKAVIVSNSLQLSKILSIRNQLTSCEFIVMLNRQTEQ
IEGVTDLNHAKEEGKKLLAANPSFLTPSPAKPDDVATIIYTSGTTGLPKG
VMLTHRNLCENVKSCSTVLELNESDCALSFLPLSHAYERTGGYYLLFACG
TQIYIAESIETVSLNMSEVKPTIIFTVPRLFDRIRAILLKQIGEQPPPAQ
KLFEWALQTGEEYYQALSSCGSAPPLLAMQHNLASQLIYKKIHQKMGGRL
RYFVSGGAALPQKIGEFFQALDVPILEGFGLTETSPVTHVNRPEKIKYGT
VGKAINNVETRIAEDGEILLKGPNIMKGYWNDEEATREVLKDGWFYTGDL
GEIDSDGYLKITGRKKHIIVTSGGKNIAPLPIENLIAENPFIGQVLVIGE
KRPFLVALIVPAFPHLQAHARKESIQATTNRELMEHKKVQQIYEELLRTI
SMQLATHEKIRKFILIENPFTIENEQMTPTLKLRRDVIINAYANDIENAY
NSLNMCYNAE
>Cag_1646 Uridylate kinase
MLHYKRILLKLSGESLAGEDGYGINAGVLDQYAEEISEARDMGAQIALVI
GGGNIFRGVSQAAANMDRVQADYMGMLATVINSLAFQDALERKGVFTRLL
TAISMEQIAEPFIRRRAIRHLEKGRVVIFGAGTGNPYFTTDTAASLRAIE
IEADIIVKGTRVDGIYDSDPEKNANAQFFPDISYLDVFHKKLGVMDMTAI
TLCSENSLPIVVMNMNEKGNLSRLLRGEKVGSLVHHEGV
>Cag_0430 conserved hypothetical protein
MRNLRNLGFCYGQNMMWLHRRKHPRLAQLLRSALPIQLNESSKIVILSDL
HMGDGSRFDEFRTNAELVYTMLHNYYHPRHFSLVLNGDIEELLKFPLYAI
ETQWNEFYPLFRSFQHNGFFWKTWGNHDAPLLDEKEYQLSDYLLESLKFH
YKEESLLLFHGHQASVFMWETFPMVSHLAVLILRYLAKPTGIKNFSVAHN
SRRRFAVEKAIYEFSNREKIVSIIGHTHRPLFESLSKLDFLNYKIEDLCR
HYPTTIGEERSTVRQQMQTMKKELEACYQQGKKIGLRSGIYHTLTIPSLF
NSGCTIGKRGITALEIEGNTIRLVGWYNSKEQPRFATNHNQTPHQLASTN
YYRTILNEEPLDYIFSRLHLLA
>Cag_0998 hypothetical protein
MRVSIHAPAKGATKVKVKREIEGMFQSTRPRRARPKVNDRRNKNTVSIHA
PAKGATKKQLVSYTAFTVFQSTRPRRARHRLMLFACYSPRFNPRARGGRD
SEILRRNHELQSFNPRARGGRDAVQTQQTTINGVSIHAPAKGATVIRSLF
ENEAAVSIHAPAKGATLMQ
>Cag_0692 membrane protein, putative
MDNASSSRLLKTALLLGIITIGYNVIEGIVSVWFGLQDETLALFGFGIDS
FVEVLSGAAIVHMVVRMQKSAVHARDQFERQALRITGWAFYLLAFGMVVG
AAVTLLQQHHPETTLSGIMVSLISLASMWWLMQAKMKVGNALHSEAILAD
ANCTKTCLMLSGLLLASSLLYELFHIAQVDAIGSIGIAWFAWSEGKEALE
KARSGKLGCCCSGSCKG
>Cag_1571 hypothetical protein
MPHCIHRAVHSFAEITNLPHPADLVGTWRRFGLLGPVYEIICMGNTLPNG
DVMMRVRVVESGEELDYRFADILDDPKER
>Cag_1982 putative ferric uptake regulator, FUR family
MLLTLSLLMQSTGANNNLEQAEKLFRQLMKEQGFRCTNERIAVLHELYNA
ESHLDADELYVLLKQKHVAISRATVYHTLHLLFKFRLISKIDLGHKHAHY
EKAYGVTNHLHMVCSICGRVTEVDDNRIAPQLQEICRHNGLNLEHFSLQL
VGTCIKHDSPIPTNN
>Cag_2006 Glucose-inhibited division protein A subfamily
MYDVIVVGAGHAGCEAILAAARMGATCLLITSDLTAIARMSCNPAIGGMA
KGQITREIDALGGEMAKAIDATGIQFRLLNRSKGPAMHSPRAQADRTLYS
LYMRTIIEREPNIDLLQDTVIAIETKGECFAGVRITSSRVIEGKSAILTC
GTFLNGLIHVGMNHFAGGRTIAEPPVVGLTENLCAHGFQAGRLKTGTPPR
IDSRSIDYRKVDEQPGDIEPILFSFESKGALQSKQVSCFITKTTEETHAI
LRKGFERSPLFTGKVQGIGPRYCPSVEDKIFRFPDKNSHHIFLEPEGIDT
NEMYVNGFSTSLPEDIQLEGLHSIPGLEQVKMIRPGYAIEYDYFYPHQIQ
ATLETRLIENLYFAGQINGTSGYEEAAAQGLMAGINAVLKMRKHEPLILT
RSDAYIGVLIDDLITKETNEPYRMFTSSAEHRLLLRHDNADIRLHEFGYR
VGLLPEHRYQATRQKIEQINAVKTLLSQIRLESGLANKLLQELEYGEVTG
AQQVTTLLKRPRVTLAKLLATSPELHSQLSNISNNPLVYEQVEIDCKYEG
YLKRDALVAEKIQRLEAHHIPALLDYHAIAGLSNEGREKLKKHRPENIGQ
ASRILGVSPSDISILMVHLGR
>Cag_1607 Phosphoenolpyruvate-protein phosphotransferase
MSSRSRREERTPNACNQSETLYNGIGASKGIAIGECYAFIKEENTHEPRE
LNEKNCKEEVERFLTALSRSEHELKKIEQVTIRKLGKSYSNLFQAQIMML
HDPVLVETISKRIVSERKGAHLVIDDEFNKYLQHFKNSDHTLFQERADDL
LDIKNRIIRNLDIRKLHSWIPEGVIVCCDTLSPADIILLSRSNIKGFVTA
TGGKTSHISLICKSLKIPMVVGLGQIADKVATGMAVIIDGANGTVITNPS
AATLEEYFLKQKNEQQSSSNLQAAMLPATTQCGVRVSFYANIDFREEAFS
LAAAGAEGVGLFRSENLFTEGTKTPKEEEQFTCYRAIADAIAPMRLDVRL
FDIGGDKLLYSPVKEINPNLGWRGIRILLDLPEILDTQIRAILQANTHGN
IDILIPMVMSLHEIRTVRESVERQFAELQAERDGQITQPGIGAMIELPAA
VELIEEITQCVDFISIGTNDLTQYTLAVDRNNVIVQDLFDRFHPAVMRQV
HRIIQVANKNHCCAMMCGDMASDSLALPFLLGCGLRNFSVVVSEIAELKA
HVARYALAETEALAQECLALNNPAAIKARLEAFQAAH
>Cag_0328 ATP-dependent DNA helicase RecG
MDGTTSVAFLKGVGSRKAVVLGEVGIVTVDDLLAYYPRRYLDRRSIKRVR
ALVDGELTTVVGTIVRTQLEQPTSGKARFKAWLDDGSGLLELTWFRSVRY
FSRFFTKGESLAVHGKVSFFGNQAQMQHPDYDRLTPENAVGGEKGSDDFA
LFNTGAIIPLYHTTEAMKQAGLASRQLRVLIKRALEEVPFREQENLPLSI
IRQYGLIPQWEAEREIHLPSSPEKLEQARYRLKWTELFYAQLLFALRRST
LRRNRAAVRFTHSGELTRKLHESLPYQLTEGQKQAVRDIYRDLRSGSPMN
RLLQGDVGAGKTMVAMFAMALAVDNGLQAMVMAPTEILAVQHALVMKRFF
APLGIELGLLTGKQGKKERRATLEKLRTGDMQLVVGTHALLEPDVQYANP
GLVIIDEQHRFGVLQRKALQEKAANPHVLLMTATPIPRTLSMGMFGDLDL
SIIRDKPVGRQPIKTVLKKEQDKPSVYHFVREQIAAGRQGYIIYPLVEES
EKMDLKAAVESYEELSTAIFPDLSIGLIHGQMSPDEKEHVMERFRQREFS
ILVGTTVIEVGVDVPNATVMIIEHAERFGLAQLHQLRGRVGRGEHPSTCI
LLTAKMTADARERLLAMVSTNDGFVLSELDAKIRGVGNLLGKEQSGTLSG
LRIADLNTDEAIMAAARQAAFTLVEADAQLRATEHRMVREHYMRYYHERF
SLADIG
>Cag_1597 hypothetical protein
MSTNTAPRISIPIELSFATPIRYTTGDKPSDVVIGLLGADTYPDLIVANA
GSKSITVHFNNGLGEFTSSISYASFYNSPPLALSVAKINNDAYGDVVAIT
NSSVSIFLSNENGTLQTPTFYTNNGWQSLTAVATGKIDTDTDIDVVVTDA
TTNKLYVVQNDGTAVLNNQTPPSYATGNYPIAVTLGDVDKDGWLDALVVN
NDNTSTPTLSVLINNKIGGFKTKVDYTLATGALDVTTADLNGDGWLDIIV
GQSQEQGNTLVLLNKGDGTFGNTQSYSAGAYPLGVAVGLLGNDNRADIAV
ATSGEKTFAVLQNQGNATFNAPNTFAVVSTIDTPKPTDIAVGDLNGDGKN
DVAITSEHLDSVSLLLNTTFQTRNFTEQTPLLIAPTILIEDPENNWIDGW
LHVAITKNGEAGDDLVLQTSFNEESDLTDYIWFDKVNNGVRAGSSIILGR
FENKVTSTLGKEVGKIEIHFTNPNYKTNEWVQKIAQSILFTNESDTPSTA
TREVTITAIDASHQTSSITQQIAINAVNDAPQELFIEGVPIVGQTLHADT
STFSDADGLNKANITWQWLRDSVNITGATNSTYTVTNNDLGKSLSLQAKY
RDGAGNNEVFTVTTDTVKALNEDPLIARPSTVAFQTSTTLKSFTDASSLP
VIGTLDLNHDGILDLLVAKSSSAPSNQLVVLRGSSNGTFTQDSLTYTLGN
TPSAIAFGDVDNDTFTDIAITNKDSNSVSLLRVVNGVIKNPFTFSCGSKP
TALAIADFNDDGFEDIVTANSGENTLSFLQGNGNGTFAPTNSITTASAPY
GVVAADFNGDGKSDIAYSDSGNDRIVVCTYNNESWSEITNALVGDVPHTL
VATDFNGDGKSDIATINSGSNNVSVLLNNGDGTVATAKTYTVKSNASSLT
ALLASDVDGDGFADLAVSHTTGVSLLMNNGTGGNGTFALAQEIVSNRITA
PPTLASADVNGDNLTDILFPGSYTSINAQLNSQSSSATLFTEQTPIEVTP
NLTLRDPNGDASWRDGKLQVQINYNTTAYDVLALPTEKPELGGVWIDSEA
SNALMVYDQQTDTNLQIGTADNTSVSNNSTWTFTFNRYATNALVEKVAQS
ITFSNNRDNPSLETRTILFTATDSLGASSSATQHITVQPVDDAPLTLISA
TPVDGAGNVEINSNLSFTFSENIVFGSGFIELHRDAPNGALIEHYDVATH
TNLGLNGTTLTIDPYNQLAYGTRYFVTFGEGAIDNGYGTTFSGSEYDFLT
ATDPYVPPTPNNDGSDGGSSTGTILIGTGSLALLALAFL
>Cag_2016 muramoyltetrapeptide carboxypeptidase, putative
MKTLLPKALRAGDTIGLISPSSHCAYPDKINQAVRYLEAHGYRVKASRYL
NCIDTDPAIADRQKLHDIHAMFSDPAVHAIVCLRGGAGATRLLGQLDYGL
IAANPKIVVGYSDITALSLAMLAKVGLITFSGPMLATELYEPDPYTEEHF
WGMLTSPAYSRSLINYEHHPITCLQSGTVEGRLLGGNLSVVTSLLGTPYF
PHIEGEALLFFEDVNEPVYRIDRMLSHVGNAGLLTQAKGLLFGYFSTGTP
LPVAEEQRLQYSMEYYSAMLPPDAVAIRGLSFGHIRHMMTMPVGARCKLH
VAADGLFSFGTLEAVVRA
>Cag_1338 universal stress protein family
MITIKKIICPVDFSDLSRKALQYANEFAQLSGGQVFLVGVIENDPSINYS
HGLEKERAEEEQKLVALIEEENMQGIVADYVIYEGFAEECILDYAKRQEA
DVIVMGSHGRRGLKRMILGSVAEHVIRRAPCPVLVVKENEHEFIQ
>Cag_0329 GTPase EngC
MKQVVEHVLVGTVTEVAGTSYIVQGDDGTLYRACTVPSTKSANSDASLVA
VGDRVELKASVSGHAGYEAIITNVLARRTTLARQRDVRRNRSKERVQVIA
ANIDQLVAVVSAFEPPLNRRLIDRYLVFAESEQLPILLVVNKCDLDDEED
GSSYVREMMHPYHALGYSVLYTSAENGEGVEELRQALAHKLSAFSGHSGV
GKSTLINMLSGQERLRTAETNVKTGKGLHTTTNAVMLVLPDGGAIIDTPG
LREFTLADITRDNLRFYFREFLPVMAQCAYSSCTHTVEPECAVRNAAESG
TIDPERYESYLALYDSIAE
>Cag_1621 Magnesium chelatase ATPase subunit I
MTQTETSAKKAAKKSVAANATADSEAAVKPAGKKKSGLAFPFTAIVGQEE
MKLSLILNIIDPRIGGVLVMGHRGTGKSTTVRALAEVLPLIDRVEGDTYN
RTVEQFVEAEGSKKGGKAIDPASVKTEKIPVPVVDLPLGATEDRVCGTID
IEQALTSGVKAFEPGLLAQANRGFLYIDEVNLLDDHLVDVLLDVAASGQN
VVEREGISIRHPARFVLVGSGNPEEGELRPQLLDRFGLHARIVTILDVAK
RVDIVKRRREFDEDPDAFMVKYQEEQMALSAKIQTAKNLLPQVTMSDDVL
TDIARLCMNLGIDGHRGELTITRTAHAHAAFMGDNEVTMKHVRDIAGLCL
RHRLRKDPLETLDAGEKIEREIAKVLGEAEAV
>Cag_0541 putative transcriptional regulator
MRKMAQQPSSRSFALPPPDGRVAMAVVQQQIEAIVFAANEVVTIATIRQV
TGLLLRPAEIKNMVERLNEEYEATGRTFRIHAIAGGYRMLTCPEFSDLLR
PLNAPKTRRAFSRSLLEVLAVIAYNQPVTRADIQQVRGVSPDYAVERLLH
HGLIEVRGRADAPGRPLQYGTTAAFLDMFHLSSLTDLPKLREIREILQEQ
EEERELQPSNKSNNPLNVYEAGKE
>Cag_0094 Ribosomal protein S6
MKTNKLYECTAIIDGGLQDEAVAATLAMVQRVITEKGGTINSVLDLGRRK
TAYPIKKKSMGYYVHIEFNAAAPVIAEIERVLRYEEELLRYLIIQLTTPL
LEMRKRVEKYSVVLGSVEEGASTSGDAEGSNE
>Cag_1241 conserved hypothetical protein
MKEIPFGLQTFSDLRQQNFIYVDKTAEIYNLTRVKSYIFLSRPRRFGKSL
LIDTIKELFEGNKALFDGLYIADKWNWTTTYPVIKIDFAAGTIHSIDAFE
KRVKDMFITAQEKLAIKCRVDTDLAGCFADLIRKAHEKYRQTVVVLIDEY
DKPILDNIEDTAIALQIREGLKNIYSVLKAEDAHLRFVMLTGVSKFSKVS
LFSGLNNLNDITLHPAYATICGYRQIDLETSFAEHLQGVDWEKLKRWYNG
YSFLGEAVYNPFDILNFIEKQHTYRSYWFETGTPTFLMKLFAKECYFLPN
LENIEVGDEILDSFDVERIQLTTLLFQTGYLTLKQRIESFGRIRYLLKMP
NQEVRLALSDHFINVYTAQQSVQKYAQQERFYNYLMQIDMLGLQQALQAL
FAGIPWKNFTNNDLPQFEGYYASVLYAFFCSLNATVIAEDITNQGQVDLT
IIFDSIIYIIEIKRDTSENYQVSSENVALQQLQQKRYFEKYQRQGKEVIQ
VGMIFNTVQRNLVQLDWAR
>Cag_0656 nucleic acid-binding protein, containing PIN domain
MTEKSEKYLIDSNIIIYHLNGENSATEFLRSNVSQSYISRITFIEVLSFD
FMKDEKEDVLNLLRRFEIIDTTDAIAMRAIENRKLKKIKLADNIIASTAQ
VDDLVLVTKNIKDFNGLNVRVLNIFA
>Cag_0351 SecE subunit of protein translocation complex
MGKYIGKVSQYYRDVVVEMRKVVWPTKQELKDLTVVVLTVSGILALFTFL
VDWVINGVMGWLL
>Cag_1743 Ribosomal protein L28
MSKVCLLTGKKPKYGNSVSHANNHTRTRFEPNLHTKKIWIEEEKQFVKVK
LSAKAMKIIAKTGTAELAKLLK
>Cag_0445 hypothetical protein
MSIYTSPETSPSPIAILCSEYVQSVEAMAESLPFIMSTLIEAEDTFDKKL
DAFIDFHAIDVEKLEDGRRYGLKLEDKSAHDRLHRRIRVFREALGVTPRS
FLVALVSAYDAFLGRLIRSLFYARPELLNSSERVLTFAQLQDLQTLDAAR
EYLVEKEVESVLRKSHSQQFEWLEKTFSVPLRKGLECWPRFIELTERRNL
FVHADGIVSSQYLNVCGEHGVTHSEVLTSGTRLHVERSYFQLSAHCLMEI
GIKLAHVLWRKLVPTDREKADENLIEIVYDLLIKQKYRLAADLASFGTNT
IKSHGSDQTRRILVVNLAIAHKFGGDAEKCTAVLDAEDWSATADDFRLAI
AALRDNFDEAAKLMKSIGKDNRLGMFEYREWPVFRDFRKSYQFAAAYKEV
FGEEFVLKAQEPKASESPQTTAKGENVLDEE
>Cag_1702 TPR repeat
MLKLGKGVSTQPSIGRIGGEWYIGGMKQSFKHQALRRIKKGALLLCIFAA
TTLTACSNNELEKLQQEAWKNPNDAALTLQLGYKYAQEGRYMEANESFQK
VLALDPKRDEALQALGATAFRQKQYSQAISYFQQHLERAPADSARLYNLG
NAYMQLKQYDKATELYNKAIDNSTAFIDAHYNLAVCYAKTGRRNEAQAIY
EWLLTKNNYLATSLQKHLDKENQAPK
>Cag_1750 hypothetical protein
MKILLHIGCGIKDKTQTTLAFHNEEWDEIRYDIDPDVAPDIVGNMTDLIT
LAPASVDAIYASNCLETLYPHEVPQALAEFRRVLTEDGFVVINSPDLQAV
CTLVAKGKVLDPAFVSPENGLVTPFDLLFGHRPSLAEGNMFMAHRCGFTS
QMLSGALQAAGFSMVASMVRPKHYDLWTVASKSQRTEPEMRALASEHFPG
LGVM
>Cag_0021 conserved hypothetical protein
MTSEELQQLLTPEAQAMLQAHQHDNPTTFALRYSNRHDLPIRALAEQLAC
RRKAERKLPTLSRHNLLYTTLSLEQASSERTARFKCTFMQGKRCIDLSGG
LGIDAIFLAAHFEELLYCERNELLCNVVRHNMVRCGIGNVRLQQGDSLSF
LASQPDNAFDWIMVDPARREEGKRSIGLEAASPNVVASQELLLAKAPHIC
IKASPALEISNLKMLLPALHTILVVSVSGECKEILLLLKRGAEAEHPITK
AICLQADNNAVVEIVGTHEQHRSLAESLQCYLYEPDAAIIKARLSGVVAK
QEGLEFLNKSVDYLTSNHVVASFAGKVFQVIESVPYKPKEFRKFLDRHAI
SAASIQRRDFPLSADELRKKFRLREDEKHFLIFTRNRNAEPICIYAERC
>Cag_1897 Nidogen, extracellular region
MSTLLTNLGGTLGFGEYYLTRNDDSYKNGIEVASVFGADGLNFFGRHYTY
FSVNNNGNISFANDANSGLSTYTPFGLQEGGYALIAPFFADVDTRFLSDA
AAEANQITPTPDGTSQGSNLVWYDLDPEGNNGKGVLTVTWDDVGYYSYAT
DKLNAFQLQLIGQGNGNFDIVFRYEAVNWTTGIASGGLYGLGGTVARAGY
STGDGSAWYELPQSGNQDAMLSLDTSAGNTGEAGSYLFTVRNSQEVGVLN
GTEGDDLLAGSTMHDTIYGFAGNDYLIGNSGDDLLVGGAGDDTYTVDEGD
IITEEVDGGFDTIFAATTYTLPNNVEVLRLTGAASVNAIGNNGDNIFVGN
IGNNLFDGGDGFDTVDYSRSRSEITVNLTQTTPYSIGGYEGSDTFLRIEN
LYGSTYSDYLQGNNSENILRGNAGADTLQGNGGNDTLDGGDGVDTIWLAN
SFSEYTITYNAASGFLQSVHNSTIESSDGIDSLRYVEYLRFSDVLYCVKI
VENALELTRENIAPSFKTILSTVASTNEDTLVAITFQDILVTTECVDIDG
SITSFNISDLHSGSLWIGADSNSALPYNYYSTSLIDANNNAYWKPDQDAN
GLLGAFTVVAFDNEGAITESSHTLNVDVLPQSDAPSISIPHPLFDNPLTY
STQDYPTDIALGDLNGDMLKDMVVVNEESNSLSVYINQGDAIFAPQELYF
VGNSTRGISFIDINSDEALDIALAINGDWNSYILLLLNNGSGEFHALPTT
LPTGYYATSVASGDFNGDQLMDVVVANFGTYSVSVFINNGNNTFTAQEPY
LLDDSPYDLVAVDYNEDGSLDLLAASNYGNNISVLKGNGNGGFTDCKNYQ
VGDNPKALATADFNGDGKSDIAVANSGNNSVSLLLQNEVGEFIAPATYEV
GNNPQAITVADLNGDGFLDLLTANYNSNAVSVLLNNGDATFIAQDDYSVG
YAPIALASSDVNSDGYADIAVVNYQENSVSVLTNISFLTTFYSGTVPVIV
SPTITIDDADNRYGWNNATLGIQISTHADSDDMLHLPLTNAGDGSIWLHV
SDAGYALMAGELQIASANSAKAEGDAAWIFTFNGEASSEMVQAVGQAILF
SNSNTLDSFAERTVTYTVTDADGLSCSASQTVSVMVDELPPTLLSAYPTH
NAEAVSVTETLFFTFSEPITLNSGAITLHAASPTGIALAADISIVGNSLQ
LDPLMPLQPDVEYFVTFEEKSITDRAGNAFVESISYSFTTESAPILRHAL
SGVVQFWNSNDAITDVATTLMTLPYEHGSQLVEFHNLQRDDESFSVEVWI
TPTEALHSVQLHFLLPTTNATPSWQEATGLPSGWHMAYNSNNSGSFALSS
MGDAVIEAGSVKLGQLTFAAPSGIERAELLLTSGEFGCDGGEDLLETVDI
APTGIAFSCVDTANDGSYHHFDMLEGSYALQAAKEAVPEGQSAVTAADAY
AALQMAANINPNGNESEVLQWQYLAADVNHDGKVRAADALNILKMAVNYE
GAPEEAWIFMPDYVAGYEMTRSTVDWSAEEIVVDLLSNEAMSLIGIVRGD
VDGSWMG
>Cag_0194 conserved hypothetical protein
MIVTGLDVLLRNLDMLRHRSVGLLVNQTSLTASMEYSWQLLQKQGITIRR
IFSPEHGLFATEQDQIAVSYQPELGCDMVSLYGDSAATLVPDMALLDDLD
VVIFDIQDVGARYYTYVNTLALFMEAIAGRDIELMVLDRPNPLGGEIVEG
PMLDMAFGSFVGVFPVPVRHALTAGELAVLYRDVMQLDVNLRIIKMEGWK
RTMLYGETGLPWIPPSPNMPTVATAEVYPGMCLFEGLNVSEGRGTTTPFQ
LSGAPFIHPIELAERCHSYGLEGVRFRPVWFKPTFHKFAGEVIGGIWQQV
TDARRYRSFATAVAMTAALRELYGEQVTFLRGVYEFNDTIPAFDLLAGNA
TIRTAIESGNTIHTLLTLWQKDEAQFAETKTRYHLY
>Cag_0812 conserved hypothetical protein
MSLLEKIHEATCSVRFSNDWLKIFTCPVPSSDFLSFVAVATIDAPQHSVL
SLLYDVDVATQWVWKTKEMRVLKELGADGTDGRIVYQLVSAPWPVSDREI
ISSSTAYMDPNTKEAFIKIECMADYLPKNSKAVRVPEMQGAWNIMPLGEN
KCRVVFRLHIEPGGEIPFWLANIAVIDTPYHTLVNLREWVKKEKYRTPVD
APFKESAADVIQHYDKFIKE
>Cag_0634 NADH dehydrogenase I, subunit 3
MDQTLTQFGSVFVFLLLGVIFVVGGYLTSRLLRPSRPNPQKNSTYECGEE
AIGSAWVKFNIRFYVVALIFIIFDVEVVFLYPWATVFKPLGTFALIEVLV
FAGILILGLAYAWVKGDLDWVRPTPNIPKMPPMPELPVAKGASQKD
>Cag_1630 hypothetical protein
MDNTKQDFEKRKKEIESYFNFLLIFDDDKTKIRYIKDGILVNEKINPVFQ
ITLIANSFLILYNLIESTIRNSIIEIYEKIEADEITYETLSENLKKIWIK
QKTDKLKENNFKQDTLRGYIAEIANDILNRETIRFDKDNLEFSGNLDARK
IRDLADSIGFQKTVNGQNLVDIKNKRNRLAHGEHTFYDVGKDYTVNDVIE
FKTETFNYLSDIITNIDHFISTQAYKIKN
>Cag_0037 conserved hypothetical protein
MQGMVEQYTRNANTENRAYKPHLGWHKKTALLVTLFLTLTALFSLPQTTL
AAVNTVPPNFVLIRGGEFTMGSPESESERDRDEMPHRVKVGDFYIARYEV
TTAEFRTFVQETGYRTDAEKTNPSLVFWSGLWPGKAGLNWRYGTNGKERS
GAENNHPVILVSWNDAVAYCKWLSKKHGMNFRLPTSAEWEYACRAGTSTV
FNYGDNLSTTQANYDGNYPYSNHPKGIYRKNTVPVNSFTPNAWGLYNMHG
NVAEWCSDWYSEPYYESSKANGTVTNPTGPATGSNRVMRGGSWYDDARYC
RSADINDSTPSYRYINVGFRVVLEP
>Cag_0828 Primosomal protein n
MYARCVADRFFRGEPFSLVVPEAFCEELQAGCMVLLLSLKGQGLMSIGYV
LSLSPDAPPDMVNEELPSFEMVDLLNGSQPVLNGELLKLTSWIADYYLTR
PIDAIHTALPVAIRTTVHDVVEAAGFTLQAEPTKVMNTALRRSILKLLAT
NKQLTVTQLQRRLGKKQLYKTISQLEKGGYLTLSKKFSTKKPKYKSAYRL
TAPLQDGVLESVASAKKQHATLSTLADLYPETAFLNELEVSHAVIQVLLN
KGLVEKVQKRIESNFSSGYRESAQPAKKPTAQQQKVLNELCSASRQGHYQ
TFLLHGVTGSGKTLVYIEFLKEVLAAGKTAIVLVPEIALTPQTAGRFREH
FHHDIAILHSAMSLQEKYDAWHSLKSGRCRIALGARSTLFAPLENLGAII
VDEEHDGAYKQDRSPRYHARDTAVMRAMLSNAICLLGSATPSFESYQNAQ
NGKYHLLRMAERIDGATMPTISLIWMRESPRRTTSISEMLYQQIAQRIEK
NEQVILLQNRRGFAGSILCLECGHIPLCPHCNIPLVYHATHNHLRCHYCG
HTERYKAMCSACKSTGLFYKGSGTERIEEELQKLFPDEKILRMDVDTTAK
KGAHGRILREFHERKARILLGTQMVAKGLDFPAVTLVGVLMGDIGLNIPD
FRASERTFALLMQVAGRAGRAAIPGEVLIQVYNKESDVFTALLHGDYERF
FQQELESRRTLLYPPAARLIKFECSADDEVQAEAAATFCKEIVQQHLPEK
QGMVLGPAPACIAKIRNRFRYHVLVKLMLGKLSPLFIREMSDTIHSRFRS
ANVLLTVDVDPQSLM
>Cag_0081 Biotin biosynthesis protein BioC
MAQCVDKLLVGERFRKALATYREHAVVQHAMAEDLAAMLARHLPSSPTIK
RLFEIGAGSGALMEALLHRFCIDHYFANDLVAESEGCLRPLLAPYREEAF
TFLMGDIEVLAEWPSSLEVVISNATVQWLEQPAHFFQQAAKALQPHGLLL
LSSFGASNMQELSSLLGVGLRYHAPDELIALASHSFDLLEVKEEQKELLF
CSPEAVLHHLRCTGVNGVVRTQWTKSDYKRFLSSYRERFSTTDGKVVLTY
HPYAIALQRKG
>Cag_1029 heterodisulfide reductase, subunit A
MAKIGVFVCHCGENIAGKIDCNKLNDAIKEHPGVEIAFDYKYFCSDPGQE
SVKKAIRDNHLTGIVVAACSPRMHEATFRKACAEAGLNPYLCEIANIREQ
CSWVHSDGDMATQKAIDITRSMVEKVKLNNTLQPIEVPVTRKALVIGGGI
AGIQAALDIANAGQEVVLVEREASLGGHMAQLSETFPTLDCSQCIMTPRM
VETAQHPKIKLLTYAEVDSVEGYIGNFKVKVRMKARYVIKDQCTGCGDCI
LKCPQKKIPSEFDCGLGNRPAIYTPFAQAVPNIPVIDRDRCTYFKNGKCQ
LCVKACQLNAIDFQQQDEFLDIEVGAIVVATGFQIQNNAVYGEYGYGKYK
DVINGLQFERLASASGPTSGKILRPSDGTEPETVVFIQCAGSRDPSKGVK
YCSKICCMYTAKHAMLYAHKVHHGKTHIFYMDIRAAGKGYDEFTRRAIEE
DEASYLRGRVSKVWEENGKLMVRGVDTLLGKPVEIAADMVVLATAIVPQP
DAKEFAKTIGIGYDEYGFYNELHLKLRPVESSTAGIFLAGACQSPKDIPD
SVAQASASASKVLALFSREKLEREPVVASNNESTCAGCWGCVLACPYNAI
EKKDICDRQGNVIKRVASVNPGLCQGCGTCVTFCRSHSLDLAGFTEKQIF
AEVMGL
>Cag_0635 NADH dehydrogenase (ubiquinone), 20 kDa subunit
MGLLDAGITNHNVLVTSVDNVLNWARLSSLWPMGFGLACCAIEMMATNAS
NYDLERFGIFPRSSPRQSDLMLVAGTVTMKMAERVIRLYEQMPEPRYVLS
MGSCSNSGGPYWEHGYHVLKGVDRIIPVDVYVPGCPPRPESLIGGLMKVQ
ELIRMEQIGLSRADALKKLAENPIDPQAIIEQQRKTAVAH
>Cag_0023 Peptide chain release factor 2
MLKELADLKSWTDEYDKLSAKARYCQDEMDIALELQDESFGDELTNTITE
LEQAIAAIEFRNMLSGKDDAGNAMVTIHAGAGGTEAQDWAEMLYRMYMRW
AERKGFKTFTDDYQEGDGAGIKTATLEIQGAYAYGYLKAENGVHRLVRVS
PFDANARRHTSFASVYTYPEAPPDVEIEVRKEDLELSTFRSGGKGGQNVN
KVETAVRIKHLPTGIVVGCQEERSQLQNRERAMKMLMAQLYQRQREEEDA
KKREIEGKKKKIEWGSQIRSYVLDDRRIKDHRTNYERFDIETVLDGDLDD
FMEKYLSQFGDE
>Cag_1260 transcriptional regulator, XRE family
MEQTLFTSVSDAAAFLADDNSIKEKVEQEIASSQLVNRLIQLRINKKVSQ
KELARQMGCDASKLSRMEAGTDQQLRMGDITDYLSALGVHVSLVFEDTTL
PAAEQIKQHVFAIHEQLENLANIARQVDGDNEIIEKIHKFYGEVLLNFML
RFSDSHEKLCMVSGYKSATPHKEEYIKKPLFAKPLTN
>Cag_0886 conserved hypothetical protein
MRKKRAGRPQHCRCVQDLPKVTCFTPSGVAPEAVEQVLMTVDELEAMRLA
DRDGLYHADAAMQMKVSRPTFGRILESGRRKVADALVGGKQICIKGGTVL
AVCDSIPTERPDICLCPTCGREFPHIKGVPCRNSICPDCNEPLQRKGGCL
SDEESDEQENEQRTVGYPESEEELEIE
>Cag_0715 conserved hypothetical protein
MTDAEKPRLIVVAGPNGSGKTTITDKLLRHEWMSDCHYINPDLIAQQEFG
NWNSPEAILQAANKAKALREEYLRQRCSMAFETVFSTNEKVEFLWRAQKQ
GFFIRLFFVCTNDPTINAQRVAQRVIDGGHDVPITKIIQRYYRSLSNSIA
AMKWCNRSYFYDNTAANQDPQLLFRVNDGCLKKVYAPIAPWAEIMYQTAI
NQGTISDTTSHNPNPSTITFYEY
>Cag_0578 conserved hypothetical protein
MSKELSHRADYQKLLGSISTLYTSGQRRAYQAVNSVITETYWQIGCYMVE
FEQCGNIRAEYGKALLDNLSRDLTLRHGKGFSRSNIIRFRQFYLAYPKGA
KPSHLLSWSHWVELLKLDDPLERSFYEQQAIREKWSVPELQRQKKSSLFL
RLAAGKDKAAILQLAEQGQIVEQPADLLRDSFVFEFLKIPESSEMAELDL
ESRLCDHLQPFLLELGKGFTFVGRQYRIPINNSNYRVDLVFYHRILRCFV
LIDLKINEVEHHDIGQMNLYLGYFAAEENTPDDNPPIGIILTRQKDELLV
EYATYQMNSQLFVQKYQLYLPDREELRREIERALWDIEESNSNKEKKNE
>Cag_0581 hypothetical protein
MDFLTDTSFWITVMVATLLATIPSFIKPMRLSILIPIIVLWLAVIGWLAF
SVSALAALIGAGISLLVGALIFLTTILLTGIQQMAKKRYR
>Cag_1838 30S ribosomal protein S14
MAKKSVIARNEKRIKLVAKYAELRAELVKAGDYEALRKLPRDSSATRVRN
RCVLTGRGRGVYAKFGLCRHMFRKLSLEGKLPGIRKASW
>Cag_1708 hypothetical protein
MQRIIRIFFASPGDLEEERHLTKEILRQMSERSRYTFEFYGFERALATTA
CRPQDVINNFVDECDVFIAVFHRRWGQPPQDTVVYSSYTEEEFERAKRRF
VSTGAPEIFCFFKQVDLPSIADPGEQLRKVLAFKRRLEESHQVLYRTFAT
AAQFVADIEQHLFAFAEGKLPTPRSPKHRFHIPIIEDQQPDSQRSYDLTK
VHQALNAATSGCVEEAVILMAGVSQTTRDIELLDVIKEFFINTNNLDAAQ
AVVEKKLTLLQDRRLAAHAYVAVLMSEHWLNDLVASMSKTVSPEKQSVAE
HTTRKLFTGIRFHELMIEYLSKYFTVGELLSLTRFYQGEGASITAKFGRT
IGIMIPEINAILMAENPELFEG
>Cag_0969 conserved hypothetical protein
MGINYLLTNNHKSLQNMASYSQSHENLSGVARLFLVVSASVIAVASIAGG
AGTLLQSELLLTLHPYLFFIGFGNLAILILNRYLTAAIYPELTIDPARQR
SYIALVLLALGMITIAVALKLPLLKAATGLLLMAVVSVPLREIFSKLSIP
AIWKEVSVRYYIFDVIFLMVANLGLFTLGLKEAFPDFSIIPFFVTQSSYF
LGSSFPLSISVMGFLYAYAWSRSPKRELAKQLFSLWFYIFVGGVLFFLVV
ILIGHYWSMMLISHFLMFGVMAMLASFAVYLNNFFHSKFHHPALAFLLSG
LSLLFATSGYGIMNIYFMQGITFGTRPPLPFEQMWIYHSHTHAALVGWIS
FSFMGMMYIVIPSILRSGSLETLRSDNALSALLDAESMKRAFAQLTIMVL
AAMAMLLAFFLQQQLILGVAGVLFGVAVAYVMLNMHSSR
>Cag_1309 methylase
MPITKKEIRDNALRFALEWRDASRERAEAQTFWNEFFQIFGVSRRRVASF
EEPVKKLGEKRGSIDLFWKGTLVVEHKSRGGNLDKAYNQALDYFPGLKEE
ELPKYVLVSDFDRFCLYDLNENTQCSFLLNELPEHIDLFGFISGHQIKVY
KDEDPVNIQVAEKMGELHDALLDSGYDGHDLEVFLVRLVYCLFADDTGIF
NAKGDFEDYLRNKTKENGSDTGSILANMFQVLDTPYEKRQKTLDEDLVNF
PYVNGDLFREPLRIAHFNGAMRELLLECCLFDWSKVSPAIFGSLFQSVMD
RKRRRNLGAHYTSEKNILKVICGLFLDDLRREFEAIKNDARKVTAFHNKI
AAMRFFDPACGCGNFLVITYREIRQLEIEVLQQYFMLTSKMYKAGVTQLE
TDIETISKIDVNQFYGIEIEEFPARIAQVALWLTDHQMNMRLSQAFGQTY
VRLPLQHAPNIICDNALRKDWETVIPSKEHLYILGNPPFIGKQNRNAGQM
ADMDVICQPLKAKGLPNYGVLDYVALWYIKAALFIENSNVKVTFVSTNSI
TQGEQVAALWEFLLRKGVKIFFAHRTFKWTNEARGNAQVFCVIIGFTWNN
TTQKKRLFDYETPQSESHEIEAKNINPYLIDAIDIVVSSRNKPLCNVPEM
LYGSKPVDDGNLFFDDDEKVELLKKEPKAEQFIRRVISAHEFINGKNRWC
LWLKDIAPNEWRNLPELVKLVEAVRGFRLKSKKAATVKLAEVPYLFGEIR
QPETNYIVIPLHSSEHRKFIPTGYFSKDNILHNSCSAVPNATLYHFGILT
STMHMVWMRTVCGRIKSDYRYSNNLVYNNFLFPHDISNKQKAKVEEKAQA
VLNARELFPNSTLADLYDPLTMPKALLTAHRELDAAVDACYRKTPFQNEL
ERLEFLFQLYSSYTQPLVPAMDAKPKRKRMGKG
>Cag_0013 conserved hypothetical protein
MKPIYYFMAATFSILLSIYVFIFGTSTNHEMLGIFIGLWAPTIICVGIFN
TLIGILDEMCCAHKRIEEGQSCSHNRH
>Cag_0366 2-oxoglutarate ferredoxin oxidoreductase, alpha subunit
MTDTTSSTKQVTVTPKTSVSVLFAGDSGDGMQLTGTQFANTVAVYGSDLN
TFPNFPSEIRPPAGTVSGVSGFQLQFGSKGVYTPGAKFDVMVAMNAAALK
ANLKNLHHGGIIIADREGFDAKNLKLSGYGEENNPLDNGDLRDYKVFEAP
VITLTRKALADTGLSTKDIDRCKNMFVLGMLYWLYSLPLETTIQTLQTKF
RKKQLLADANIKAVQAGYNFGDETEIFSQHGRFSVAPADKKAGLYRRVTG
NEASAIGLAAAAKKAGLELFLGSYPITPASEILQTLAGMKKWGVKTFQAE
DEIAGVLTSIGAAYGGALAATNTSGPGLALKSEALGLAVILEIPLVIINV
MRGGPSTGLPTKPEQSDLLMAMYGRHGEAPMPVIAATSPVDCFYAAYEAA
KIAVEYMTPVLCLTDGYLALSSEPMLVPAPDALATITPVFAKERAADDPA
YLPYKRDERGVRQWAAPGTKGLEHRIGGLEKQHETGHVSHDPLNHELMTR
LRAEKVERVVDIVPDLTLDNGPEQGDLLVLGWGSTYGAIKKAVEHIHEKG
INSVAHAHVRYINPFPKNLGAMLQRYKKVLIPENNCGQLLHLIRDKFLIE
PVGFSKVQGLPFNEMEIEAKITDILKEL
>Cag_1508 hypothetical protein
MGHIYLQNNEIQDAVSAWVTAYTLARKIGYAQVLDALENLAPQLGLPGGL
EGWEMLARQMGGEE
>Cag_1403 Methionyl-tRNA synthetase, class Ia
MSTMPHFSRTLVTTALPYANGPVHLGHLAGVYLPADLYVRYKRLKGEDII
HIGGSDEHGVPITISAEKEGISPRDVVDRYHAMNLDAFTRCGISFDYYGR
TSSAVHHATAQEFFSDIEQKGIFQQKTEKLFFDLQAGRFLSDRYVTGTCP
VCNNPEANGDQCEQCGTHLSPLELLNPKSKLSDATPELRETLHWYFPLGR
FQAALEEYVNSHEGEWRPNVVNYTRTWFKQGLNDRAITRDLDWGVAVPLQ
SAEAVGKVLYVWFDAVLGYISFTKEWAALQGNAELWKTYWQDPETRLIHF
IGKDNVVFHTLMFPSILMAWNEGKTTDCYNLADNVPASEFMNFEGRKFSK
SRNYAVYLGEFLDKFPADTLRYSIAMNYPESKDTDFSWQDFQNRTNGELA
DTLGNFIKRSIDFTNTRFEGVVPASVTKEEWDNLGIDWQATLEQLDSAYE
GFHFRDAATLGMEIARAANRYLTSSEPWKVIKVDREAAATTMALSLNLCH
ALSIALYPVIPETCNRIRAMLGFSEPLEATIQRGTSLLSSLLTPTLQQGH
KLREHSEILFTKIEDSAIAPELEKIAKLIAEAEKREAALAESRIEFKPAI
SFDEFQKVDLRVATVVAAEPVAKANKLLKLRVQVGSLTRQVLAGIAKHYT
PEEMVGKQVLLVANLEERTIRGELSQGMILAVENSDGKLFIVQPSGEGIN
GQSVQ
>Cag_2011 conserved hypothetical protein
MSAHIYKKVEMVGSSPNSIEDAINNAVAKAAETMHSIRWVEVAETRCHVE
NQKVAYFQVTVKIGATLEENETL
>Cag_1996 putative segregation and condensation protein A
MFTINLDEFEGPLDLLLFFIKRDELDIYNIPIARITTDFISYIHAARQLD
LEVAAEFIYMASMLMSIKARMLLPRPPDDPTAESDEFDPRTELVERLLEY
QRIKEAAEALRLRADERALLLARGWSELQEAVSAGNEQEGEDELLQQPTL
YHLMLACNAMLKRRPRPMTRSVADVPVTIEEQSAMILERLRYEPQLSFTV
LLASFSENLVVVVTFLAVLELCKSQRISVLINDAYNDVWLTLRVSQSN
>Cag_0427 hypothetical protein
MAFNQNVFVNCPFDKTFYPLLRPLLFTIIYLGLKPRIATERLDSGEARIT
KIVELIEDSKYAIHDLSRIKATKKGEFYRLNMPFELGIDVGCRLFKGGEH
EHKKCLILVAEPYNYQAAISDLSNSDVANHHKETPEDVVIEVRNWLSATC
GLEADGPSRIWDAFNVFMGDNYSALIARGFSKQDIEKLPVQELIQSMERW
VTENV
>Cag_0112 von Willebrand factor, type A
MGREWFSLTKHSLEQTPQESLAELQRRIRRIEIRSRRKATELFSGEYHSS
FKGKGIEFSHVREYHYGDDVRSIDWNTSARNQDLYVKLFTEERERSLLLM
VDGSASMFFGSNQQSKKELAFELAAVLAFSALDNNDKVGLLIFTDQVELY
LPPRKGRRHVLLLLDKLSRHKPQSKQTNINAALSFLRYTLRRQEIVFLIT
DLIDSDYEKGMKQLNQRHDFILVHLRDALDTKLPLSGLLTLQDPESGERC
VVDMATPQQCERYKAMQERSIEELRQRMRRMRIDAIYLETDHSFFGALNA
FFRYREQKV
>Cag_1852 Ribosomal protein S10
MAVQQKIRIKLKSYDHSLVDKWALRIIDVVKQTDAIIFGPIPLPTKSHVY
TVNRSPHVDKKSREQFSFSSHKRLIEIMNPTSRTIDMLMKLELPSGVDVE
IKS
>Cag_1762 site-specific recombinase, phage/XerD family
MFMSNSALHQPLPRLLQESALPIQAFLEHVAQRRGLSPNTVVAYRGDLIQ
FFTFLAQHLELLDLRAFQPESVTPMDVRLFMGFLLEQGVKQRSIARKLVA
VKVFYRYLQEHGIITTCLFSSLGSPKFPQRVPNFLTEEQTSKLFELLETV
PNGAVSDSQPANSALHAFTAARDCSILELLYSSGLRVSELVNLRMDELDV
ERGYVKVHGKGNRERIVPVGAAAIEALKKYFEVRRNFFRMNKEVEPFTSV
FVTQKGAKIYPMLVQRVTARHLSLVTEQKKKNPHLLRHTFATHLLNSGAD
LESVSEMLGHSNLATTELYTHVTFERLKEVYRKAHPNA
>Cag_1091 conserved hypothetical protein
MTTTDRYINPFTDFGFKKLFGSEMNKDLLIAFLNTLLPIEAGTIADLTFL
PNDRVGRSEFDRRAIFDLHCKNEKGEYFIVEMQQAKQDYFKDRSVFYASF
PIQEQAQKGKWNYCLQPIYMVGILDFIFDENKADDTIVHHEIKLVNLSTG
KVFYEKLTFIYLELPKFTKSVDELESDFDKWCYLLSNLPDLTDRPARLQE
KVFLKVFELAEIAKYTPEEAREYEKSLKVYRDLKNVIDCAYDEGKAEGIE
EGIEKGKEIGVLEGMVKGKELGLQEGLQKGMEAGLLKGKLEIARKLMVKG
MSADEAAGIAGVDVERLSSNDE
>Cag_1519 Chorismate synthase
MIQYVTAGESHGPALSAIVDGMPAGVPLTDEAINHQLARRQQGYGRGGRM
KIETDKAEVLSGIRFGKTIGSPIAMVIRNRDWQNWTTTMAQFEQPQEAIP
AITVPRPGHADLAGCIKYGFEDIRPVIERSSARETAARVAAGACAKLFLK
ALGIEIGSIVTAIGSAKEEMPQQELAAMLAHGAEAVATQADQSPVRMLSK
SAETAAIAAIDEAKANGDTVGGIVDVFITGVPLGFGSYVQHNRRLDADLA
AALLSIQAIKGVEIGTAFDNACKYGSQVHDEFIFSESGELTRPTNRAGGI
EGSMSSGQVIHLRAAMKPISSLISPLQSFDIASMEPIPSRFERSDTCAVP
AAGVVAEAVVAPVIANALLAKLGGDHFSEIHERLEAYRAYLANRLQRKS
>Cag_0922 conserved hypothetical protein
MFFDPLYLILALPPMLLGLWAQFRVKSAFKKYSGVPTQSGINGAEAARRI
LQRGGLTNVSIEPSHGMLSDHYDPTQKALRLSDEVYGYASIAAVGVAAHE
AGHALQDKTGYAPLQLRSIMVPAVTVGGNVGPILFMIGMFMAGSLGTTLA
WAGVLLFAATSLFALVTLPVEFDASRRAKELLVSQGIVSSAEMKGVNAVL
DAAALTYVAAATQSIMQLLYFVMALNRREE
>Cag_1233 Band 7 protein
MDIGIAILIVIGAAIASALKILQEYERGVIFRLGRILGAKGPGIIILIPG
IDKIVKVDLRTVTLDVPPQDIITRDNVSVKVSAVVYFRVVDPIRAIVEVA
DFHFATSQLAQTTLRSVCGQAELDNLLAERDEINERIQAILDKETEPWGV
KVAKVEVKEIDLPEEMRRAMAKQAEAERERRSTIINAEGEYQAAQRLADA
ARIIASSPSALQLRYMQTLKDISTEQNSTIIFPLPIEFFKAFMDSNSKNS
SQQS
>Cag_1461 Ribosome-binding factor A
MSRRTERVASLLQHELGAIFQLELPRTPIVTIVEVKVTVDLGIARVYIST
IGTPEEQAAIMAHLQEQNKYIRKLLSQRIRHQFRRIPELEFYEDHLYEHA
RHIEQLLSQVRKAPVDDAETPLD
>Cag_1817 GTP-binding
MNITSASFVASYTSLHALPEAVLPEIVFAGRSNVGKSSLLNSLTGLKGLA
KTSAKPGKTRQINYFLINELFYFVDLPGYGYAAVSQSEKAAWGQLLANYI
ERRDAISLVVVLVDSRHPFMENDVAMLEFLEFHGRPYGIVMTKSDKLNQS
EKSKCQRVAKTYAAKAKFVVNYSSFSGAGKALLLSHIDHSIISQ
>Cag_1397 conserved hypothetical protein
MNTKSCIQVIEQGTRGFYIRNSLYLPFHCEILSIWVGREMSFIAAPELLC
DMMDSEVLALREGDRYTNLVFRKWGDMAKELGNNKGHVILFAAEKGSDLF
QAEQRFYIRITFELETRELSFELLDNPFYL
>Cag_1484 ATPase
MEGNFSDRVQDVIRMSREEAIRLGHHHIGAEHLLLGLLREGEGIGAQILK
NLQIDLDNLRQRVVESSQPKVSEPHSGNIVLTKQAERILKITYLEAKICK
SHIIGTEHLLLSILKSDDPIATRALEQQGVNYDAVRDELADITGIPRPPE
PDEPPTPRERPVFTKAQTEPPSPPPSPAQPARKPEKEKSKTPVLDNFGRD
LTHLAIEDKLDPIIGREKEIERVAQVLSRRKKNNPVLIGEPGVGKTAIAE
GLALKIIQRKVSRVLYNKRVVALDMAALVAGTKYRGQFEERMKALMTELE
KTRDVILFIDELHTIVGAGGASGSLDASNIFKPALARGELQCIGATTLDE
YRQFIEKDGALDRRFQKIMVEPTSADETIQILNNIKSKYEAHHRVTYSEF
AIEKAVRLSERYITDRFLPDKAIDVMDEAGARVHLSNIHVPANILALEKE
IEEIKSEKNDMVRLQNFEEAASLRDREKAVNDQLEQAKQEWEENSADIVY
DVSETDITSVIAMMTGIPVARVAQSESQRLLTMEADLKREVIGQEEAITK
ITRAIQRTRAGLKDPMRPIGSFIFLGPTGVGKTELAKALTRYMFDTEDAL
IRADMSEYMEKFAVSRLVGAPPGYVGYEEGGQLTEKVRRKPYSVVLIDEI
EKAHPDVFNILLQVLDEGILTDGMGRKVDFRNTIIIMTSNIGARDIRNFS
AGAGMGFAPIDDGTGNYKAMKSTIEDALKRVFNPEFLNRIDDIVVFHQLE
KQHIFKIIDITAGKLFKRLNDMGITVQIDEKAKEFLVDKGYDQKYGARPL
KRALQKYVEDPLAEEMLKGNFGEGSTIHIAFNEGSQELSFSDGVATEESI
TSSDSATAAEVQE
>Cag_0650 exopolysaccharide biosynthesis protein
MQSVVKQQNITDFHCHILPSVDDGSPNIASSVEMARLLVQAGYKQVYCTS
HLIKGMYEVTNSELLHTRDMLQQELNKQDIALQLLFGREYYLDEFLLDFL
TEPLLFEGTNLLMVEIPNNTSADVVKNTLFAIARRGYTPMIAHPERCQLL
EITTYEQKSTKGWKKWFMGGKEDSTMPEHTNSLLRYLQQLGCMFQGNLGS
FKGLYGHRVKANARAFEQYGLYTHFGTDGHSPDMLRSLF
>Cag_0367 conserved hypothetical protein
MSWGVEDPQKRKDIRELMDIASKIVQENPNHVNEVKKSHNNCWMEQIYLI
QRCDFCDLAPDCPTREEKEWQEYIKANNIMVIKDSFPTNPQ
>Cag_0233 Dihydroneopterin aldolase family
MPNTQTHSLIKLNKMTFYAYHGVLPQEAVLGAHYEVDAELSVDITSAALN
DDLAQTIDYGMVYMLIKTEMTETRQQLLETLAYRMAHKLLETFALVDRTT
IAVRKLHPPLGGECHSAEVSYTFIRS
>Cag_0745 glycosyl transferase, family 25
MKVHVISLKRCTERRKAFMDMNPHVDYLFFNAVDGSTIPEKVLSNPLLFE
KGLPYTKGAYGCALSHLLLWNKAIKENCVLTIAEDDAIFRKDFHVMQNKL
LSSISSDWDIILWGWNFDSILSLNVLPDVSPTVMVFSQEKLRESINTFIE
KVTYPSSLFFLDKCFGIPAYTITPQGAIKFKSLCFPLKFFSLWFPLLNRK
LPNNGIDIAMNKIYSSTNSYVSFPPLVVTKNEHAISTIQTNRNT
>Cag_0949 D-alanine-D-alanine ligase and related ATP-grasp enzymes-like
MIKLTHQHTYSGTNIYSTESAIVVLFEIEKEELKIAKNNILSVKHSLSNL
FESYVLKDFVTELELGDFIVSFAHILLTEVRGFINTAKSFKNDDQVVLIL
GYHVPKVSLLALKVALNIYINIEKLNSSDLNKILIDFWDICRKYHPDYQA
GILMEACYKKNIPVLPFINGTKFWQYGWGEKSRIFMESRSNTDGSIAHTL
SNDKPITKAVFNSLGVPTPKDVVIKSSNELETAIAAIGYPCVLKPTNTGG
GKGVIANIRNFTQLLNAFTYARQFTKDAIMVERFICGDDYRLMVIDGKFV
AAIKREPAKVIGNGKSAIRELIQQLNSKRSINLRKSNYLRPILIDKILLE
QLAKQEMSLDNILSAERHISLRSNANLSTGGTCTDVTHLVHPTITQCVEQ
LSVTTGFGTAGFDYITTDISSSLQKSGGAFIEMNTTPGLDVTIAAGWSVE
KIGSITLGDTVGRIPIHLHISNTPLDLYKLPINFDSHLARVSENNVCIGQ
CCYHIDDLQPWAAVKSVLRNKTVQVVEIFCTVSEIIKHGMPVDYVNNIFI
SNIDIPENWMNVLKEHASEIIFH
>Cag_1573 hypothetical protein
MNRMTSSQQNPEPNATCPICKASYHCARSSSCWCSTRKVPQQLSDYLADK
YKSCICPDCLDSMIAEANAGKQFC
>Cag_1457 conserved hypothetical protein
MLESMTGYGSAESVMDGVRALVELRSVNNRFAEISVKLPRQLLAYELEVR
EMIRAHFQRGKISAFIQLQLDEEQPIPVAINPAKVKAYTALLRSLQQEAE
IDAPLQLDHLLRFSEIFDSTASALDKGEMLWPSVKTLVLEAIERLQAMRR
REGEELSGDFRQRIATIEELLATITTLAAGNLDAVRAKLTARVEAVAGKD
VAYSRDRLEMELVIAADKLDITEELTRFASHNKFFIQELESNESGAGRKL
NFLLQEQLREANTIASKSQNAAISQHVVHIKEELEKIREQLQNIE
>Cag_1124 Type I site-specific deoxyribonuclease HsdR
MIHLTEHSIETFAIELLYKLGYEYIYAPDIAPDTSAGSVSEIRESFAQVL
LLNRLQNAVKRINHSIPADAQAEAIKEIQRIASPELLTNNETFHRLLTEG
IPVSKRVDGDDRGDRVWLIDFKNPHNNEFVVANQFTIIENGNNKRPDVIL
FVNGIPLVVIELKNATYENTTMHSAFKQIDTYKKTIPSLFTYNGFIVISD
GLEAKAGTISSGFSRFMAWKSADGKAEASHLVSQLETLIQGMLNKETLID
LMRHFIVFEKSKKIDAKTGITTISTVKKLAAYHQYYAVNRAVESTLRASG
YQLVKETPLSMVMESPESYGLRGVKKQPIGDKKGGVVWHTQGSGKSLSMV
FYTGKIVLALDNPTILVITDRNDLDDQLFDTFAASKQLIRQEPVQAEDRN
QLKELLKVASGGVVFTTIQKFQPNEGNIYEKLSDRKNIVVIADEAHRTQY
GFKAKTIDAKDEKGTIIGKKIVYGFAKYMRDALPNATYLGFTGTPIENTD
VNTPAVFGNYVDIYDIAQAVEDGATVRIYYESRLAKVSLSEEGKKLVAEL
DDELEEEEDVRAYSNTPQQKAKAKWTQLEALVGSENRIRNIAKDIVAHFN
QRQEVCNGKGMIVAMSRRIAADLYQAIINLKPEWHSEDLNKGVIKVVMTS
ASSDGPKISKHHTTKEQRRTLAERMKNPDDALQLVIVRDMWLTGFDAPSM
HTLYIDKPMKGHNLMQAIARVNRVYNDKPGGLIVDYLGIASDLKKALAFY
SDAGGKGDPTILQEQAVQLMLEKLEVVSQMYYGFAYETYFEADTSKKLSL
ILAAEEHILGLEDGKKRYINEVTALSKAFAIAIPHDQAMDVKDEVSFFQT
VKARLAKFDGTGSGKTDEEIETTIRQVIDKALISEQVIDVFDAAGIKKPD
ISILSEDFLMELKGMEHKNVALEVLKKLLNDEIKSRAKKNLVKSRTFLDM
LENSIKKYHNKILTAAEVIDELIKLGKEIVETDDEAKRMGLTDFEYAFYT
AVANNDSAKELMQQDKLRELAIVLTETIRQNTSIDWTIKESVKAKLKVAV
KRVLRKYGYPPDMQLLATETVLKQAEMIANEITK
>Cag_1165 conserved hypothetical protein
MPLKPHFTIIGLLGVAVLVQEFALNRLTLFHAAPDMVTIGIAVLALLQGQ
KKSTTAGFIVGTIIGMISGNMGMAMLSRSVEGFIAGYFYEPEDSHATSHQ
IRRSFFFAILLSGFVANALLSLGDNPLALPLAYRLLATGIAEALMTYLLA
WLLHWLFLKKLLAD
>Cag_0653 transcriptional modulator of MazE/toxin, MazF
MFMTNFSKYDVVVVRFPFASSLKYKARPAVVVSSEIYNKNSRATLLILAI
SSSIKNKLDFEIELSDWQSAGLLKPSIFKSSIATIEKEYIVSKLGVLSDV
DVKRLEKMIDVIC
>Cag_1793 6-phosphogluconolactonase
MPHHPTQHLITEPLADVTLHAAALILAAAHRAVVERGSFTMVLAGGQSPR
QLYTMLGQGLPCNELQRWQLPLPEECANTELICLPPSTWLFQSDERCVPP
NHPDSNQRMIRETLLASAALPPDHFFAMDSTPDNPYLAASQYEERLQLFF
ANERTTPLFDMMLLGMGDDGHTASLFADDEQGLHECDAWVMARNAAQGKP
PGWRITMSLPLLQRARQVLFFVPSATKHQLVQRIVAGNAPTLPAAMVQPP
YGDVYWFTTKE
>Cag_0444 inorganic pyrophosphatase
MSFNPWHGVSIGDQHPRVVNAVIEISKGSRIKYELDKKTGMLMLDRVLFS
AVFYPANYGFIPRTLGDDKDPLDILVISQSDILPMCIVRARVIGVMHMID
HGEGDDKIIAVAEDDVSMSNIQTIEQLPPYFISELKHFFEEYKTLEKKTV
LVEDFQDAEEAYNSIQHAIDNYNEVFINQVD
>Cag_1992 Protein of unknown function UPF0102
MNPPNSTCELGRQGEALAATYLQNEGYQILERNYRFRHNEIDLIALDGST
LCFVEVKARLSNKAGSPLDAVTVAKQREIIRAAQAYLTFSGQECDCRFDV
IGVNVHAMHEARISSFTIEHIKDAFWVEQ
>Cag_0476 hypothetical protein
MEKLVDYFAHHPVVFFIAVVFSFFVVFAFFRKIVQTLFVIGALMVLYAAY
IHFTGSPIPDIFQHIWQWMVNLYQTILGLILRILKKEPEEGVEAFIIFFA
VPTCHLLQGIQQRCCASGDGKHGF
>Cag_0449 L-alanyl-gamma-D-glutamyl-meso-diaminopimelate ligase
MTSNRIPCAISLLILYFESNYFSMSFFYFIGIGGTAMASVAVALSRAGHY
VIGSDTQLYPPMSTFLEEHGISYCHGFAEENLRSFSPDAVVVGNAISRGN
PELEYALEQHLELLAMPDLVRRHLIANNTSVVVAGTHGKTTTTSLVAWML
EAGGLQPGFLIGGIPENFGLGCRPSAGEGAGFFVTEGDEYDTAFFDKRSK
FLLYRPDIAIINNVEFDHADIFNSLEDIKRSFRLFVNLIPRNGLLLVNGD
DPVALECAEKAFCPVERFALHANAEWSATNIHSEEEGSSFELLHHGKSVG
TFHVPLFGNYNIMNAIAALAAAYRCGVSFEALADGLPHFQRPKRRMELLG
EFADGITLIEDFAHHPTAIRVTLEAIAQRYNGRRIVACFEPRSNTTTRNI
FQEELASCFAPASVVVLGKVHRPERYGDHALNTALLQEQLQSAGKEVFLA
GNDADYPADIIRYLEAHRRHGDVVVLLSNGSFGNLKQMVMERWR
>Cag_1056 Hemolysin activation/secretion protein-like
MVPKIITSLVAGSVVFSASLQAAPLVPNAGSLQQQQRPAAVSKQFKQNVQ
ADKKATEKSKPLAIKPSAEGKVFVKRFTFSGYEGTVSQDELQNMVKPYVG
KQFSMEQLDAVSANITSELRAKGWLALATLPPQDVTSGTVHVAINTGKAA
MTSITSDGSIRICKRPLRQIAEKTCPPGSPLNTNDQERAVLLMNDIPGIA
ATTSLSKGMQAGTTDVNYLIHEGALLSGVLWADNYGNRYTGSLMQYAVLN
INDPFHCGEQIMLNAAHSAGMWRGGANYSVPMPFLFAGLTGHAGVSGMQY
ELLEELEVLDYKGTSVKADAGFSYALHRSRKANLTSDVSYTYKGLKDRMS
NTDLRDGTIQFVTFGLSGNYHDDLFFGALTTADVSITKGSLDEKIRDIHL
SGAQGGYTRFNLELTRYQRFSEPCALDLTFSAQHTLKNLDSSDKFYLGGP
YTVRAYPLGEAAGDHGALFKADLRHRIPVPAEWGDMFVNAFYDVGHVTLN
KDRYAGDSATMNATGSNDYWLQGAGVGLRYDISETFTLQGCWAHTIGKNS
GRAFDGNNSDGKSDNHRFWVQGLMNF
>Cag_0479 ATPase
MIELRNVSLKYGEREILKNVSLTIRDNTITAILGPSGVGKSTILKLMLGL
LKPSSGNVFVDGVDITNMKESELYSIRRKMGMVFQGNALFDSLTISQNLA
FFLRENLKLPEAEIGRKVTEQIQFAGLEGYEEQLPENLSGGMRKRVAIGR
ALIFKPHMILFDEPTAGLDPVSSKKILSLISSLRHNNNLGAVIVTHIIDD
VFNVADRVGVLYKGEIIFDDVTDQLQHAEHPFIRSILSDKILEL
>Cag_0006 Orotidine 5'-phosphate decarboxylase subfamily 2
MSNARNKANARLQSLQSLLCVGLDSDLAKIPEPFASLPNPVVAFNRSIIR
ATAAYAVAYKINTAFYEAYGIAGLQAMEETLAMIPDNCLSIADAKRGDIG
NTSAMYAKAFFEHWNFDALTVAPYMGSDSLEPFFAYEEKLIFVLGLTSNA
GSADFEEQRLQSGQPLYERVIERTRQWSRNGNGGVVVGATKSSQLAALRT
EAPDLFMLIPGVGAQGGSLEEAVTCGVDAERQNAIINVSRALIYPPAPRA
SGFASVEEYEQTVSAIAMELAQSMKQALQG
>Cag_0405 conserved hypothetical protein
MKKAAFLVALAALFGGTQANATDWNWKGDVRYRYQSDLASDPAVTGENSR
DRHRTRVRLGVYPWISEELTGGLQFSTAGAGDETTSRNETFGDQFVPDQL
YLNEAFINFHPKAFDSKVNIILGKREVANTMVVLSDLVWDGDLTFEGMTL
QYGKDENGKNKDGWNAMLGYYPLNEINDLKEVKAQDAYLLAGQVAYKGKT
SAVTYHMGAGYYDYTHFDVSNKKVNASQLAAAPYTYTAAKSATYSPEYDY
TGKDFNIIELFGTVGGKLTENTPWTLTLQYAFNTAKQDAKHINIDDDERT
SYLAGVKIGDAKNVGQWAVGADYVRIEKDAMTVLTDSDRNGGTATNLEGM
KLGVTYHMVKNMTVGATYFNFNTIDNDATAVDESATKRHTLMLDTVVKF
>Cag_0801 Ribosomal protein L19
MNQLIQLVESSLGANRFEQVRPGDTVKIQLRVIEGEKERLQAFEGVVISD
RGEGGSKTITVRKISHGVGVERIIPVNSPNIESVTVLRHGRARRAKLFYL
RKRTGKAALKVKERKTQVNA
>Cag_1672 O-succinylbenzoate-CoA ligase
MDILAHAAARFGNAPMLYLPNRVLSFHQCNEQAAAIAARLQSKGVRAGAI
VAILSPNTPELVLLLLALLKSGIIAAPLNHRLPQPLLTRMVERLQPCLLI
SDIDAPHISNINNHLSFSSLLDGITYSNTSQSGCPLLSAESEVTRNNFPA
DAMQQPVTIIHTSASSGEAKAALHSLANHWYSALGSNHNLPFASGDCWLL
SLPLCHIGGYSLLFRALLSGGALAISAPHASLSNALSNFPLTHLSLVPTQ
LYRLLADSENGEQFRSIKAILLGGSAAPASLIEEALRRQLPLYLTYGSTE
MSSQIATSFKPLTTLQLDSGTVLPYRQVAVSDDGELLVKGECLFLGYLRD
GEIAPQRDGDGWFHTADVGTLAADGTLTVLGRKDNMFIVGGENIHPEEIE
RALLQIDSIHEAIVVPAPDAEYGQRPVAFIATTHLNEPTDAALMQQMRLF
VGTLKTPQRFYRVTEWELLSGSQKINRRFYKEFVLHSIT
>Cag_0654 hypothetical protein
MWLAERDELVTMFQRDVVEVKPLRNRRLVLTFCDGLVATLCLDDIVYHYK
GVFLPLLDAAYFNQVAINRDLGTIVWPNGADVCPDMLYAVASGKPIVCE
>Cag_1283 conserved hypothetical protein
MSNILVPLTTANGRQSIVSEHFGSAPYFAVVESESGKCSIIENGGCHHPQ
GECSHGDVFAQHQATVLLCKGIGGRAASRVEASGVAIYVVPQANSLDEAL
QLLQNGSLQQFAAGDACRGHNCH
>Cag_1123 conserved hypothetical protein
MNQYNPNIHHRRSIRLKDYDYTQVGLYFITICCQDRTCRFGRIENGEMIL
NEHGKIAHNEWMKTREIRPNVELGEFIVMPNHIHAIIRFLRRGELHSPNN
NVVFDTPLPFDNGGVFKTPNNTGECNSPLRSPSQTVGAIVRGYKSSVTKQ
LGLMGFTEKLWQRNYYEHIIQNEQSYQTISEYIINNPAKWQDDKFYVE
>Cag_0333 type II restriction endonuclease TdeIII
MSLNQQQIQKVETVLRNSLRHKFQNYNPEPAVMPFHTRLLGKDRMALFSF
IHSLNTNFGTTIFEPVAQALALSRFGSVELQKVAGNQISLQAQQVIQEIM
DGLTTATNLPCKSQEIEAIRAVCQSGEMRKVKPTKVDVKLISYDGTLFLI
DIKTAKPNKGGFQEFKRTLLEWVAVTLATNPEATIETFIAIPYNPYEPKP
YSRWTMRGMLDLEAELKVAEEFWDFLGGEGTYPQLLDCFERVGCELRSEI
DAYFAQYINS
>Cag_0063 hypothetical protein
MKPLFDFLLKLCILSLVLWGAILYALDPSSVDYYSIFIAWVMMFSNTLVG
YLLFEYAIDKDSVVFNKIVFGGLALRLLALMVLVAIFIVGKLVAVNDFVF
SVFAFYCIYVVVEILGYQKKNKQKKN
>Cag_0474 phosphoglucomutase/phosphomannomutase family protein
MQIKFGTDGWRAIIAKEYTFDNLKIVALAASRYFLSHPNRAKGVCIGYDT
RFMSKEFAEYTAQIFSSQGLRVFLSDSFVSTPAVSLYTRDKELAGGVVIT
ASHNPALYNGFKIKAHYGGPAHPEVITEIEDYIAQVNPATEIVTEPKLIE
MVDMKGFYISHLKASIDLQLIRDSRIKIAHNAMFGAGQDILSRLFDESML
SCYHCSVNPSFGGINPEPIPQYTTDFVDFFKEIECDVAIMNDGDADRIGM
LDEQGNFVDSHKLFAIILKYLIEERHLPGEVAKTFALTDLIDKICQKHNV
VMHEIPIGFKHVSKLMTTNTILIGGEESGGIGIPSFLPERDGIYIGLLIL
EMMTNKEKSLSQLVQELYDEYGFFSYNRLDMRVSEEKKQAIMARAAQGDL
TSIAGYNVLKFGDLDGYKYHFEGGWLLIRASGTEPILRIYCEADSAEKVE
KVLAFASKLA
>Cag_1055 Filamentous haemagglutinin-like
MNRIFNVIWSVTREKWVVVSEKVKSNGSVPKSSLVSIAFLSALLGGGSVA
QAVEPGQLPTGGVITAGSGSIATNGNSMTIQQSSQKMVANWNNFNVGSDA
SVRFQQPNASAAALNRIAGQNPSQILGSLSANGRVFLINPSGIVFGQNAR
VDVGGLVASTLDISDYDFLAGNFAFRSTGSAGTLRNEGLINAMPGGVVAL
LSPSVINNGTITAVGGSVALAAGNQMTLDFGGDGLMTVRVDDGAVNAFVE
NNSLIKADGGLVVMSAKAANNLAFSAVNNNGVVQAMSVVEKNGRILLDAE
GGQSTVSGTLNASSVDGKGGQVVVTGKQVMIADGAHLNASGLTGGGDVLV
GGSWQGSDASVRQAVGTVVMPNTLLQANAISNGNGGTVVVWSDVNNPLSV
TRAYGTFEAFGGTNGGNGGRIETSGHWLDVAGSRGGASAVNGNAGVWLLD
PYNVTISSSNANGSWGGVFPNAIWTASGDNSNLLASDITTRLNAGTSVTV
QTGTAGSQAGDITVDGAINMTNDSGEVSLQLDAAGSIAINNNITNSTGTL
HLVFNSGTGAISGTGALGSGQGRTLFNVGASTGTFSGIISGASRTVTKQG
AGTLIFSGANTYGGLTSIEAGVLRVANAQGLGDVTNGTQVSNNGALELSG
GIVITGDEVLRLVGTGVSNSGALHSIGNNSFGGHIILTGNSTITSDTNGT
LILGNASQGIYGAYGLTLSGGGSVVFNGAIGATIPLASFHGLTGTSIELN
GGSITTTGVISALGQVKATNPLTLSSGISDISLSNETNDFTTVTVTNAGA
VSLIDDTALTLAGVNASGDVNIATHTGNLTVTGNVATTSATPTALTLNAD
QSKDAGNGNGENLILSSGTLTVGSGGIAKLYTGSVAGSTSIASVVNAGHF
RYNSDEAVQHYTDPLTAGLNLIYREQPTLSVMFAPVTTTYGTTPTFAISS
YSGYINGDTSPGIVTGTPTWLVDGTPSFAGYYTAGTHNVSYNNGLISSLG
YGFVDNAISFNDLVVNPLVLAATSLTGLTASDKIYDGQITATISNYGTLT
GILTGDRVALNSAGSSAAFADKNVGTGKTVTVSGLTLSGLDNGNYRIVPQ
TTTASITQKSLNVTAPSNVTKVYDGTVAAPGVATVTGLAIGDVVAGTATI
EYADKMAGSNKVVNPLSVTILDGFDMIMTNNYAITYVGDHGTITQAPLTL
TAPDNVTKYYDGLLTVPGTPSVNGLVPNDVVVIPASLLYTDPEVGIGKTV
NPDSAGLVIHDAIGNNMTPNYAITDIASHTGIIVEKTFTPFKKWNDADPS
VPEIPTNAPEVTGSRDLAGSDFEPATDSGVTATRSLTMATMDESAVQSDI
VVKLAEPASKNKQGVVKVFVPKEVFAKPAFLFPLPEEVAVEINKTNVQEK
VFMQNGDALPGWLSYDYEKKIFTATSAPAGSLPLTIMVQSGTMAWQVIIQ
Q
>Cag_1713 Ribosomal protein L35
MPKMKSHRGACKRFKVTASGKIKRERMNGSHNLEKKNRKRSRRLHQSTLL
EGTKAKQIKQMIQG
>Cag_0559 phosphoribosyl-AMP cyclohydrolase
MNNSLESTKQLLDVVKFDSNGLVPAIVQDVESGKVLMMAWMNRESLAMTL
EKKVACYWSRSRKELWLKGETSGNMQQVHDVLIDCDGDTLLLKVSQKGGA
CHVGYHSCFYRKATENGSLEICDTLMFNPEEVYGKKS
>Cag_0053 UDP-N-acetylmuramoylalanine-D-glutamate ligase
MDVAGKKVTVLGGGKSGVAAALLLQQLGATVLLSEHGALSSEAMQRLQAA
HIAYEANGHSEQIYSADFCVLSPGIPPTAPVVQQMEAHSIPLYSEIEVAS
CFCKARMVGITGTDGKTTTSTLIHTLCEADGKRHGYRSYSVGNSGIPFSS
MVLAMQPNDVAVIELSSYQLERSISFHPQVSLITNITPDHLDRYGQNMQR
YAEAKYRIFMNQQAGDTFIYNQDDSMLQAAFGASQIAVPCRSVAFGLEPL
TNVQLDKRRVLVNGNMVVVRQNDGALQPIVAVDEVLNRAFRGKHNLSNVL
AAVAVGEALGIGSEVMRQALTAFGGVEHRQELVATIDGVEWINDSKATNV
NAMRQALEAVPAPMILIAGGRDKGNNYATVSHLIERKVCLLIATGESREK
LASFFKGKVPVIAVPTIDEAVAIAHQQAKAGESVLFSPACASFDMFNNFE
ERGAFFKQCVRQVL
>Cag_0547 collagenase
MTPLKSQPNHSIPQAELIAPAGDMTALVTALQAGADAVYFGAEGYNMRAG
SNNFTNSDFATVRALCSKHNAKAYLALNTIIYDSELKQMRQSVESAKAAG
IDAIICSDMAVIEACRQAEMPIHLSTQASVSNYNTLRFFAEQGAAMVVLA
RELTIEQVRHITRNIQHDSLPVRIECFVHGAMCVAVSGRCFLSQELFGRS
ANRGQCVQPCRRSYIITDPEENEELELGADYVMSPKDLCAIEFLDVLLDA
GISAFKIEGRSRSPEYVHTTTTAYRQALNMCMQQRHQADFRNRYSALTAS
LKHDLATVYNRGFSNGFYFGKPMEAWAQTYGSQATEKKTYIGDINKYFPK
AGIAELHIRARGLKQGDKLSILGVKSGMVTVIADSFLTNDQPNTEAIKGD
SVTFKCPPVRKNDKVYVLEERK
>Cag_1237 ArgK protein
MPHHHTFDVEAIANAIMQGNRHQLSRAITLVESQRIEHHHVAEAILERCM
ASNRHALRIGITGSPGAGKSTFIEAFGEHILSQGLRLAVLAIDPSSHHSK
GSILGDKARMEKLSGRKEAFIRPTPSSGHLGGTSPRTHEALLLCEAAGYD
VIIVETVGVGQSELHIEQMVDFVLLLMLPGSGDELQGIKRGIMEIADMIA
ITKCDGLQATSAAISHAEFEAALRMVPKRHPFWQPSVQLTSAVTGVGIAE
VWQQIERFFAIMQQENSLETQRREQRRHLLANVLEEQLRRLFFNHPTIRQ
QQPHLVQQVLDGTLSPFTAATRLIELFRHNPIGEKQ
>Cag_0173 carbamoyl-phosphate synthase, medium subunit
MPKREDIKSILVVGAGPIVIGQACEFDYSGTQACRALKEDGYRVILVNSN
PATIMTDIEFADATYIEPITPAYVQKIIEREQPDALLPTMGGQTALNIAV
SLAESGILERNGVELIGAKLRAIRKAENREFFSDAMKKIGLEMAKGVFVR
NEKEAREALEEIGLPIVIRPSFTLGGTGGGFAETKADYYDAVRRGLNESP
IGEVLVEECLIGWKEYELEVIRDLADNVIIVCSIENVDPMGVHTGDSITV
APAQTLTDRQYQELRDASIRIIREIGVETGGSNIQFAINPKDGRIVVIEM
NPRVSRSSALASKATGFPIAKVAAKLAVGYRLDEILNDITKTTPASFEPV
IDYCVVKIPRWDFEKFKNVDARLGVQMKSVGEVMAFGRNFREALQKSLRG
LEIGRAGLGADGKDVMNVLDMTPEQKQFAKHDILEKIKIPKADRMFYLRY
AFQAGATIDEVYQATGIDPWFLDNILQIVEMENELLQLAAAE
>Cag_1803 ATPase, ParA family
MGRVIAIANQKGGVGKTTTSVNIAASIAISEFKTLLIDIDPQANATSGFG
LEVNDEIDNTFYQVMVKGGNIEDAIRPSSLDYLDVLPSNVNLVGMEVELV
NMRDREYVMQKALKEIRNRYDYIIIDCPPSLGLITLNSLTAADSVLIPVQ
AEYYALEGLGKLLNTISIVRKHLNPKLEIEGVLVTMYDARLRLAAQVAEE
VKKFFKDKVYKTYIRRNVRLSEAPSHGKPALLYDAQCIGSKDYLDLAQEI
FEKDGNIAKFKVRQS
>Cag_1208 Pyridoxal phosphate biosynthetic protein PdxA
MNRMTIVCSMGDPHGIGPEVVMKSVITLGALGEVGRVVVAGSMRVMEFYR
NLLQLPLTLQPIERVEDIANLPTTFDGVLPVLSVAEPQNPITPGVISAEA
GRVAMEAILRGTQLCQSGVCDALVTAPIHKEALAKAGYTECGHTGLLGRL
TGVASPTMMFYDRLTGLKVSLATIHEPLSRVPKLIRTMDLDNFLLKLATS
LSVDFGISAPRIALLGLNPHASDGGVMGSEEAEFLIPAIQRLSPTLSIEG
AFPADGFFGAKLYRNYDMVVAMYHDQGLLPFKVLAFETGVNVTLGLPIVR
TSPDHGTGFDIAGKGVANSRSMEEAIRLAHVIARNRMK
>Cag_1791 conserved hypothetical protein
MGTLLLIGFATIAPTTQSIAADNHIAVSASASIPVKPDMAEFTVIISADA
KQADKAATEVAEKYAAVQASLRKAGIAADDAPTTAYTVAPRWEWNGTLQK
NVLKGYSARHTLKVKVRSLSAIGQAVDAAVQAGANEVQEVHFVVSRYEAF
RQQALEQAVAKARADAAIMAQAADCKLGTLLEASVTQQSNMPRPMYDAMT
LRVAAAPKAETTMVAAEQEVEVTVHSRWQIRPLVGSK
>Cag_1291 hypothetical protein
MPKHVAKEFIPSLTNKDNIMQAIEFESTIHNGIIQLPNECQQWNKKLVKV
IVLEKTRASIISKPRRMPHPAIAGKGKTIGDLLEPIVNKSDWECLQ
>Cag_0705 hypothetical protein
MRFNMLYSIESNSQITHIPHHKDFTSWRKRLSDDEYQAIVDDLNSRIDGT
EIQTSSWMPGSDWRGTVFQPIYEKACRYSKESSAKFFGLIVWKVFMDRPE
WWAFGRYEKDGIQISGITYFKIDPQT
>Cag_0272 hypothetical protein
MHDNPEKSLESVTKLALQFLAEAIGQCPAGLEQSTNQDVVFAVVGFQYGA
VQSAAYVAGLGIEAWNSMAGEVIGRLNGIEKEKVAQFLSVMPMLARKKYP
PISIGGQAIMRFYNATSEEEKLTAAASLREILRQIDEGN
>Cag_2031 Protein of unknown function DUF37
MSIWKIINAIPIVLIRLYRTFLSPLLGPSCKYVPTCSSYALEAFERHNFF
YALWLTIWRILRCNPFSKGGYDPVPPLQGTQQSSSHHQESSHHG
>Cag_1358 probable cb-type cytochrome c oxidase subunit III
MHDNDIPHEGHNKIPIGWLLFFAGTIIFLVGYIALYTPAISGWSYYKGFE
KEMAEAAKAQKSATVKSYAGNQAAINEGKELFASSCAACHNADATGGIGP
NLTTATLKYGATEKELYTSIADGRPNGMPPFGQQLGADNTSKIIAFLETL
RK
>Cag_1673 Yeast 2-isopropylmalate synthase
MMNYRKYVPYPPVDLPNRSWPSKQITKAPIWCSVDLRDGNQALPVPMSVD
EKVAMFQLLVSIGFKEIEVGFPSASATEFAFARRLIESNLIPDDVTIQVL
TQAREHLIRKTFEAISGAKHAIVHLYNSTSTLQREVVFRKNREEIKAIAI
EGTRLVRHLKEESPNSAGIRFEYSPESFTGTELDYALEVCHAVMEEWGAS
ATNKVILNLPSTVEMSTPNIYADRIEWFCRHLTNRDAALLSVHAHNDRGT
AVATAELALLAGADRVEGALFGNGERCGNMDVVTIALNLLTQGVNPELNF
SDLPRIREVYQRCTRMEIHPRHPYAGELVYTAFSGSHQDAISKGMKAQCA
CSQGIWAVPYLPIDPQDVGCNYEAIVRINSQSGKGGVAFVLEKEYGIQIP
KWMQPDFAEVVQAMADRTGIELTPEQIHDLFQKEYVQAHEPVVLKKCHIR
WDEAAAPKQAEDDTTITCVVQAKERELRFEAQGNGPLDAFVRGFMRESGM
EFSVEEYAEHAIGRSSGAQAIAYIKIVDNEGKVAFGAGVDSNISLASIKA
TVSALNRLP
>Cag_1540 two component transcriptional regulator, winged helix family
MFIVKRLVNPQFLLLMADNKKIIVVEDDSDFLDSIVEYLTLSGFDVTGVK
SALEFYYSISQNNYLLVVLDIGLPDQNGLVLADYVRNNTDIRIIMLTAQS
SLETKITAYKAGADIYLVKPIDLSELAASIQSIIGRLENTTHSLHSEPIS
NEREIIHEQPSALWKLLRSNEALYTPEKNEIKLTSKEFYLLEMLASSPNK
IVQRQTLLETLDYGNDEFGNRALDALIYRLRRKKGDRNEKLPIKTAHGAG
YYFSSPIVIA
>Cag_0308 hypothetical protein
MSEDKRHRTKFYSIEDMSGGHQLSKAETLLDNFNAAADFSLNDLLEFYNI
KLYFDKNLFLTTWTEDKKKSYKANVELAYNQLKERIYKVTDESLEQGLTD
IEYNFYDDYWSLLNNINDLKKITDTILTEVLGKHPRQISYILKEKKLVDK
YDKTIRNFLIGYSDSAQILLSSLEEKDTLGRREKNHFPKSLTLNDKETII
NSYLDTDEPNLNYVRLIENSRDTSEFKLSDKTRLKAKKKSKELNDKVLEN
GHTWNVGVGVSLSKDQKEPFKFKSEGTTFEASYSESFIDMLPTDLDLFLV
FKYLFFYTDEKNLISLVSKQSELDVLERTFMKSKNEYEIGFSFFRKENLS
NLQLYIFSDYLKRKNKSLESIINSYVTSLNESISPNKIVFKLPLNESSYL
EKIRTLTPDFEFFLKQYKLLIDEKSIDLELIQISSTPIRISEIYSNKTKK
YLYSDDNLILQLKYLFFSDQSHLYYVKHFENKYHCLFDLLTNENVKLEYF
ENYQKDTIQSLINDGYLKLSNDEYVTLNKEILIYLIGELHHKEVISYWNY
PVQCRPVMDELIEKKLVIAENTLLSRQERNYYNFYLNKKEFTNGYDLRNK
YLHGTNAFSEKEHEFDYHRLLKLIILTLLKIDDDLHAEK
>Cag_0642 NADH-plastoquinone oxidoreductase, chain 5
MHSLIQLSIVVLLLPLLSFVLLIFFNRRLPRRGDFVGIGILGTAFALSAY
IFWSVIVQANDPSFKVAWDFTWIDFGNVPGVGPLQIKMGIVIDNLTAIML
AMVTLISLLVHIYSTGYMAGDKNYGRFYAYLGIFTFSMLGIVLSDNLFSI
YMFWELVGLSSYLLIGFFFEKESAADAQKKAFLANRVGDIGMWLGILILY
SQFHTFGYEEIYANLAAGKFTLSNAWLTAAGILLFMGCVGKSAQFPLHVW
LPDAMEGPTPVSALIHAATMVAAGVYFVARIFVILTPDALHLIAFIGAIT
AFMAATIAITQHDIKRVLAYSTVSQLGYMVLGLGVGAYSAALFHLVTHAF
FKACLFLGSGAIIHAMHHEQDMRWMGGLRKQMPWTFTTFTLATLALAGLP
LTSGFMSKDAILAGALGFAEVEGGGIFYILPALGFFSAMLTAFYMGRQIW
LVFFGESRTHLKPADNHHHGHDAHAAHGDHHDEHDAHHGVHEVSWNMRAP
LVVLATLSVFFIYSPDPLDGAKGWFMKLVETPATVVGDSNPQQGALKAGI
AATPHATLLHGAEAENHAANGADVTHNAANTTEATAHATMEGHGEPHGGH
TYADPRQAEIIHAGHAAHYTAIYISSLMVVFGISLALIVYVFRIIDPDKT
ASAIRPLYLYSFNKWYWDEIYQATFIKGSLVIAKVLSWFDATIVDGIVNS
TATLVRKISTLSGGIDKHVVDGMVNFTAFAVNTTGALFRKLQTGKVQTYV
VMLLVIVFGYFVVYLSGFIY
>Cag_1785 conserved hypothetical protein
MKQPLKIAHISDIHLSGANDRSHAARLTRLLQHLRNEQFDHLVITGDLGN
HADPDEWRVVQQLLKQTEWYHWERCTILPGNHDLMNLEEEMRLYNALNPI
QWFRQKAFQRKRQLFCELFYEIMGGKNQTFPFLKILNYPTLRLALVALDS
VAAWHPSTNPLGARGFIEPQQLTALQQPQIAEALRSCVVIGLCHHAYKVY
GTDSLIDQAFDWTMELQNRDAFFSLMQQLGASIVLHGHFHRFQSYQKEGI
TFINGGCFRYNPYRYSELLLEADGSFQQQFLSLEEK
>Cag_0491 Peptidase S26A, signal peptidase I
MKNNNKSQGQGEKKQSREWFDALIIAALIATLLRVFVIESYRIPTGSMER
TLLAGDFLFVTKFEYGAKVPFTNFRLPGITEVKRGDVIVFKFPKDRSLNY
IKRCIAMAGDTVEIRNREVLVNGVVQPLPPEAQFLASMEPSGVEDVMIFP
PFSGFNKDNYGPIRVPRKGDVIPLNMRSFPLYNALVSDEGHEITMQAGNV
FVDGIMVDSYTVEQNYYFAMGDNRDNSLDSRFWGFLPESDLVGKALMVYW
SWNPDVSLLTNPVEKISSIRLNRSGLMVH
>Cag_1547 S-adenosylmethionine synthetase
MSQKRYFFTSESVSEGHPDKVADQISDAVLDDFLRQDSTSRVACETFVTT
GQVIVGGEVTTKGIVDVQTIARKVITDIGYTKGEYMFEANSCGILSALHS
QSPDINRGVDRKEEIADEFDRVGAGDQGMMFGYACTETPELMPAAIQFAQ
QLVKKLAEIRKEGKMMTYLRPDAKSQVTLEYENDIAKRVDAIVISTQHDP
EPAGISEAEWQEIIKKDVIENVIKVLIPANMIDENTKLHINPTGRFEIGG
PHGDTGLTGRKIIVDTYGGAAPHGGGAFSGKDPSKVDRSAAYAARHVAKN
IVAAGLAEKCTVQVSYAIGVARPVSIYINTHDTAKHGLSDNQIQEKAEAI
FDLRPAAIIKHFGLDKPNGWCYQQTAAYGHFGRDIFPWEKTDKVAELKAA
FKLA
>Cag_0493 Anthranilate synthase component I
MAAPQHTAPFHLKPLFREVHADTETPVSIYLKLKRPFSCLLESVEGEEQL
ARYSYIAIEPVAIYKGTVDGASSLEVLDRRFDGLHQALEGVDAMREQIDC
LLAQFSTDDMPRSKNGQPHMITSGVFGFFGYDAMHVVEKIPAPAKADPAH
LPDVMLLFCDTLVIFDNVMRKLFIVANHLHEGDKAAALQRIDEIAEQMFR
PLTRDEITFRPEVPETVQSNTERDAYLHKVAIAKEHILAGDIFQVQVSQR
LCRRLHTRAFDVYRMLRTINPSPYLYYFEMKDFEIVGSSPELLVKVERDS
KGRRIVDTRPIAGTRRRGLTYEEDEANMRELLMDEKELAEHLMLIDLSRN
DIGRIAKIGTVDTNEMMVIEKYSHVMHIVSNVRGELRDELGAMDAFWSCF
PAGTLTGAPKVRAMEIIYELEEEKRGLYGGAVGFLDFQGNLTTAIAIRTM
VVENSTIYFQAAGGIVADSKPESEYEETMNKMRAGLTAVEKLQTL
>Cag_0397 protein tyrosine phosphatase
MLTSILILCTSNSCRSQMAEGFLRSFNKSLEVYSAGTKPTEAVHPLAIKV
MHEEWIDLSRNKPKSVDEFLNKPFDYVITVCDGAKESCPIFSGQVKHRLH
ISFDDPAEAIGTQAEQLAVFRRVRDDINQRFRQFYNEHISGK
>Cag_0624 thiol peroxidase
MAQINLKGNPIETVGTLPVKGTEAPFFCLVKTDLSEAGPADFASNKLVLN
IFPSLDTPVCAASVRRFNQEAASYPNTVVLCISADLPFAHNRFCEVEGLK
NVVPLSVFRSPEFGQQYGVTITTGPLKGLLSRALVVVDADGVVRYTEQVP
EIVQEPDYDAALKVVASL
>Cag_1150 Thioredoxin
MSGKYFVATDSNFKAEILDSNKVALVDFWAAWCGPCMMLGPVIEELAGEY
EGKAIIAKLNVDENPNTAAQYGIRSIPSLLIFKNGQIVDQMLGALPKNMI
AKALDKHLA
>Cag_0380 DEAD/DEAH box helicase-like
MSFQTILEKYRRISFSERDKGNRFERLMQAYLQTDRQYATQFKKVWLWNE
FPGRHDLGGSDTGIDLVALTHGGDYWAIQCKCFEASATIDKASLDSFLAT
SSREFKNEQMQTVRFAERLWISTTNKWSSNAEEAIKNQNPPVTRITLQNL
VNAPIDWEKLENGVHGEFARREKKKLYPHVLEVRDKVVDYFKEHERGRLI
MACGTGKTMTSLKIAEKLTNHKGTVLFLVPSIALIGQTLREWTSQADETI
NPICICSDPEITKKKNTTDQDLTSTIDLAWPASTDANYILKQFQHYKNKS
NNGMTVVFSTYQSIEVIAKAQKVLLKNGFSEFDLIICDEAHRTTGYTEPG
MDDSAFVKVHDGNFIKSKKRLYMTATPRMYNVDARSQAAKQAIPLWSMDD
EEYFGKEIHRIGFGEAVEKGLLTDYKVIILTLNDKDVPPAVQKMISNGKT
EIKTDDLTKLIGTVNALSKQFLGNESIIVDGDELPMKRAVAFCQSISNST
TIAASYNLASENYLDALPENKKAKMVTIQAQHMDGTMAAPQRDQMLNWLK
EETSGNECRIITNVRVLSEGVDVPSLDAVLFISAKNSQVDVVQSVGRVMR
KSDGKKYGYIIIPVFVRSDEEPENALDDNERYKVVWTVLNALRAHDDRFN
ATVNKIELNKKRPNQIIVGGADTAFDGDGNPIDKRRDGYDQSKEIGQQIA
IQFEQLQDVVFARMVQKVGDRRYWEQWAKDVAVIAERQIERINYLINEKK
EQRAAFDKFLLGLQKNINPSINEEQAIEMLAQHIITQPIFDALFEGYSFV
KSNAVSVAMQSMIDALEKGSNLAEQDETLQRFYDSVRKRAEGIDNAEGKQ
RIIIELYDKFFKTAFPKMVEKLGIVYTPVEVVDFIIHSVNDILKKEFNRT
ISDENIHILDPFTGTGTFIVRLLQSGLIDINDLERKYKHELHANEIVLLA
YYIAAINIENAYHDAISGYRNLGLGFGEENLVTHRYLNTNANFQRTNCLA
GSDEFGRDDLQNNKELSERGDVWLDESNKESSEFNSGKHSRRIWEKEQGR
ISTISGNSERITYGVRDTCFDSTENSNSECDGNGNNIGTNSNSGKIIDSS
SEISNTQTLNPIPYEPFDGIVLTDTFQLGETKEGEIQYEEMLKKNSDRVE
KQKKAPLRVIIGNPPYSVGQKSANDNAQNQKYEKLDARIAETYAAGTNAT
NKNSLYDSYIKAFRWSSDRLSKEHGGIIAFVSNGAWLDGNSNDGFRKCLE
KEFTSIYVFNLRGNARTQGELRRKEAGNVFGGGSRTPIAITLLVKKGKKD
A
>Cag_0035 Geranylgeranyl reductase
MQYDVAVIGGGPSGAVAAAMLARAGFSVVLLERNLDNVKPCGGAIPLGLI
EEFNIPAELVEKKLTRMKARSPKGRVIDMHMPNGYVGMVRREKFDRYLRE
EAERAGATVVEGLVSSITSNGDAFVLKTLNDKVAPLQARRIIGADGANSK
TADELRFPPNELKVIAMQQRFHYTPSIEKFSDIVEIWFDGEVSPDFYGWI
FPKADHLAIGTGTEDRKHNIKALQKRFVEKIGITDKPYLDEAAKIPMKPR
RSFTQEKAILVGDAAGLVTPANGEGIFFAMRSGKLGAQAMIEHLKQGKPL
SNYEQEFRKLYAPIFFGLEVLQVAYYRNDRLRESFVAICADDDVQRITFD
SYLYKKMIPAPWSVQMKIFGKNIYHLIKGS
>Cag_1226 conserved hypothetical protein
MNTKEENLSNIDTRMKLKLIELNGFKAVDGQSGQNIPLGDITILLGANGS
GKSNVVSFFKMLNFLTTSALQNYVGKQGVSQLLFYGPKVTDSMSFILHFS
SENATDTYEVTLSHGLPDRLFISSEKVTFKKNGSEQPQEYYLESGGTEAG
LGKDKRKTSKVVYTLLSGIRAYHFHDTSDTAPIKDRRYVDDAKYLRHDAG
NLAAFLKMLKEHDEYMRYYHRIVRHIQRVMPQFGDFSLDSLPGNDKYVRL
NWTDSYGRDYLFGPDQISDGSLRFMALTTLLLQPPELLPKFIVLDEPELG
LHPAAIVELAGMIRMVAQKTQILVATQSTRFIDEFSPDDLVIVERDNANR
CSIFKKLNAEQLQEWLKSYSLSELWEKNVLGGQP
>Cag_0078 Elongator protein 3/MiaB/NifB
MLHPAIESAYRVLATGEPLTHGEALQLANLSGSDLLDLASLAHKVQLRYA
GGVESFHACSIMNAKSGVCGENCRFCAQSKHNNAAIEVYDLVSDDSVLIE
ARQAVANGVSHFGIVTSGYGYQRITPEFERLLGMIDRLHSELPSLNVCAS
FGVLGEEPVAALAAHGIAHYNINLQVEPNRYNDLIATTHSAEERIATIHR
LRQHNIAVCCGGIIGVGESVEERIGLLIALQALDLAVIPLNVLVPIAGTP
LEKSEPLSAAEVLKTFAICRLLHPRTIIKFAAGRETMMKDFQGLLMLAGV
NGFLTGGYLTTRGRDVADDQRFAAQLASFA
>Cag_0176 conserved hypothetical protein
MKKVWRTIALFLFCASPLHGEEVATVPTPPSIQKATQPHPYVNTIRISGN
KAISEEELRLIISTRAHKSFFGSGLFGGAKKAFNAEEFERDIFLIKKLYT
YKGYFAAEVDTTITRLSKGKKVNLAIRIKEHQPARIDTLRYFGLEKVPER
LEKKFLAESRLQVGKIFSVEQLIEERDRSLNFFREQGYTFFHEDSIRVKV
DTVGTHAGIAMNFHLPERLQYAPIHAVVQNSRRNEKKPREQTFQLEGIHG
KVIGRQRINPELITTAVAFRQGQYTSQSKEQRTLQNLGATNVFSSLSITP
DSVRSGLLYTTISLEAAPKHELAPKILVDNRYGPLFFGGSMAYENKNLLG
RGEQLRLSANYGTQIEKKSSLLSNLAPSEYDAFNPYDFSVKSTLVRPVSK
QNGNYYSTTIEYATTKQPVLLSNRNALIRATYNAKLGSTSRLNFDFFDVE
WVQKDSLRGFKPLFQKELATNIGIDPTNDAAINAGIDSLVSTHFNQTFRL
RYQSKSKPRSETQIGRTLWNTDLLLEESGSLAWLVDRYLDTSRRNGFTSN
DPQIFGTAYSQYLKLESGVSFVNLSTTNSQFAGRIRAGWMAPYGKATATP
EERRFYAGGANDLRGWIFGTLGPGKNRSEATANFGANIKLTTSLEYRLKF
FRFLNQPSGITFFTDAGNIWDNDGTYGFNSKSLFRDMAWDAGAGLRLGSP
IGPLRFDIAYRLHDPTQAHPWQLKHLNGSDYTFTFGIGEAF
>Cag_2014 ATP synthase F1, beta subunit
MQEGKISQIIGPVVDVDFPEGRLPSILDALTVKREDGSKLVLETQQHLGE
ERVRTVAMESTDGLVRGMGVVNTGAAIQVPVGAEVLGRMLNVVGDPIDGR
GPVNSKKTYSIHRSAPKFEDISTKAEMFETGIKVIDLLEPYSRGGKTGLF
GGAGVGKTVLIMELINNIAKQQSGFSVFAGVGERTREGNDLWHEMMESGV
IDKTALVFGQMNEPPGARQRVALTGLSIAEYFRDEENRDVLLFVDNIFRF
TQAGSEVSALLGRMPSAVGYQPTLATEMGQLQDRIVSTKKGSVTSVQAIY
VPADDLTDPAPATAFTHLDATTVLSRSIAELGIYPAVDPLDSTSRILDPN
VVGDDHYNTAQAVKQLLQRYKDLQDIIAILGMDELSDEDKLVVSRARKVQ
RFLSQPFFVAEAFTGLAGKYVKLEDTIKGFKEIIAGKHDKLPENAFYLVG
TIEEAIEKAKTL
>Cag_1013 nitroreductase family protein
MNTAITELEREALYKVIYSRRDVRGQYLPELVPDDVISRVLDAAHHAPSV
GFMQPWDFIVVKDVALRQAIKEGFDLAHQEAAAMFPLDKRDTYRSFKLEG
ILEAPLGICVTCDRSRSGSVVIGRTANPEMDLYSSVCAVQNLWLAARAEN
LGVGWVSIIHHDHIRNVLGIPEHIVPVAYLCMGYVSFFHETPELQQAGWL
PRLELETLVHHETW
>Cag_0005 Glucosamine-fructose-6-phosphate aminotransferase, isomerising
MLNSTISVAKQKGRVKELESEANALAPATIGIGHTRWATHGEPNHRNAHP
HLNKAGDIALIHNGIIENYSLLKQELQAEGYTFVSDTDSEVLVHLIDRIW
QRDPSLDLESAVRQTLRHVEGAYGVCVISSREPDKIVVARKGSPLVIGLG
KDEFFIASDGAPIVEHTNKVIYLSDGEMATVTRHGYCIKTIENIEQYKEV
TELDFSLEKIEKGGFEHFMLKEIYEQPTVMHDVMRGRIRADEGKIMLGGI
ADYLDKLKHAKRIIICACGTSWHASLIGEYLIEEFARIPVEVEYASEFRY
RNPIITSDDVVIVVSQSGETADTLAALRTAKEKGALVMGICNVVGSTIAR
ETLCGMYTHAGPEIGVASTKAFTAQVMMLYMLALLLGKGRTIAQSELSLS
LRELAALPEKAARILELDSQIRQIADRYKEARNVLYLGRGYNFPVALEGA
LKLKEISYIHAEGYPAAEMKHGPIALIDEDMPVVIIATRDNTYAKILSNI
EEVRSRKGRVIAIASEGDQEVKRLAEEVIYIPQASNAITPLLAVIPLQLL
SYHIATLRGCNVDRPRNLAKSVTVE
>Cag_0757 Cobalamin biosynthesis protein CbiB
MITYHFMQSELLLLVAFVLDLLIGDPRWLPHPVQGIGWFALQVERWLRRT
PLPLRLAGVVAMLLVVGGTATLAFFSITLATMFHPIAGVAVSLYLLYSSF
AVKDLGDHAEAVRQPLAAGNIEAARHRVSYMVGRDTAALTAEGIALAATE
SVAENSVDGVTAPLFYAILFGPIGAITYKAINTLDSTFGYKNERYLEFGW
ASARLDDFANYLPARLTVLAVALAAALSRLRVADVFRAVREGAHLHASPN
AGYPECACAGALGVTFGGARSYGGEVHNAPLLGIRAATCTPQTIKQCIRL
MRLTAFIFLLVGIGVKMVIAG
>Cag_0203 conserved hypothetical protein
MSNNHSTNQRILLLLLLVISALFFTMIRYFLLVVVLAAIFSALAMPVYNR
FERGLRGKRSLSAIMTLLTLLFIVVLPLAILLGLVVKQAIRLSNVAVPFV
QEQLLTPSQFDHHLQSLFFYPELVLYREEILQKVSELATKFGTLLFNAIS
SFTYSAVTEIVLFFVFLYTMFFFLRDGKQMLQSMLALLPLSHTDQYRLLD
KFLSVTRATLKGSLVVGMVQGSLAGMALYMAGIESALFWGTVMSFLSLIP
VLGSALVWIPAVIYLATIGSYPQALGVLLFCMIVVGQIDNIIRPILVGRD
TQMHELLIFFGTLGGIGMFGFFGVILGPIVAALFTTIWEMYAESFGDYLS
TIQKNRTSTLKD
>Cag_0291 Survival protein SurE
MTHHDAQPSTDAEQSSNATLPHILICNDDGIEADGIHALATAMKKVGRVT
VVAPAEPHSAMSHAMTLGRPLRIKEYQKNGRFFGYTVSGTPVDCIKVALS
HILTEKPDILVSGINYGSNTATNTLYSGTVAAALEGAIQGITSLAFSLAT
YENADFTYATKFARKLTKKVLAEGLPADTILSVNIPNVPESQIAGVIIAE
QGSSRWEEQAIERHDMFGNPYYWLSGSLQLMDHSMKKDEFAVRHNYVAVT
PISCDLTNYAALAGLEKWKLKK
>Cag_2030 Oxa1/60 kDa IMP family protein
MDKNSVTGLALIAVIMLVWLQFMTPAQKVQPPQQVATTEQQVASASLPLP
AALSPSTDTFGLFATASQGSEQVTVVENDLFRATLSSKGATLKSLVLKKH
LDGHLQPFDLLGKQKNGHLSLLFLTKDGKRIDTRDLYFRNVTLETKRTIS
GQERYTVRYRLDVAPQKAIEIAYLFSGESYAIDYDVKLIGFGNDIAGNEY
QVQWDGGLAYTEKNREEESQNALAGAYLGGSLVKLDAAKEKEVFREEQSG
EATWVGVRNKYFTAALIPQSKSNGIYLEGKREAGNHFENYLAALKMSLPA
SATEVHNTFTMYVGPLDYNTVKAQGVGLEKIMDFGWDWLTRPFAEWMILP
VFNWLNGFISNYGIIIIIFAFLVKLVTYPLSMASTKSMKKMAALQPVLQE
LQVKYKDNPAKMQSELSRIYREAGVNPVGGCLPTLLQMPLLFAMFYVFRS
SIQLRQHGFLWAKDLSVPDSIFDFGFAIPLYGDHIAFFPILMAGTVYLQQ
KITPTAQPNEQMKIMLVLFPVMMLFFFNNMPAGLGLYYLMFNIFSVAQQF
YINKTTTADDMPKVNLAPVASNASKKQKKGGAKK
>Cag_0527 cell division protein, putative
MSLLFVVKEGFSGLGRAKLPATLTVIISFFALVLLGLFGSVSLSFFEVLQ
ELRSRVEVEVFVADSLSEPQMMEASARIKELRGVQEVRVISKDEAATLFQ
RDFGEDIVRILGSNPLPRSLKVKLQPEYATPKQLNHMVPIFRSIVPESDI
RYNQSFLGQVEENARLFTIITGGTGLLISFATIVLVGYTIRLAMYARKEK
IRTMRLVGATAWFIGAPYVIEGALQGLLAGGLAAGALYLLFEQVLAVYEP
SLYQIVQASSLYVYPAVAVLGLLLGILGSLLSVSKFLRAASKGAH
>Cag_0528 AP endonuclease, family 2
MKRVGAHVSASGGVEQAPLNATAIGAKAFALFTKNQRQWKAPKLSKATIE
AFQKACADGGFQPQHILPHDSYLINLGSPDPEKLERARSAFIDEMQRVAD
LGLQLLNFHPGSHLKEISEEASLLLIAESINMALEATNGVTAVIENTAGQ
GTNLGYRFEQIAFLIDRIEDKSRVGVCLDTCHLFASGYDLSSTEAIETTF
NEFDSTVGLHYLRGMHLNDAMQPLGSRVDRHASLGKGTIGMAAFTFIMNH
PACEEIPLILETPNPDIWSEEIALLYSLQQVD
>Cag_1407 peptidase, M50 family
MAGEGTTYSESWHLVANLHLKLRSTVEVQRQFYRGEKWYVLKEPFTNRFF
RLQPAAYEFIMRLSPEQSVEEVWLNCLKEFPDQAPGQGDVVSMLGQLYAM
NLLETDVSPDTRQMFARYSQNRRKEKFAQLLNILFIRLPLFDPEPLLQRL
SWVIHRVISIPGAIVWFTALLVAAKLLTDNSAGVFDSAQGVLAPGNLFLL
YIALVFVKSVHEFGHATVCHRFGGEVHTMGVMFMIFTPLPYMDATSSWGF
RSRWERALVGAAGMISELFIAAFAAIYWALSGEGALHSLAYNVMFVASIS
TLLFNINPLLRLDGYYILSDLIDIPNLHNRAVEHLKYLLERYVFHYSGAT
PVAKNRREEIELTLYGVLSSIYKVVIFLGITLFVADKWLILGTLLAITGV
VTGFFLPLYRFVRYLFFDARLYSCRPRVASATALFLLIFLVVTGLLPFPR
NVSAPGIVESYPDVKVVNLSAGSVVEYLVTPGQRVVAGQPLVRLENPELD
FALASAESRVTEIRVRERLALGKRIADLKPVQQELAAVEAELAKLQRDKQ
ELVVRARAAGQWIPATRESIAGMWVERGRYLGSIVGGSRFRFAAVVPQEE
ASELFRAPVAHIEVKLWGEAFRSIRTESVTLVPYQHEKLPSAALGQQAGG
SVAVVQNSRQEADNAAEPFFLINALLEEQSGVHLLHGRSGTIRMSLSPEP
LLVQWSRRIYQLFQKRYQVS
>Cag_1418 phosphatidylglycerophosphatase A
MKRTVAHLLATCFGIGHAPVAPGTVASFAALLIYQLVPTLQQPLVLLPLV
ILLTVAGIWAGSVMEELFGNDPSIVVIDEWVGQWLALLLIPHGWIPALLA
FGFFRLFDIAKPWIIDRAQRLPHGWGIMADDVLAGIAANVSVQLLMVGAA
LL
>Cag_1710 NADH-dependent butanol dehydrogenase
MNFTFHNPTQLIFGAGTLLQLGEIVRTYGKKALLVTGGGSVKRNGTFDRA
VASLNAAGIAVVECSGVEPNPRLSSVVKGAEIARAEQCDVVIALGGGSTM
DAAKVIAAAVFYDGDLWDMFPHGQAERRLPTRALPIITVPTLAATGSEMN
GGSVITNEQTTVKSFVIGKCLYPQVALVDPELTVTVPKDQTAYGICDLIT
HVTESYFNGVDGTPLQDRFAEGVILTAMEWGAKAIANGHDLEARAQVQWA
SVVALNGWVQVGTFGAFPVHMIEHTLSAHHDIAHGAGLAIVNPAWMRFAA
QSRPTKFVQFAERVFGLSSKDADDIDVTMEGIERFEHFLRSLGCPTRLSE
VGIGDELLARYAEDTLLVAHDDQGRLPARPSMSNADIIEVLRSAL
>Cag_1987 Protein of unknown function UPF0079
MREEFFSTSESETLLLAERFAAALPPRSVVALLGTLGAGKTLFMRGICRA
FHCEAQLSSPTFSLMNIYEGELNGQAVSVHHFDLYRLESERELEAIGFDD
YLTSADLSVVEWADLFPHYKGRYTATVLLEYAGERERRIIIERGN
>Cag_0619 NrfC protein
MIIDTRVCVGCSACVYACKSENEVPVGYCRDWVVQETNGNFPKLSMENRS
ERCQHCDNAPCVTYCPTGASHYDEFGTVQINRSRCTGCKACMAACPYDAR
YVHPDGFVDKCTLCHHRLEKGMKTACVSICPTGSLHLVNFANLTADDKAM
MAGKEVYRQKAHAGTEPRLYWISASNKPGA
>Cag_1968 Glutamate-1-semialdehyde-2,1-aminomutase
MPQITRSIELFEKAKKFIPGGVNSPVRAFKSVGGTPIYMAKGSGAYMTDV
DGNTYLDYVGSWGPFILGSMHPRITAALEYTLKNIGTSFGTPIEMEIEIA
ELLCQIVPSLEMVRMVNSGTEATMSAVRLARGYTGRDKIIKFEGCYHGHG
DSFLIKAGSGALTLGAPDSPGVTKGTAQDTLNATYNDIESVKLLVQENKG
NVAAIIIEPVAGNTGVIPAQPGFLAALRQLCDEEGIVLIFDEVMCGFRVA
LGGAQSLYGVTPDLTTMGKIIGGGLPVGAFGGKRKLMERVAPLGDVYQAG
TLSGNPLALTAGLETLKILMDENPYPELERKAVILEEGFKANLAKLGLNY
VQNRVGSMSCLFFTETPVVNYTTAITADTKKHAKYFHSLLDQGIYTAPSQ
FEAMFISSVMTDEDLDKTIKANYNALVASQQ
>Cag_1042 conserved hypothetical protein
MKKISLYLTALLLTVATPLHAEEEQVCTTQADQKGSLTITITNFRAIKGS
LGISLYNTKKGFPSGYEQAYANQIKKVNGSSECVTFQNIPYGVYAVSVVH
DENENGKLDTTFIGIPKEGVGASNNPKMSFGPPSFNDSKVLLNKDVLEVM
VSMKYF
>Cag_0359 Acetyl-CoA carboxylase, biotin carboxylase
MFKKILVANRGEIALRVMQTCREMGISTVAVYSTVDAESIHVKYADEAVC
IGPALSRESYLNIPRIIAAAEVTNADAIHPGYGFLAENADFAEVCASANI
KFIGPTAAMINSMGDKNTAKATMIAAGVPVVPGSPGLVTDAKDALAIAKK
IGYPIIIKPTAGGGGKGMRVVTEDSQLEKALSTARSEAEQAFGNSGVYIE
KYLENPRHVEIQILSDQHGNTIHLGERDCTVQRRHQKLIEETPSPAVDEA
LREKMGTAAVAAARAINYEGAGTIEFLLDKHRNFYFMEMNTRIQVEHPVT
EERYNVDIVREQILIANGQSIANRNFTPQGHSIECRINAEDPEHMFRPSP
GQLQVFHTPGGHGVRVDSHAYASYVVPSNYDSMIAKLIVTAHTRDEAIAR
MSRALDEFIVMGIKTTIPFHKQVMNSAVFQSGDFDTSFLESFHFDKK
>Cag_0625 Lumazine-binding protein
MFTGIIKDVGHVASVASRQGGLRLRVNYSKPEEFRDLAVDESVSINGACQ
TVVACGNGWFEVDSVEETLAKTTLGLLRNGTVVNLERAMRPLDRLGGHFV
LGHVDCASTVVTIRELGSSREIWIALPEPFRPYIVLAGSITLDGVSLTVA
KLEEDRFAVAVIPFTFANTTINNLQVGSPLNLEFDILGKYVARQLQGGTP
MPNSSVSSITEGWLHEHGF
>Cag_0323 conserved hypothetical protein
MIDSSLTQKLMASLDCDEKEALRLLKNCGAAMVHYILASKKIAIKGLGVL
TVRHIPLKKERQASGVTFVPPSNNLVYERREVGEGDIARLAISALSLSEH
QAIRFSEVLASYFTAAFTAKQEVALPALGAFYADADGLYGFHVAPSFTAL
LNREYCDLADIVVPVGNRWGLWQERFRALRPAFITVGAVGVLFTASLLLY
RWFSEHPLQIVVPSTLSSAKAVKQSLHAVAATASSMLESSTERPVTVLPT
TPSFADSLQLERGAYAVVLATFQTERTAYEQVAVMRQAGIEAFVWPVFME
GSRYSRIMTGMFTTREAAEAHLKMLPEAFIKGAYVQKAKRNVVLYAKKRV
>Cag_1172 chlorosome envelope protein I
MNLIVNDIACTATVGQTLGSAAWKNHSHVGAVCGGRGVCQTCYVTVLEGS
EALSPLSEVEQAFLSPRQQQAGGRLACQTRIERDGTIRILSRPEQVQRLV
LTDWAGFMAFNATMIQDSASQIVSGIQNLVGRILRGEVKPLEDVK
>Cag_0179 DNA polymerase A
MAMLYRAFFALQRTGMSSPSGLPTGALYGFTTALLKIFENYHPHYLVAAF
DSREKTFRHHLLESYKANRAAPPEELLQQLEKLFELLKAFGVPVIKQAGY
EADDLIGAMVTQFADVCRIGIVTPDKDLAQLVREGVQILKPGKNQHELEP
LGCNEVKAHFGVPPKQFTNFLTLTGDTSDNIVGAKGIGPKTAATLLEKYQ
TLDKLYQHLDELTPKVRKSLEDFAPNRELVLQLVTICCDAPLHVTLEELA
CKNPARDVVLPLLQELGFRTIAARLQAASVALTCACNDGGESAPPMQSDP
NSSNLLNGSDGNTSATDTAPPPSFPDVPRHYTLVETREQLQALLEELQQV
THIAVDTETTSLDVFEAELAGISLCAEAGKAFFIATTPDALERKEVVKQL
KPLLENPAITKSGQNLKYDMLVLKKYGIELAPISFDTMLASYVLNPDEHH
NLDDMALRYLGRTTTKYDELTGTGKQRRHIFEVEKEALTNYACQDADVAF
QLEEVLQAQLQAEPQLLALCTTMEFPLVRVLATMEYAGIAIDTEHLARVA
ETTELELQSLTDNIYAAAGSSFNIDSPKQLSHVLFTDLSLPTGKSTKTGF
STDVGVLEELAATYPIASDLLSYRTLQKLKGTYIEALPKIINPRTGRIHT
SFNQHITATGRLSSSNPNLQNIPVRTALGKEIRRAFIPSTPEHWLLSADY
SQIELRIAAELSGDERLIAAFRNGEDIHTATAQVIFGTEEISSDMRRKAK
EVNFGVLYGIQPFGLAKRLNIPQKEAKVIIETYKAKYPQLFNVLRHIIEE
GKEKGYVTTLLGRRRYIADLNSRNGTVQKAAERAAMNTPIQGTAADIIKC
AMNLCYQQMQASGMASEMLLQVHDELLFETTDSEKEALTKLVENAMKEAA
VLCGMKQVPVEVDCGVGKNWLEAH
>Cag_1541 metal dependent phosphohydrolase
MGIVINLLLLVLAALVAFVAGFFIGRYFLERIGTTKVLEAEERAVQIVQE
AQKEANEYKELKVSEVNQEWKKKRREFEQDVLIKNNKFAQLQKQLQQREA
QLKKQSQDVRDAERKLQDQRKEVEQLSDSVKLRATELERVIVEQNQRLES
ISNLQADEARQMLIDNMVTQAREEASNTIHRIHEEAEQQATRMAEKTLIT
AIQRISFEQTTENALSVVHIQSDELKGRIIGREGRNIKAFENATGVDIIV
DDTPEVVILSCFDPLRRELAKLTLKKLLADGIIHPVAIEKAYADATKEID
DVVYSAGEEVAASLQLNDIPTEVIALLGKMKFHTVYGQNLLQHSREVAML
AGVMAAELKLDARMAKRAGLLHDIGLVLPESDEPHAITGMNFMKKFNESD
QLLNAIGAHHGDMEKESPLADLVDAANTISLSRPGARGAVTADGNVKRLE
SLEEIAKGFPGVLKTYALQAGREIRVIVEGDNVSDSQADMLAHDIARKIE
SEAQYPGQIKVSIIREKRSVAYAK
>Cag_0693 sepiapterin reductase
MQHIILITGAGKGIGRAIALDFAKATSPTFQPVLVLVSRTLSELESLAAE
CHALGAETHLCAADISNLQQIDAMVNDVVARYGTIHCLINNAGVGRFKPF
AELTPDDYEFVMDTNLKGTFFLTQKVFPFMEKQQLGHIFFITSVAAETAF
STSALYCMSKFGQKGFVEVMRLYARKCNVRITNVLPGAVLTPMWGEVPDA
MQRVMMQPEDISQPIVQAYLLPQRTSIEELVIRPVAGDINE
>Cag_1534 conserved hypothetical protein
MPNERVGRSEFDRRAIFDLHCKNERDEYFIVEMQQAKQDYFKDRSVYYAS
FPIQEQAQRGEWNFKLTPVYMVGILDFTFDDTDNDVYHLVQLADQATGKR
FFDKLSFIYLELPKFTKTVDELTSGFDKWCYLFKHLPYLTDRPARLQEKV
FLKVFELAEIAKYTPEEAREYEKNLKVYRDLKNVIDCAYGEGKAEGLEEG
LLKGRLEVAQRLVASGMSKAEAASFAGVSVEML
>Cag_1515 Porphobilinogen deaminase
MKKQLIIGTRSSPLALWQAEFTKAELSKHYPDMEITLKLVKTTGDVLLDS
PLSKIGDMGLFTKDIEKHLLAREIDLAVHSLKDVPTSTPEGLIITSFTER
EDTRDVIISKGGVKLKDLPPNARMATSSLRRMSQLLSVRPDLDIRDIRGN
LNTRFQKFDDGEFDAMMLAYAGVYRLNFSDRISEILPHDMMLPAVGQGAL
GIETRVDDEQTREIVRILNHSTTEYCCRAERALLRHLQGGCQIPIGAYAT
FSNGTLKLLAFVGSVDGKTALRNEITKTGLTSPEQAEEAGIALAEELLKQ
GADAILSQIRKTC
>Cag_0519 hypothetical protein
MLTVSYLTMQSLSIFLSSTCYDLKSLREHLHSEIAKLGHDPILSDYPSFP
VSPDLSTVENCKKVVRDRADVFVLVVGGKRGSLDLETERSVVNSEYREAR
AAGLDCIVFVDRQVWDLRHLFQKNPQADFSPTVDFPEVFGFIEEIENDTK
WIFQFHRTDDILTILLQQLSIRFRDLLLRHRTNRLLVPSEFAYERSDIAR
IVQDKDSLWEYRLTSALLADRMERLESKFNDLDSGYIVKRTKFLPARDTL
NYIQDLISDFTNVIKASVKVLEHQLTPAFGPPGVAGDAIQIRRACDNLFS
LFLALYEWEMDIRFVRPHELFENLFLRMHGWTSELLQDFRRIPKELDELL
AVPNLTGEHNIQLVINAPAGLALLTVEFERMSHDPDVVAALAGG
>Cag_1174 GTP-binding protein HflX
MNTVTPENQREKAFLVGIYSPPEVPRSLVEEYLAELAFLADTAGADVLDT
FIQERKVRDPSYCIGRGKVDEMEAYIKSEKIDVVIFDDDLSPGQARNLER
AWGCKVIDRTGLILHIFAIRAQSTQAKMQVELAQLEYILPRLSGAWTHLS
KQKGGIGNKGPGETQIETDRRLVRNRIALLKKKLREVERQHYTRTRSRQN
VSRVSLVGYTNAGKSTLMNALCPQAEAFAENRLFATLDTKTRRLELKINK
LVLLSDTVGFIRKLPHTLVESFKSTLDEVLQADFLLHVIDISHPSFEEQI
AVVRDTLREIGVQHDQIIEVFNKIDALEEPTLLREMGDKYPNAVFISAVR
GINLSLLKEIIGEQLARDYTERHVRLHVSNYRLISYLYDHTEVVEKKHED
EMVELTIHVRNHQLPQIDAMIQAAAEAPHEP
>Cag_0542 Pseudouridine synthase, Rsu
MYMKQEKSNNDGMRINRFLAMCGIASRRAADQLVQEGRVCINGKIVTTPG
VKVNPSRDEVIIDGKRFAQPENRKVYILLNKPRNVITTSSDERQREMVLD
FIDVPERIFPVGRLDRKTTGVILLTNDGELAHKLMHPSSGVKKEYIAELS
ERFPAAMLGKLTNGMRLKDTGEKVSPCRAKLLDDGMRVLLSIHEGKNHQV
RRMFDTLGFEVKRLDRVAYAGLQAGELRRGEWRYLSRNEVRELYRISQEE
A
>Cag_1796 Ankyrin
MKILEYTGFDSSSVAESYRKVATALAQGDFRAAQVKKLVNLTHGKFYRAK
LDAANRLLFTFVRYGDEVCLLMLEVIMGHNYHKSRFLRGAPLEEEKIPDV
DASEALNDAEQLRYLHPNHTEIHLLDKPISFDDAQQAVYLHKPPLIIVGS
AGSGKTALMLEKLKHVEGEVLYVTHSQYLAQNARNIYYAYGFEHPAQEAH
FLSYREFVESIRVPTGREATWRDFAAWFYRMRSNFKEIDPHQAFEEIRGV
ITAPEDGCLSRKNYLQLGVRQSIFSKEQRSILYDLFLKYRHWLTDSGLFD
LNLIAHEWKASPRYDFVLIDEVQDMTVAQLSLVLKSLKKAGHFLLCGDSN
QIVHPNFFAWSHVKTLFWKDPNLAGKKQLQVLTANFRNGREATRIANQLL
KLKHQRFGSIDRESNFLVEAIGGAEGQAQLMADTDATKREFNKKISHSTR
FAVLVMRDEEKQEARKYFSTPLLFSIHEAKGLEYDNIVLFRFVSSCRREF
NDIAEGVSLTDLEAIDSLEYCRAKEKGDKSLEVYKFFINALYVALTRAVK
NLYLIESDTKHRLFELLGLAVAGKVEVAAEESSLEEWQKEARKLELQGKQ
EQAEAIRRDILKEVPPPWQVCNETRLDELIHKVFKEKAPGNKFKQQLYEY
ATCHVEPVLAQALEKQTDYRSPHGSFWEHLDTIGRKSYLPYFSQQTKAIL
RQCEQHGPNHRLPMNQTPLMAAAAAGNIALTEALLERGADPTLNDHYGYN
ALHWAMRQAFRDNRFARTTFGTLYERLAPAAVDISSGERMIRLDRHLAEY
LLFQTCWVLFKSRFTTLELNGEYPAFDTSLILEAWEHMPDNVVPTERKRR
TYLSSVLARNEVSRNYAYNRSLFERLATGWYQFNPALHVRTSVTEEGQSP
WIPIFQAVNLPLISKFCHSHTIATIVQCFRKACMAVIPELEAEIAQQQAT
KAAKEQHLQTLVKQVKKKITPSSDSLAAKLLKQHKLSKKLDDELLVPFLK
FVREKELEEIRQQKMKKKLEREERQQIKAAEQAKRDEQVQQQLGFDF
>Cag_0683 hypothetical protein
MNDGIVIPTSQFSIINKNPTHSTHPLIAKNLVQDKRQPQGLPLHFYNSPP
LEGCPKGGVVFRANSNCGQKGLPQRAAPYFPLSTQYFSTPPFSINVIRYK
KRLRVRVIRGIRGEWFVRGLFGVARVHHNTQHYPFYTSYVYKQYNERLYS
HCFCGLHENCYGIRS
>Cag_1788 3,4-Dihydroxy-2-butanone 4-phosphate synthase
MHTAIDSIDAALEDIRQGKLVIVIDDEDREDEGDFIGAADLVTTEMINFI
TREARGLLCVAVTMERAKELQLDPMVQRNTSQHETNFTVSVDAIAEGVTT
GISVYDRTMTIKMLGDPSTKADDFSRPGHIFPLRAMNGGVLRRVGHTEAA
VDLAHLAGRSPVGLLCEILNEDGSMARLPELIKLKEKFGLKLITIKDLVA
YQMQRNALVKRAVESRLPTAYGEFKLIAYDSFIDHHNHIAFIKGDVSTDE
PVLVRVHSQCATGDTFASLRCDCGHQLASALTMIEKEGRGVLVYLMQEGR
GIGLVNKLKAYNLQDEGLDTVEANEKLGFKADLRDYGIGAQILKDLGIRK
MRLMTNNPKKIVGLEGYGLEIVERVPIEIAPNAVNESYLQTKRDKMGHML
GCSCSSTASHTHK
>Cag_1451 UDP-N-acetylglucosamine pyrophosphorylase
MALAVLIMAGGKGTRMKSDLPKVLHQANGRPLIHYVLETAATLNPAKTLL
IVGHKANDVQQATAHYPATALLQEPQLGTGHAVMQAEAELRNFEGETIIL
SGDAPLVTTATLQAMLALHHAEAATATLLTAELDDPTGYGRIVRVNNSSS
IEKIVEQKDATPNEQTIREINAGVYVFNTRWLFEKLGELNTNNAQQEYYL
TDLFSICFKTGKKVCAYKTATPNEILGINTPAQLQQIEEILKKGGVV
>Cag_0833 hypothetical protein
MLTISTTYDVQSDRLVTIKLPQEVYPGKHELLIVVEQQKKEKRIGTTIAN
SIMRFAGTVPAFRSLDGVSFQQSIRMKWE
>Cag_0096 Ribosomal protein S18
MKPRATSSHGYKAQGNKALNGALTSKKKVSKNQVVFFDYRDERKLKRFIN
DQGKIIPRRITGLSAKEQNLLTHSVKWARFLAVIPYVVDEYK
>Cag_1471 conserved hypothetical protein
MQHFSQSKRRFVLFWHTTAALAVLMLLYPLIGNFFIAWFVGAELERGALL
QPEMLRSFVPSLRIVQMVGQVLLLAFPALLLAGWQRGCRAPWSPEVRMWM
GLQPPFDGGVLVAGAGGIILLQPLLSLIAALQERYLWPALGEAGREVLQQ
QESMELFLRTIADAHSLPEALFVLAVLAVTPAITEELLFRGYVQRNYMQV
LSPAMAIVLSGLLFAFFHLSAANLLPLALLGCYIGYIYYCSGNLFVPMVA
HFTNNALALIVLWFTPQEHTMAMQAERIALLLSPAWWFLVVGCTLLFWRL
MRWLTDRIVRH
>Cag_0888 alcohol dehydrogenase, iron-containing
MPSIPFSPFITLPLPELYCGAGTVKHLPELAARFGRTVLLITGKRALQQS
GRLMALQAAMQQGGLKLFHEVIEREPSPEIVNRVVAEYRQYGVEVVVAIG
GGSVLDGGKAISGMLCYHEPVERFIEGLPNMVPFDGRKVPFIAVPTTAGT
GSEVTNNAVISRVGDNGFKRSLRHRALVPNIAVVDPELMVGTPASLTIAS
GMDACTQLLEAYTSPFATPYTDAVALSGLNHFSRSFVAACGSGTNDVAVR
GDVAYAACMSGVALANAGLGVVHGFASSVGGLIDIPHGVLCATLLYEATR
ENIEALRQLESNHPQLQKFSCAGSIMVRGVGMEVSSAPSIEEGCELLLAL
LEHWQERFNFARLQNYGLQLHHIPVLVQATRTKSNAVALSSTAMERILHN
RL
>Cag_0228 Magnesium protoporphyrin O-methyltransferase
MSGSSFNAEEHKNMLRSYFNGQGFQRWASIYGDDKLSTVRNTVRQGHAVM
MDKAFDWLEKLRLPKGATVLDAGCGTGLFSIRLAKAGYKVKSVDIASQMV
EKARAEATKQGVAGNMDFEVNTIEAVSGTYDAVVCFDVLIHYPAEGFAQA
FRNLSNLTKGDVIFTYAPYNKILAFQHWLGGFFPKKERRTTIQMIRDDEM
KRVMQSLNRRVKSQEKISFGFYHTMLMHVSHK
>Cag_1196 adenine specific DNA methyltransferase
MSMTIQEYVAALNRRYKTGIATEHSYRADLQALLSDLLPKVEVTNEPRRS
ACGAPDFILTRKNIPIGYIEAKDIGKSLDNKLYREQFDRYRHALQNLVIT
DYLEFRFYREGDLVTSLVLAEMHKGKVVIRQENVAAFADLMQDFGGYEGQ
TIGSASKLSKMMAAKARLLADVLEKALDGYSDENNGAIDEASNTLYDQLK
GFRDVLIHDITPKQFGDIYAQTIAYGMFAARLHDPTLENFNRKEAAELIP
KTNPFLRKLFQYIAGYDLDDRLVWIVDALADIFKATDVNSLLKDFRNATQ
QNDPIIHFYETFLAEYDPTLRKSRGVWYTPEPVVNFIVRAVDDILKTEFD
LRDGLTDTSKITVEIDKATTDKNFKSKHIKQKQEVHKVQILDPAVGTGTF
LAEIIKHIHKQFEGQEGMWNNYVSHHLIPRLNGFEILMASYAMAHLKLDL
LLAETGYTSTTDQRFRVFLTNSLEEHHPETGTLFASWLSQEANEANYIKR
DTPVMVVLGNPPYSGHSANKSKWIEELLRDYKQEPNGGKLQEKNPKWLND
DYVKFIRYGQYFVEKNGEGILGFINNHSFLDNPTFRGMRWHLLSTFDAIY
LIDLHGNAKKKEACPDGSSDKNVFDIQQGVSINLFVKTGKKKKGALAEVF
HYDVYGDRPFKYDFLSKKSLSTVDFTKLTVAAPNYLFVPKDFSVLASYNQ
GFAINELFSLNSVGIVTARDRFVIDSSKLALTQRIKNFFSLDKDELLRMY
GLKENSAWRIHTIKQSTIRYSADFIQMLAYRPFDNRYVYYDPLFIERSRS
DVMKHFLQENIVLTVCRQVKAGESYHHVFLANKIFESCLISNKTSEIGYG
YPLYRYLNINNQIPTNGEPQRIPNLNAGIVNQIAIVLGLTFTPEKEETDG
TFAPIDILDYIYAVLHSPAYREKYKEFLKIDFPCVPYPQEPQQFWQLVAL
GSELRQLHLLESPTVEQRLTSYPIAGNDVIEKITYLDGKVYINEKQYFEG
VPLVAWEFYIGGYQPAQKWLKDRKGMALCYDDVRHYQRIIIALTETARIM
EEIDTVGVE
>Cag_1103 conserved hypothetical protein
MKSVRFEWDEEKNKANQKKHQVSFALAQHAFLDPKRIIAEDIKHSNEENR
FYCVGRVNEEIMTVRFTYRGNVIRIFGAGYWRKGRKIYEEQHQLYR
>Cag_0352 NusG antitermination factor
MSAKRKDAEIQSVPVPRWYALRIYSGHERKVKEAIEFEVARCGLSDKILQ
VHVPYERFVEVKNGKKRSLTKNAYPGYVLIESVLDKQTRNLILDVPSVIG
FLGVDDNPIPLRPDEVEKILEPEKDVEHRSVVDVPFNVGDSVKVIDGPFS
SLTGVVHEVCTERMKVKVMISFFGRGTPTELDFTQVKPLAQ
>Cag_1090 pyrophosphate-energized vacuolar membrane proton pump
MLLHHFPLQRIRSASTLLLLSSLFPQKLFAGEADLILPDLRSVTFLEPLG
GASGYTLLLWGLAICLFGILFGFLHYQAIRKLPVHHSLSEISELIYATCK
TYLVTQGKFILVLWALIAAIIVAYFSGLRHFEPVKVLFILLASMVGILGS
YALAWFGMRINTFANSRTAFASLTGKPFQLHEIPLRAGMSIGMLLISIEL
FVMLCILLFVPSDFAGACFIGFAIGESLGASVLRIAGGIFTKIADIGADL
MKIAFKIPEDDARNPGVIADCVGDNAGDSVGPTADGFETYGVTGVALISF
ILLAITDPIVQVQLLVWIFVMRLVMILASAFSYWLNHGVARLRYGNADVM
NFEQPLQSLVVLTSAVSIVLTYAASWLLISSLGDGSLWWKLASIITCGTL
AGALIPELVERFTSTESSHVRNVVQCSKEGGAPLTILAGLVAGNFSAYWM
GLAIVALMGAAYGLSELGLASLMLAPSVFAFGLVAFGFLSMGPVTIAVDS
YGPVTDNAQSVYELSLIESTPNITSDIETNYGFSPDFQHAKHSLEANDGA
GNTFKATAKPVLIGTAVVGATTMIFSIIMILTNGLTDTAAIEKLSILSPP
FLLGMLLGGAVIYWFTGASINAVTTGAYYAVDYIKQHLHLDGSAQKASME
ESKKVVAICTSYAQKGMVNLFLTIFFSTLAFACLDAYLFIGYLISIALFG
LYQAIFMANAGGAWDNAKKVIETEMDAKGTPLHDASVVGDTVGDPFKDTA
SVALNPIIKFTTLFGLLAIELAIELPQNISVTLAIIFFALSLLFVHRSFF
AMRIKVTD
>Cag_1021 conserved hypothetical protein
MKQLPVGIQTFNKIIEGDYLYIDKTDIAKNIIEKYQYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKDLFIYNQWNWNVTYPVIKISFSGGIRDKESLRR
NLVYVLKDNQKQLNITCEEKDDPNLCFAELIQQASEKYQQKVVILIDEYD
KPILDNIENIAEAIIVRDGMRDFYTKIKESDEYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPDFGNVCGYTQNDVDTIFAPYFEGVDMEEVKRWYNGY
NFLGDKVYNPYDILLFIKNKYVFDSYWFETGTPRFLIELIKKNNYFIPDF
LTLKVKKSIVNSFNLENLNLETILFQAGYLTIKRLISTNKGISYELRFPN
KEVQISFNDYLLQELTTISENELICDDLFDLFNNGDIANLEPVIKRLFAS
IAYNNFTNNYIESYEGFYASVLYAYFASLGFDMIAEDITNKGRIDLTLKT
LDKTYIFEFKVIAEEPLEQIKKMRYYEKYDGERYLIGIVFDPKARNVSRF
EWERV
>Cag_1412 Anion-transporting ATPase
MLSRDLAEAPSQTRVIIYSGKGGTGKTTISSSTAVALARQNKRVLIMSSD
PAHSLSDVFNTRISRNDPQQIEGSLYGLEVDTIYELKKNMAGFQKFVSSS
YKNRGIDSGMASELTTQPGLDEIFALSRLLDEAQSGKWDTVVLDTSPTGN
TLRLLAYPEIIIGGNMGKQFFKLYKSMSSIARPLNSSKSLPDDEFFEEIN
VLLKQMEDINKFILSPEVTFRLVLNPEKLSILETKRAYTFVHLYGINIDG
IVINKILPESTTVGEYFEFWSNLHSKYLMEIDQSFYPTPVFRCHLQRTEP
IGTDALHDISKMVFGEQSSDKIFYTGKNFWIESRKNEASDDLHEVLCIRI
PFLQDAEEVNVERMGTDIVVTVDRAQRIITLPRALYNLEMEKFVREDNCL
RVMFRELHPEKEESELNVNKNVLEKLRSMRRMKL
>Cag_0284 oxidoreductase, Gfo/Idh/MocA family
MSSNIKIGFIGSGWSRIAQAPAFSLMDNVSMAAVASPTADHRQKFMKMFD
VEHGFADWRDLLQCDLDFVCVTTPPFLHKEMVEGVLRSGKGVLCEKPFAL
SVADATEMADVATRTPLLALVDHQLRFHPAVRHMKQMIDGGEIGKVFEVR
AVVNLASRNKTDLPWSWWSDATRGGGALGAIGSHLIDLNRYLVGEIAEVS
CNLGTSIAHRPDGNGNLLPVTSDDHFAMMMKFGRSSVALGSSSLMHVTTV
GAYTWFSFEVVGSLKTIRLDGGGRLWEVYNDNAQRGRSLIDMPRWKQIEP
ILPWDELVLQEKIKQSSLAVHGIFAVGFAFLAHRIVKALQCGEKAILDAA
TFADGLMIQRVLHAGTESHQQRNWVKI
>Cag_0639 NADH dehydrogenase I, 23 kDa subunit
MSEYFSNIKSSAVTIASGMAITLKHFFNAVHRKGDAGLDDVDYFKQVDGL
VTLQYPHEVIPTPQHGRYRLYNNIEDCIGCGQCVRACPITCITMETIKVL
PEDLASCGKTSDGQQKKFWIPVFDIDVAKCMTCGICTTVCPTECLYHTPV
SDFSEFDRNSMMYHFGNLTKLEAEAKKKKVAEAQAKAQAEKAKKAAETNS
>Cag_1078 transcriptional regulator, AbrB family
MPITTMTSKGQVTIPKEIRDLLELHSGDRLEFTFEKWGRLVVSPVKKNVD
DLFGRLFDPTRKAFSTEAINDALRQKFQHDEQ
>Cag_2012 phosphoenolpyruvate carboxykinase
MSLMSSISLNLPDYIQHKKLIQWVQETADHCQPDAIHWCDGSQEEYDYLC
EQMVASGTFIKLSEEKRPNSYLCRSDPSDVARVEDRTFICSIRRQDAGPT
NNWVAPKEMKEQLHELFRNCMKGRTMYVIPFSMGPLGSPIAHIGVEITDS
PYVVTNMRIMTRMGRKVMDILGDKGEFVPCLHSVGAPLEPGQQDVAWPCN
STKYITHFPEERSIISFGSGYGGNALLGKKCFALRIASSMARDEGWLAEH
MLILGVESPDGEKTYVGAAFPSACGKTNFAMLIPPKSFDGWKITTIGDDI
AWIKQGEDGYLYAINPEHGFFGVAPGTSEKSNPNAMATLRANCIYTNVAL
TPDGDVWWEGMTDEKPERLIDWQGYEWTPDCGRLAAHPNARFTSPASQCP
SLDPAWEDPNGVQISAFIFGGRRGSTVPLVYQSPNWMFGVYLAATMGSEK
TAAAAGKVGEVRRDPYAMLPFCGYHMGDYFNHWLHIGRTLPELPRIFGVN
WFRKDENGNFLWPGFGENMRVLKWMIDRVRGNAGAVESPLGWMPKYDMID
WTGLEEMTEEKFAELMSVDREAWKKELLSHEALFEMIYDRLPKEFSHIRE
LMLSTLWMSPAKWSLAPEHYSADHDVEDEE
>Cag_0971 NolG efflux transporter
MTLTELSIKRPTLIVVFFTVLAILGLFSYRQLQYELLPKMTPPVVSISVV
YPGASPSEVETSLTKPLEEAVSALEKISSISSTSTEGLSVVTIEFDNSAD
IEQSLQDAQRKINEVSDLLPTEAKTPVISRFALDEVPVLRIGATSSLPDS
EFYQMLKDEVKPQLSSVAGVGQVYLVGGREREIRVNLDLDRMQAYGLTVT
DVVREVEKANSDFPTGAIEETERQFVVRLAGKFRSLEELQALIIQATPEG
NVQLRDIATVEDGFREITTLSRLNGRESVGILIMKQSDANTVDVSRLVRA
SLSKIEQLYSNRNLRFSVAQDASTFTLDAVTAVQHDLLLAVLLVAFVMML
FLHSLRNSLIVLVSIPTSMVTTFIGMYLFDFSLNLMTLLSLSLAVGVLVD
DSIVVLENIYRHLEKGEEPRHAAIVGRNEIGFTALSITLVDVVVFLPLSL
VSGIVGNILREFAVVMVISTMVSLLVSFTLTPLLASRFSRLSAFTGKNVA
ERFALGFEERFHAMREFYLRLLGWSLRNPIKVFLSAMALLVASFSLLVMG
VIGGEFISVSDRGEFAVKLELEAGTSLAETNRITRQVEHRLESLPEVRDV
MVTVGASSEGFFNQSSDNVAELNVRLSPKEERSRSTNELMARVRADLQHL
PGVTATINPIGIFGTANETPVAVIFSGPDRNEVARVAEALKARLSTVPGT
ADVVLTSKPGSPELRVVIDREQMAAFGLSIADVGAGLRVAYAGDEAGKYR
DGDDEHIIRVMLDAGYRNDVAMLSSMIFRTPEGDMVPLGQFATFEQERGY
TQLQRKNRNNAVWVKAQVIDRPVGDIGQEIERIIADMKRNGTMPSTVSYA
YESDLKQQRESNSTLGLSFMVAIAFVYLIMVALYDSWFWPMVVMFSIPLA
IIGALFSLALTGKSLSMFTLLGIIMLIGLVGKNAILLVDFINNARAEGMA
LTTAITEAAKERLRPILMTTFTLIFGLLPIALSGSSGSEWKSGLAVALVG
GLFSSMFLTLLVIPVVYVWFDRLHERVLALRQRIMG
>Cag_1891 putative transcriptional regulator, ArsR family
MSSSMTAETTKAHTPQLVIPDEMLDAVANRFKLLSEPMRLRILRILSEED
HTVQDIVRKINASQANISKHLTLMHDNGMVSRRKVGLKCYYSLADESIIH
ACNLTSHSIVSSLHNQLHWLQKVAPKSEE
>Cag_0894 conserved hypothetical protein
MIKAILENRTLKNMADKHLMTPSTTTLSGLLGNGLTYKVPLFQRDYSWKI
DNWTDLWEDIKILLNTGKDHYMGAVVLQKVGEKQFLIIDGQQRFTTLSLL
ALATIKKIQDLIDAGIESENNTERINELRRGFLGQKDPASLHYSSKLFLN
ENNDPFYQRNLLQLISPQNQKTLIDSNRLLWQGFDFFYKRIGEHFHNATG
AELATFLSKSIGDKTMFIKIEVEDEFSAYTLFETLNYRGVELTVSDLLKN
YLFSLITPSDLRIAKEIWKKIATSVGMENFPIFLRHFWISRKPLVRQEQL
FKTIRTEVASNQQVFDLLNNLDAYSEMYLALQDPYNINWQGNRERIKRIR
EMKLFGVKQQLPLLMVAKEKFSDQEFDKTLKVISIISFRYNVIGSRQANR
MEEVYNIVSQKIFNTTITSAQGVFNNLKELYISDADFRNDFSTLELYSYG
QSKKLARYILFELENNMMQGGDRDYELDPATIEHILPENAGSHWDIDFPQ
IIQPSYIYRLGNYTLLEDHLNRDCETLLFDKKKPFYVRSQYEMAKAINYN
EWNPLTLDARQSDFAKKATSIWRISYAD
>Cag_0498 Phosphate transport system permease protein 1
MDVKNKMVAESMSFYYDDYQALKNISIAFPENQVTALIGPSGCGKSTFLR
CLNRMNDLVPKTRMSGRIMLGDLNIYNPKIDVVDLRKKIGMVFQKPNPFP
KSIYDNIAYAPRIHGLVRTRQETDALVEDSLHKAGLWNEVKDRLNDLGTA
LSGGQQQRLCIARAIAMQPDVLLMDEPASALDPIATQKIEDLVLELKKRF
TIIIVTHNMQQASRASDYTAFFYLGELIEFGKTDQIFTKPREKQTEDYIT
GRFG
>Cag_0720 conserved hypothetical protein
MKKVIAFHIAESIDIKRFWKGFTDTENHMSSLDIFYTNDKDQYLYLLAYG
VVVFVGYDELKMSDMIDYLKPLCKNLLTEKMREEFIINTTTNKDAFEYNE
IHISNSNPNVIRIIMLNVAQSVALDYFSKLAEDLMIETTIYTQQLEKYGK
INISIKRLQMFIGKVLNIKNRIAENFYILDSPEETWEDEYLSKIDFGLRK
TFDVKIRFREIDYQLQIIRDNLDLFKDLIQHWKSNMLEWVIILLILVEVV
NLFVEKFSH
>Cag_1244 Nitrogenase iron protein
MRKVAIYGKGGIGKSTTTQNTVAGLAEMGKKVMVIGCDPKADSTRLLLGG
LQQKTVLDTLREEGEEVELEDIIKEGYRNTRCTESGGPEPGVGCAGRGII
TSVNLLEQLGAFDDEWNLDYVFYDVLGDVVCGGFAMPIRDGKAEEIYIVC
SGEMMAMYAANNICKGILKYADAGGVRLGGLICNSRKVDNEREMIEELAR
RIGTQMIHFVPRDNFVQRAEINRKTVIDYDPTHGQADEYRALAQKINDNK
MFVIPKPLEIEELESLLIEFGIAN
>Cag_0133 hypothetical protein
MENLMSKNAIIYGIRKLNVVERLNIITDIWDEIKDSQELEIVSENDKKVL
LDRLANYRANPEFATDWDELKQKIHDRYAD
>Cag_1342 hypothetical protein
MNTIDPIVDEIRMYRKEHAALYGYNLHTIVEVLRKKEQESKRIFLNPGPK
PLGIETSSTSVATMPHV
>Cag_1655 FtsK/SpoIIIE family protein
MATTIKQRLKNLFVRALDRSMIKRELAGIALMLGSLFMVSAIISYHPDDE
ALYSALRWFDVFSNPARDTADAIHNHFGLFGARMANFLIHFVLGYPVLLL
ISSFFFWGLSLVRARSLKPALFFFLYSVVMAIDIATMFGLTSLAFSDVMS
GSIGRMLAAFLITTIGFSGAWVLLLSVGLLLTFYMGRSFFIPAFHALMAM
VPRLSSLWDNIRARISAIQKKKPLQSP
>Cag_1942 Uroporphyrin-III C-methyltransferase-like
MSASKQGMESNGYVYLIGAGPGDPELLTLKADRVLREADVILYDDLVSSD
ILARYHAEKIFTGKRKDNHHFEQDAINDELVRQAKEGKRVARLKGGDPFI
FGRGGEELETLRQHGIRYEIVPGITAAHGAASYAEIPLTMRKLSSSVAFC
TGHPVTKIQVPSADTLVYYMVASSAHAVLEAVAASGRPDSTPVALVHNAT
RYNQKALFGTLGEWLRGERAVYSPALLIIGESISLAVADNWFQAKPKVLL
VGDDPALMLLSDAFFVSFTKAAQRETLVAEPLELHLFDALHFASVADVEA
FVEHYPSLPSTLRVTCATDQVMALYRQYYGG
>Cag_0484 Pyridoxal phosphate biosynthetic protein PdxJ
MRLAVNIDHVATLRNARGEFHPDPVEAALIAEQAGAAGIVCHLREDRRHI
KDDDLRRLRAAVTTKLDLEMAMTEELQAIALATKPELITLVPEKREELTT
EGGFDIVRHFDTLKAYLQPFRSAGIEVSLFIEPDKQAIELAAQAGADIVE
LHTGLYALKAGEAQEEELTRIGNAAAFARNIGLKVVAGHGLNYSNIAPFR
NIVEIEEVSIGHAIITRAIFSGLEGAVREMVALIR
>Cag_1184 conserved hypothetical protein
MLSRVAESLFWMSRYFERAENTARFLDVNFNLLLDLNKITHVDNPNYWIA
LILVTSDKERFNNLYDSYSAESVTDYLVFNKSNPNSIISCIGLARENARS
VIESISSEMWEQVNNLYHYLQSVTPQMVHNDPYIFYKEIKNASHLFQGIT
DNTFSHNEGWDFIQIAKYLERADNIARLIDVKYHMLLSDAHHQAEIVAGS
EDIIQWMAVLKSCSALEAFRKVYLAKIDPDNILRFLILDRSFPRSINFSV
CAAEDALWRISGSSRHRYANNADRLIGKLEAELIYTTVEDIHEKGLHDYL
EDMEQRLIKVGEQVHLTYFAYHTPEIEPDSSEEALPFTGVAGGRPNWSQA
QQQQQQQ
>Cag_0207 IscU protein
MLQAGEWAYSETLKEHFMNPKNILQGENTDDFDGVGMEGNLQCGDQMMVV
IKVDKEKETITDCQWKTYGCASAIASTSVLSEMVKGRTLEAAFNISPKEV
AAELGGLPDNKIHCSVLGDKALRAAINDYYTRTDQTDKVKKELAKIVCQC
MNITDHDIEDAVLEGARTYFELQEHTKCGTVCGQCKEETEELLAKYKHLH
FGL
>Cag_1073 Phosphatase kdsC
MLTLSQNELVRRAQAIRLVIADNDGVFTDTGVYYSERGEELKRYSIRDGM
GVERLRNAGIETCIMTGERSPNVQKRAEKLCMKWLYLGVKDKRSMLATLL
AETGMERHELAYIGDDVNDCGIMEEIAPFGLVAAPRDATRFVEPYLHYRA
TADGGHGGFRDIAEWLLELKNS
>Cag_1482 ATPase
MNLFLRLFRYLLPSKGKIALVVVVSMITSLLGVISIYSILPLLNAVFTAD
KTIEAPIKPQEEKKNNKEKVELTDTKQALGLQKFDSKKIKESATEWFQQS
FYAETKEKTLFNICLFLIASFAIKNFFSYLNGQLIFRVQTKTAKKLRDDV
FNSIIEMHLDYFNKNRVGNLMNLVYNDVQAVNNTINSTFVNFLQNPFSIF
VYVSVLIMLSWKLTLFAAFTSLLIFFVIRTIGKRVKGLATTFRSTMGDMN
SVLQEKFSGIKVIKSSAFENVEMQRFQEFTRGFRVLDIKIAQLRNIIGPL
NETLLVSAIALVLWFGGLQVFAGEMTSSELMLFAFTLYSTMGPIKMFGDV
NTQMALGMISAERLFELLDAQPDIKNGTRPINGFSHNICFEDVCFKYSKD
ADTPNVLDHVSFEIKKGEMVALVGQSGSGKSTTVDLLLRFYDVDSGRITI
DGVDIRELDYKQLRRMIGVVSQEVILFNDTIKANIAYGTHQEVTEERLMA
AVRMGNAHQFIEEKPEKYDTLIGDRGVQLSGGQRQRLAIARAMVKNPELL
IFDEATSALDNESEKVVQQAIDHALENRTALVVAHRLSTVRNADCIIVMD
RGHVAESGSHDELLLHGGLYKHLYDIQFVGKN
>Cag_0977 conserved hypothetical protein
MQPTALTTYYTVIYPQLASFFCALPLPHGWRLEAEHPTSDVWSFYDTFSW
HAFLQDVAIVKKGKILALYHMKSGVSNATISLPTTPTSFFASELPPCALQ
EELATCTTLRAYLQLCSVEVNNIAYRLLDNAHKTLAIINYLQFQLPEQSE
QDTLSIITVTPLRGYQKESGMVEALIKADEAIGTYSFNELYRSIMQAYNL
NVGAYSTKLRVTLDADAPAAISIAKILQWTFSIMQRNEEGMCNNFDTEFL
HDYRVAIRRSRSLLKQTRSIFLNNEIEPILTFLTTLGKRTNALRDCDVLL
LHQKAFTNALPPLLQPALQQLFTLTINEKEKLQKELSHYIKSNTYIKERQ
EWQQKLLPPYSFFDSHTRETLTTKIVAITSLRKAWKKMLRYGRTLHNNST
DTELHTLRIHAKKVRYLLECFQSLFVSAQLAPLLQQLKALQESLGTCADK
AAHLQLLEKKLSHPKLTQESAAALGALLAILYQEHQTARLIAQEAFAVFD
SNEKSALFNTIVTNVKL
>Cag_1428 conserved hypothetical protein
MRAIIVTLPQKIIPYIPMRRITVLLVFILTTIIAGTSLQAATTPFTGSMD
MALTMPNGRGTVTYLFGKGAQRMDMSVQMENIPSLLRTTVLTQANQPDNA
TIINHQTKSYSQVNLTHAAQSALLMDFNSVYRVTRLGRTTLRGYNCEHLR
LQSASETVELWVTGDLGNFSTFQILQAQNPRLATTQLAAAFRNNNIEGFP
VKMVQEVQKQRYSMELLKLTKKAIAASQFRVPAGYKRVDASEPTLNSEQK
QQLKNLMEKMKQVE
>Cag_0802 tRNA (guanine-N1-)-methyltransferase
MRFDVISVIPYFFDSVLTSGLLNIARKKGEVEIYIHNLHDYGLGRYKQVD
DAPYGGGAGMVIRPEPVFACIEALQAERHYDAVIFLTPDGELLEQPLANR
LSRLENLLLLCGHYKAIDERIREQLITMEISVGDVVLSGGEIPALMLMDA
IVRLIPGVLGDSESALTDSFQNGLLDAAYYTRPADFRGMKVPEVLLSGHQ
AHIEQWRTASALERTKIRRPDLLERAWREEF
>Cag_1644 2-vinyl bacteriochlorophyllide hydratase
MPRYNPEQLAKRNASVWTDIQIILAPIQFFFFIGGLTVTILYANDLFPGG
FYWVSLAILFKTLFFALLFITGMYFEKEIFNQWVYSKEFLWEDVGSTVAA
FFHLLYFVLAFAGYPRDILILDAFLAYFTYVLNALQYLVRIILEKLNERK
LRMDGQI
>Cag_1497 chlorobiumquinone synthase BchC related protein
MPLTAQAIVLQKASKLKMVTASFEVPTVNGLLVQTIASTITPGYDRLLIT
NKPVTNKVFKYPIMPGSEAIGQVLEVGSGVSDIAVGDFVFVFAAQGWQGI
EAYAGCHANIIPTTRDGVLPLGRLPIHRDLLTGLLAYAMSGIEKIPLNPS
QRVLVLGLGSVGLMVTEYLHHLGYQHVDGVETFGLRGQLSRAENIALDIA
DFTEAFNNCYDIVVETTGRILMLEKAIRLMKPHAKVLLMGNYEVMACDYR
LIQHKEPHLISSNITTMEHHRKASELLESGLLDTEKFFTGVYPVEQFELA
YRHALDDKPSIKTVLTWL
>Cag_1836 50S ribosomal protein L6
MSRIGKMPIPLSNQAKIEINDSIIRVSGPKGSLEQKLTDQVIITEENSAL
LVRRIDDSKKARAQHGLYRMLINNMVQGVTNGFTTKLEIAGVGFRAEMKG
ELLALTLGFSHLIYFKAPEGIKLETPDQVTVLISGIDKALVGQVAAKIRS
FKKPEPYRGKGIKYSGEVIRRKEGKAAGK
>Cag_1622 Pseudouridine synthase, RluD
MVFIRLRQINHLSIMQDATHTPPNEVVQEVPLEPKKITLHVSRTQSPMRI
DVYLTQQVENATRNKVQEAIEEGRVVVNGKCVKANYRLKSCDVIEVTFLR
PPAPELAPEAIPIAIMYEDDDLMVINKEPGMVVHPAFGNWTGTLANAILH
HLGNSAEALDASEMRPGIVHRLDKDTSGLIIIAKNPIALHKLARQFAERK
VGKIYKALTWGVPKQASGTITTNIGRSHKNRKVMANFPFEGLEGKTAITD
YCVEENLEWFSVMSLRLHTGRTHQIRVHLQHLGLPIVGDVTYGGATPRAL
PFSKSEHFVKNLLELLPRQALHAETLTFHHPTTGEEITLSAPLPEDMLQA
LEKIRGICKLLKE
>Cag_1287 hypothetical protein
MLQTISGCGIASNATSLYMSVKNTALLFELIAPLEHYGSMNVPYETHSIP
IAIKSNAQIVEVFLKNTIPLIEKQQEARRSSAKILRN
>Cag_1190 conserved hypothetical protein
MAMPNLGDMMKQIQQAGEKMQEVQKQLERLVAHGEAGGGMVKATVSGKQK
LLSLAIDPEIMDDYEMVQDLVVAAVNSALDASLKLAQDEIGKVTGGMMNP
TELLKNLNLGQ
>Cag_1610 hypothetical protein
MKTEMHPFERILSVSTEDIELELRGITEINWADFWLNPRKLRGSDFLMRW
SQGVWSEKRLIDAVNNTGEFYAIAYGPSGTAPTDDVRAFELYFERLEAAG
LGNIKRPDLLVFKISDREFVDDFLSKNGGEEELPFITEDKLQALVQKAII
AVECENSLWVAEKMPDYRTPMKAQRRLGGKMGLAKNAVLPTVIIKEEDRI
PLNKWQEENRIPIHVWHVFFDRAYGLSFDEAQRLVTEGLILPTEQVFQAP
GGATTKKAIYKYYYHYAYPLGVASERPQLIPAFIEDKNGHILPYVKFEGG
SLTIADEAINVLNQL
>Cag_0740 Type I secretion system ATPase, PrtD
MKKPEVKSPLREALWAQLPALLKTFYFSIVVNVLVLAPSVYMMEVYDRVV
NSRSHNTLLMLTLLVVGAYLLLEALEWVRRQIMQSAALQLDGKLREEVFS
AIFAARLQNIPSAGAQALRDLKSIREFLPSQALLAMVDTPLALLVLILLF
LMAPLFGWFSVAGAVVQFGIGFFNERRIRKPLQEANRSAMVAQGYADGVI
RNAQVIESMGMLPHIHRRWMERQQEFLVNQATASDHAGTNAALSKLLQSL
LSSLLLGVGCWLTLKGEVFGSAMIVASILGGRVLAPLVQIIGSWRQVEGV
MEAYHRLEAMLRELPMPQKGMPLPAPTGQLSVEGIIAGAPRSPMPILKGV
SFRVMPGGTLAVVGPSASGKTTLARLLVGIWPSTQGKVRLDGQDIYLWDK
EELGRYVGYLPQNVELFEGTIAENIARFGEPELEKVEAACRLVGLDALMV
NWPKGYDTQIGEDGAFLSGGERQRVALARAVYDMPKLVVLDEPNASLDEA
GDAALINTVKKLRENGTTVIVMTHRLNILAAIEYMLVLVDGQVQKFGTVK
EVMEALQNPQQAGGAQQQPQPKPAPQPKPMPSTPRLA
>Cag_1883 Oligopeptide/dipeptide ABC transporter, ATP-binding protein-like
MEKILELKNLRTYYATSNGIAKSVDGVSFSLQRNRTLGVVGESGCGKSVT
ALSIMRLVPMPPGYFAGGEIFWNGTDLLKLSEEKMRAIRGNEIAMIFQEP
MSSLNPVFTCGNQIMEQILIHRPLSQQEAKERSIELLRLVGIANPAARFS
SYPHELSGGMRQRVMIAMALSCNPALLIADEPTTALDVTVQAQILDLMGT
LQADNGMSVMLITHDFGVVAELCEEVLVMYASKVVEQGSVKQLFANPLHP
YTRGLLQSIPRLGHRHERLHVIEGSVPSALNMPIGCRFAGRCSLADAECR
ATEPPMQVAEAGHSVACWKYAFS
>Cag_1661 Acyl carrier protein (ACP)
MTAAEIKDKVFDIIVSRMGVNKDQIKMESKFSDDLGADSLDTVELIMELE
NAFDVQIPDEDAEKIATVQQAVDYIVSKQ
>Cag_1311 hypothetical protein
MSALSLRLPKTLHEQLRELAQEEGISVNQFVMLAVAEKVASLSTIDYLQK
RAERGSREKFLEILNKVPDIEPADFDKL
>Cag_1108 5,10-methylenetetrahydrofolate reductase
MLVKEILSAANNPVFSFEFFPPKKDEEWDTLFKTIAALSPLHPSYVSVTY
GAGGSTRSRTHNLVTRIQQETGLTVVSHLTCISSEESDTEAILRNYQENG
INNVLALRGDKPTGVATLQEAIKDFPHAIDLVRFIKRNFPDFGVGVAGFP
EGHPETPNRMQEIEYLKAKVDAGADYIVTQLFFDNRDYFDFVERCQLAGI
TVPVIPGIMPIATKKGMIRMSELALGSRIPAPLIKQVLAAENDSDVAAIG
IEWATAQVQELINQHIRGIHFYTLNMADATLKIFRNLSA
>Cag_0856 conserved hypothetical protein
MDRITKRLLLAFVLLAAAGGCVHAKETIVLQQGVDTFKEKKPAWFDYLPV
VSGITPVIIFLYLRKRRIKDELEKKEALSIAEQKAKKKYIDIENNKYAER
YQVALAKELDKSNLPASNALDSFSVKLTDTFVSLRLSETWKCDTKFIPDQ
SQDMLKEKHRVRTPEEVMGLVFEKFSLMLVIGDPGSGKTTLLRHYVLTCL
QKDGYNALGFTEPVMVFYLVLRELKKSGSNYASLSDNLYAWSEEHQLGIP
KALFFNWLHNQKTLVLLDGLDEISDVDDRISVCKWIDKTFAGFPKAYVVV
TSRTTGYRKGDGIEIVSEHVRVDIMDFSQSQQAEFLKKWFTAAFIRDQPL
DGVSQYQQKQSEALEKAATIIKYLNKAENKSLQSLAGVPLLLQIMATLWK
DREYLPGSRVKLYDAALDYLLDYRDRQKMINPLLPSEDARRVLAPISLWM
HEELKKDEVYKAELHMRMQYKLQTVKNAPSAEAFCKNLVERAGLLVEYGD
SEYVFRHKSFREYMAGVQLKEDHPDKQIDKLVAHFGNDWWEEPLRFFIAQ
VDENIFDLFMQKLFDSSVSEKLSPKQQDLLATLIKEARQTKIDALQVKLL
DPRTTPRKQRYILDCLKTISIGNQAALEVIRTFIETGITKDVEIVLKAAA
ITRKKDISDTIHVLLDQQGAQYILIKGGVFTYSVTKQQEAVSDFYISKYT
VTNQLYRRFISYLDAKEPEFEQILSLDAYKESLYAMAERIKGFSDYLQAE
TLLAGCICSHYANDKRFNQDEQPVVGVTWYDAKAYCLWLSLLESNSCDAN
LYRLPTEKEWEYAASGKENRTYPWSEADPTITCANYNQNEGVTTSVGCYP
DGATPEGLYDMAGNVWEWVEFLCDRDDGWRSMRGGAWNHLSNTLRCSARL
VTYQPNRVENNTGFRVILSGHAS
>Cag_0447 conserved hypothetical protein
MRYLFVHQNFPGQFKFLAPTLAANKSNKVVALCMKPQAPTIWQGVEVRSY
SANRGTTKGVHPWVSDFETKTIRAEACFMAAQQLKAEGFTPDVIIAHPGW
GESMFLKEVWPHAKLGIYCEFYYHPEGADVGFDPEFPPKSESDRCRLRLK
NLNNIVHFQIADAGLSPTHWQASTFPEPFRSRITVAHDGIDTTLLSSNVA
VRLTLNNSLTLTRKDEVITFVNRNLEPYRGYHVFMRALPELLQQRPNARV
LLVGGDKVSYGAKPEGEESWKEHFIAEVRPRISDADWARVHFLGTIPYNI
FVQLLQLSTVHIYLTYPFVLSWSLLEAMSIGCAIVASNTKPLLEAIHHNE
TGQLVDFFDEKGLVENICELLDNTNERARLGANARRFAQATYDLRTICLP
QQLAWVESLSKK
>Cag_1347 TatD-related deoxyribonuclease
MFIDSHCHLSFPDFDADRNDVLQRLQAAKVSLLIDPGTDVTTSKNSIALA
QEVDCVYANVGLHPHEATQPIGDDVFAQLEALAHQPKVVGLGEIGLDYHY
PDCNASAQQAAFREMLRMAIRLDIPVVIHSRDAWSDTLRLLDEEQHSALR
GIMHCFSGDVAIAKECIQRGFKLSIPGTLTYKKSLLPEVVAQVALDDLLT
ETDAPYLAPVPHRGKRNEPAYVALVTETIARIRSLSVEDAATAIYRNTLS
VFEKINGNGLSVKIADNK
>Cag_1809 hypothetical protein
MPSIAPIIPFLLLFLWLLQGCASDRAPSGGSADTTPLRLLASTPINGTQN
FKGNQLQLYFSHEVSSRALLRALRTFPDIGQFELTVNGKRADIQLLDTLQ
ANQTYTLLLNRHLNDFRGQLLHAPTTLAFSTGNNVNNGTIRGTVVQYNGT
PASNALLLAFASAEKGATVNLLENKPTQIAQCDASGSFAFNHLPHGSYHV
VAINDRNHDLAWAPSSEEYATPSQPLMATNSANQLLRLSPPLKSPKPLKI
PLEASSAPTNSTIATGSLSGMCTVRGNPPSVIIEAISPSATYYTVAVRKK
AGSYTYHFNQLPVGDYTITASIPTASYQPNQAWQWNAGSVAPFVPSDSFT
FYPETVTIREEWLTERINITFPTILQ
>Cag_1808 Glycogen/starch synthases, ADP-glucose type
MSRRNSKVLYVSGEVAPFVRITALADFMASFPQTVEDEGFEARIMMPKYG
IINDRKFRLHDVLRLSDIEVHLKDKVDMLDVKVTALPSSKIQTYFLYNEK
YFKRHAWFPDLSTGNDARVTVEKIVFFNVGVLETLLRLGWKPDIIHCNDW
HTALIPLLLKTMYASHEFFRNVKTLFSIHNAYRQGNFPLKHFQRLLPDEV
CAGLHCVKDEVNLLFTGIDHVELLTTTSPRYAECLRGDTPEAFGVGKRLL
ERELPLHGMVNGLDARQWNPAIDKMIKKRYSTENMNGKVENRKMLLEEMK
LPWRDGMPLVGFIATFDDYQGAKLLADSLEKLIAQDIQLVIIGFGDKKCE
QRFQEFVVAHPEQVSVQAECSDSFLHLAIAGLDLLLMPSKVESCGMLQMS
AMTYGTIPIARATGGIVETIQDHGENLGTGFLFDDYSVEGLSGRLADALR
CFHDDECWPALVERAMSSDFSWHNTAEAYGQLYRNLLGK
>Cag_0609 conserved hypothetical protein
MVVGIAGGYEIIIYWSTEDNAFVVEVPELPGCMADGKTYEEALNNVQVII
EEWVETARVLGRAVPEPKGRLMYA
>Cag_1611 putative ATPase involved in DNA repair
MTTYNRFSRGSEWRIWDLHIHTPGTKKNDQFTGATLDEKWQNYVASINSS
TEEISVIGITDYFSIENYFKFKHLVANGEITKNFDLIIPNLELRVLPVTG
SSTPINLHCLFNPSIDSEIESRFLAKLQFNNSGSNYSATKSELIRFGRDH
LNNQGLDQESAYKKGVEQFVVSLDALGLIFKNDPTLRENTIVVVSNRSTD
GVTGAIKHEDFFINKGESQLDATRRSLYQFSDAIFSSNDKDRLYFLGESV
DDEKIVIRKCGSLKPCYHGCDAHSNDKIFKPDGNQFCWIKADPTFEGLKQ
TLYEPEDRVKIQAIQPDIKNDRYVISELQFIDSGNLFGNQKIELNENLNS
IIGGKSSGKSLLLFSTAKSIDPEQVDKASKRLNFEGYRFETTFDFKVVWK
NGDEDFYSKSDPTQKLHKISYIPQLYINYLVEKNNKEELNQLIKNILLQD
SEFKVFYEDTFREISDTNSEIERLINEYFKVRKTALDVQQKSKDTGNSKT
IQQAIQNTTELIQKGQQASNLSETEFKEYSELIQTKSTLDAEIRTMSLKG
LTLQKVLNEVISNKHNLLGKIDTGVILKGSIDRILDELGDVGADIAKIKE
LIVSDYDTLIYNLQQNIISLDLVNKKMAIEKQLIQNSEKLRPLLEKLAGQ
KEIQKLSKQLEDEKQKFQTALALEQQLKNLFEEYNNIRKQTANLLQQRFT
LYKRVVDKVNSTKNNIGSDIQLNSSLIYRQENFALNNQVNKAALRSDHYF
NGFFVDGALQYPKAIELYEKPLRVNEEKLFSTTDDIIPLKLKITLEDVLR
GLAKDSFEIDYTVTYKGDNLLHMSPGKKGTVLLILFLQISSSEFPILIDQ
PEDNLDNRTIYELLCKMIKEKKKERQILIVSHNANLVVATDTENIIVANQ
KGQSDGSSGGTYRFEYVNGSLEHTKIKDDRILSILQSQGIREHVCDILEG
GDEAFKQREKKYAIKN
>Cag_1265 putative membrane-located cell surface saccharide saccharide acetylase protein
MKLNYRPDIDGLRAIAVLAVLLFHTNTPGFSGGFVGVDIFFVISGFLITS
IILNDIEKEQFSLARFYERRIRRIFPALFPVIAFTLVVGAYLFDAGAFKH
LGQSISATTLFFSNILFLRESGYFDAPSLQKPLLHTWSLAVEEQFYIFFP
LALLLIHRYLKSNYLLWIGITLMLSLAASIWKVHHNPVATFYLIPTRTWE
LLVGSVLAVGVLPNPSSSWLRNMLSVIGLGLIIYSVGFYTEATLFPGHNA
IAPVLGAGLIIYAHKESDTTIINKLLAAPPLVFIGLISYSLYLWHWPFVA
FMKYLMFRPFNIYERLSIIILSFVVATLSWKFIEQPFRNKKIVVSNKKIF
AFSATTILLTATIGQFIYIQRGMAWRYPEANATIQKCIDDPYVQKSKLFS
ENEMVIEDFINNKKRPDVIGKEDVVPSFILWGDSHAQALIPAISNQANRF
KLCGFVTTKGKGCPPTLGIDNCNTPFNEGRFNDGVIAFIKKNPNIHTVIL
AARWPLYNKDVKLKDLYSNKSEMSNSELLINGLQRTINTLLAMNCKIIIL
SDVPELSDNVLRYFWLKNTIYKNSNIFFPTQSTTQYYNRNNGIYQLLTKG
FASKNITVLFPEKLLFDKHGQSIIMHNGKLLYRDKDHLSQYGAYYIAPIF
DEVFKEMAAK
>Cag_1525 type I restriction modification enzyme methylase subunit
MLQNNPELKSKIEQLWNKFWAGGISNPLTAIEQITYLLFMKRLDELDLKK
QADAEWTGEPYISRFAGEWIPPEYRAKLSEQDTVEEQQKKQAEATKFAIA
KQTLRWSEFKHMQAEEMLLHVQSKVFPFLKDMNGAESNFTHHMKNAVFII
PKPSLMVEAVKTVDEIFEIMEKDSQEKGQAFQDIQGDVYEMLLSEIAQAG
KNGQFRTPRHIIKLMTELVQPQLAQRIGDPACGTAGFLLGAYQYIVTQLA
IKTSDHFRGVTNMTDRGAHTFQPDEDGFVRTSVASGLTETAQAILQSSLY
GYDIDSTMVRLGLMNLMMHGIDEPNIDYKDTLSKSYNEEAEYHIVMANPP
FTGSIDKGDINENLTLSTTKTELLFVENIYRLLKRGGTACVIVPQGVLFG
SGTAFKNLRQLLVERCELKAVITMPSGVFKPYAGVSTAILLFTKVYESKE
KVRQPATHQVWFYDMQSDGYSLDDKRTKLEGYGDLQEIVAKFHARNPATN
SDRTQKCFFVPYADLEAEGYDLSISRYKEDVFEEVHYEAPSAILHQLLAA
EVGETANAHLANIHGGIVKELLELREMIG
>Cag_0516 dTDP-glucose 4,6-dehydratase
MHLLITGGAGFIGSHVVRHFLTRYPSYTITNLDKLTYAGNLENLRDVEQL
PNYRFVKGDITDALFIMELFQANHFDGVIHLAAESHVDRSIANPTDFVIT
NVLGTVNLLNAAKASWQGAFESKRFYHISTDEVYGTLGNDGIFTESTPYD
PHSPYSASKASSDHFVRAWHDTYGLPVVISNCSNNYGSFQFPEKLIPLFI
HNIIQQKPLPLYGKGENIRDWLWVVDHAAAIDVIYHKGKLGETYNIGGHN
EWSNLALVRLLCTIMDKKLGRENGSSEKLITFVTDRAGHDLRYAIDSTKL
QRELGWVPSITFEEGLERTVDWYLANGEWLHNVTSGEYQHYYEAMYQGR
>Cag_1822 Glucose inhibited division protein
MYLLPNTTTITKSTMPHPQQPHAILEGLCKEQSLALAPEMVDKLVEYGRL
LEEWNNKINLISRKEDAPIIIKHIFHSLLITQHHTFQTGEKVLDLGTGGG
LPGIPLAIIAPQATFLLVDATGKKITACQEMIATLKLTNVTARHLRVEEL
HGETFHTIVSRQVALLNKLCAYGEPLLHEKGKLICLKGGSLEQEIKQSLE
ASQKHHGFPARVEEHPIDEVSPIFSEKKIVIAYR
>Cag_1848 Ribosomal protein L2
MAIRKLAPVTPGSRFMSYPVFDEITKSKPEKSLLEPLTKSGGRNSAGRKT
SRHRGGGHKRHYRIIDFKRNKDGIVATVAAIEYDPNRSARIALLHYIDGE
KRYILAPKGLKVGDKVESGEKVEIKTGNTMPMKNIPLGTDIHNIEMKAGK
GGQIVRSAGAFAVLAAREGDYVTLKLPSGEIRKVRVECRATIGVIGNADH
ENIDLGKAGRSRWLGIRPQTRGMAMNPVDHPMGGGEGKSKSGGGRKHPKS
PWGQLAKGLKTRNRKKASQKLIVRGRNAK
>Cag_1320 Elongator protein 3/MiaB/NifB
MTENKKVLLIFVQADSGVDGASLLDEKAISRNLFATLKDKALSNIIRRAQ
FAIPPLALMILSSQKVEGISQTICDLRFEKLPLDEHWDMAAISVQTGAMR
PAFELAATLRSRGIKVVLGGPYVTIFPECCREHGDTLVIGEADDIWRDVL
KDLRNNTLQPEYRVTTFPDLSIARPVEKSALDIKRYFTTNVVQTTRGCPY
SCDFCNVHVMNGHRLRHRAIADVVRDVERFLLEDKRIFFFLDDTINADEH
YALKLFQELEQFHIKWVGQATTTLGEKPRLLETFARSGCGALLVGIESLS
DGSNHAHQKFHNPAARQLTSIRAIRQAGICVYGSFIYGLDDDTLELPAQL
EAFIEESGVDVPGINLLRPIPGTGVFNRLAEEGRLLHASNNHDAFRFSWG
QEMLYKPLRIPLNHFIPSYTNLTSRVFTVQNAIKRALQAPQLRSAVLMFN
LLYVHMYGMARKDLMRQLRELK
>Cag_1579 hypothetical protein
MITTRKAMYLNPQFIEKAGKKEFVVLPYEEYQAIEKMMEDYMDLIDVRET
KAETQNQPSVPLDEVITMLKKRMNV
>Cag_1658 glycine cleavage system P protein, subunit 2
MKEPLIFDLSRPGRKGYSLSPCDVPEVPLESIIPASLLRKEAVELPEVAE
NEVVRHFVRLSNLNYHVDKNMYPLGSCTMKYNPKVNDYTCDLSGFSALHP
LQPTSTTQGALQLMYELSNMLAEIAGMAGVSLQPAAGAHGELTGILLIKK
YHEVRGDKRHKLLVVDSAHGTNPASAALAGYETISVKSNGDGRTDLEDLR
SKLDGDVAALMLTNPNTIGLFEKEIVQIAEMVHANGSLLYMDGANMNALL
GITRPGDMGFDVMHYNLHKTFAAPHGGGGPGSGPVGVNEKLLPYLPAPLV
VKEGDTYRLTSGGDDSIGRMMNFYGNFAVLVRAYTYIRMLGAEGLRRVSE
NAIINANYLLSKLLERYELPYPKPVMHEFCLSGDKQKKAHGVKTLDIAKR
LLDYGFHAPTIYFPLIVSEALMIEPTETESKETLDIFADALLAIAREAEE
NPDVVKMAPSTTAVKRLDEATASRQLTICCM
>Cag_1620 Magnesium chelatase ATPase subunit D
MIAFTDIVGMDLAKQALMLLAVDPELGGVVIPSAVGSGKSTLARAFADIL
LPGTPFVELPLNVTEDRLIGGVDLEATLASGQRVVQHGVLSKAHGGVLYV
DSLSLLDSSAVSHIMDAMSRGEVLVEREGLSEVHPAKFMLVGTYDPTDGE
VRMGLLDRIGIIAPFTAQNDYRARKKIVSVVLGERDAEDVQDELNMLVGF
IKAAREQLQHVAISDDKIRALIQTALSLGVEGNRADIFAIRAALASAALA
QRNDVDEEDMKQAIKLVLVPRATRMPDREADAADAPPQEEMPPPPEETDM
EDEEAPADAPDSDGDAEEEKEETPDMLEELMMEAVETELPDNILNISLAS
KKKTTTGSRGEALNFKRGRFVRSQPGEVRSGKVALIPTLISAAPWQEARR
RDSKLAGKSKSALIITKDDVKIKRFRDKSGTLFIFMVDASGSMALNRMRQ
AKGAVASLLQNAYVHRDQVALISFRGKQAQVLLPPSQSVDRAKRELDVLP
TGGGTPLASALFVGWETAKQARVKGISQIMFVLITDGRGNIGLQSSFDPT
AEKPSKEELEKEVEALSASISADGIAAIVIDTQMNYLSRGEAPKLAQRLN
GRYFYLPNAKADQIVEAALSF
>Cag_0165 Protein of unknown function DUF28
MSTSSLITIFYIMSGHSKWATIKRKKAATDQKRGNLFTKLVKEITIAAKM
GGGDPTGNPRLRLAIDTARSNSMPMDNIQRAIKKGTGELDGVSWDEITYE
GYGPAGIALIIETATDNRNRTVADLRHIMSRNNGSLGESGSVAWMFQRKG
SLDVPHSAVSEEQLMELLLDAGLEDLDQDDENYFTVITDIKDLESAKKAL
DEAGIAYENAKIDLIPDNYIELDADDAAKVMKLIDALESNDDVQAVYSNM
ELSESAMNSLSE
>Cag_0616 Parallel beta-helix repeat
MKPRFYIEQLEPRILLSGDILSELVPLLSSREASQMQSDYLLEHPEARRV
APLSAQEAARACMVVVQNEAPPLLTEDGLMYPFEVGVGDERSSEANAEPT
LAADFSADYTFSKSEWDALEDGWRNLSSMVGDTLLDENLVAVESLLSGGS
RLYGGDELAALLQQPIDEYGSVFAQSSKGVLEALTQEWRNGDLVVVGKVL
GGYNATTQDERFELSTKSNDEHLSSANGILDVSWFDDDVTYLFNTTGELT
IGRGEWSFTDGVLCFDEGVLSFSLDDVDSLIGGTGDDTYVFAGDVPVYIA
DSAGDDTLMVLGGEATTWNVEGEGTGTVGGLWFQGMENLIGGINNQDTFV
FGERGSIAGAIDGSVGGYDTLVLAGGTYNSITYEAFSPQSGTITRDGDVI
TYYGLEPIIDNTSTVERVLGLSNASDTNAQLSSGANGTLTLSGSTFESIN
FIKPSSSLTIRGLDGTDTVTISSINLGATALTIEAENIIVSTDQVITSSS
DITFTAFDSKNSTSNSNVTTTLGATIEVDGTISTTGKLTLGAQVRSAITV
SNSSLSSSVSSTSLVRAHVGSHAVITANALSVTADTTTAITVTVTDVIAG
NTTVNSQQQTFAIIDGGATLAISTGAISANEPASVLVQATDNTSITTTVS
SDDDSFVSLTGFDVLVSSITLSRDTKATIGDANGRLNLTGLNGGRAGVVK
IIAEHGGRTLGNVASSFVGVHTNTITKDDVIAGVQNATFSIAALAISANN
SSTTQAISKVSTNNITGQTRAWLHNAIVDAHGLAGVSVIALDTSITYAES
SDFQGNTWQFPSIEVGKAAVSNEISKDITATVLTTTMTVPSGAVVVEAKN
LQDTRAIAKATTVVDSVWNPFNTFSMSLGGTYAWNQILGDVIASVTDSSL
SNAASLTVSALNQSVVDARTEATSQLTGGSGSALAVSVAFNAIGWSLGNF
LFAALDAIIGDDFTVTANSTDTLAFVSQSNISVVGDVIVSAVNNSEINST
VSNAAETKASALFNAVGMAAGGLIASNRMLGSTKAYVDESRNGKKVSAGG
NLLISATDATLIYSNTKIVSSSITTNDGGLTFTKDLAAVIAYDYTTNDGS
KQLQFGDLVFVENDYENGGTPRMLYKYLGQTATLDLGEIDYTNTDDWKPS
EVTNFFPSGINITDSDSVSIGGLVVRNDVRSTVEARISNAVVTAGSVTVI
AEENATIQATADSTSKSSGGSSFGTGTSLAVNGVIATNLVLADASALLSN
SDVTTTGNVTVAATNESNIDATTKAITESGNQAVGVVLAFNTMGWEAQNV
LFAAIDALLGTGIGNEDPCRVEALVTNTLLDVGGDLSVTADMTAQLTADI
SNDTTSAASAFIDASGMAVSALLASNMVSGSSKAAIDYTSTKGSVTVDGA
LTVASSDAAGIDATSSMKAVSSTTNDGGASLLGGMVDKVGALYKYSSHSG
SQSIANGDMVKVASDHEAGGVAGALYKYTGSTQTLNLGSVDYSGSNWERL
TVENLSSALFPNIGNITESDSIGVGGQVVRNDVRSSSIAYVNNATITLVG
LLTVTSEEASQLSADIESIVSSSGGSAFGEGTSLAVNGTIATNLVLSQSK
ATITNSTISGGTESGVEVSATNDAAIDAFIDSVTTTGNQAVGVTLAFNTM
GWKAQNILFQAIDALIGTDIGDEQASKAEAYIKDTTISVGKDVVVNADNA
VQLNATVSNAAESVASALFGATGAAASGVLASNMVSAGAKAYITYSTSGS
VTAGGDVTVTATDNAGVYANVKLVSSSITSNSGGMDVLNKLNNNRATVAL
VDFYSSDGTQPIKFGDYVSLAADYDEEQGEAGTLYEYLGSDATIDLSDTD
YTNADDWKQFPLTNIFPQGLNVASESDSIAIGGLVVRNDVRSGAEASISK
SNVIAGSVTVIAEENATIQATADSTSKSSGGSSFGTGTSLAVNGVIATNL
VLADASALLSNSDVTTAGDVIVAATNESNIDAITKAITESGNQAVGVVLA
FNTMGWEAQNVLFAAIDALLGTGIGNEDPCRVEALVTNTLLDVGGDLSVT
ADMTAQLTADISNDTTSAASAFIDASGMAVSALLASNMVSGSSKAAIDYT
STKGSVTVDGALTVASSDAAGIDATSSMKAVSSTTNDGGASLLGGMVDKV
GALYKYSSHSGSQSIANGDMVKVASDHEAGGVAGALYKYTGSTQTLNLGS
VDYSGSNWERLTVENLSSALFPNIGNITESDSIGVGGQVVRNDVRSSSIA
YVNNATITLVGLLTVTSEEASQLSADIESIVSSSGGSAFGEGTSLAVNGT
IATNLVLSQSKATITNSTISGGTESGVEVSATNDAAIDAFIDSVTTTGNQ
AVGVTLAFNTMGWKAQNILFQAIDALIGTDIGDEQASKAEAYIKDTTISV
GKDVVVNADNAVQLNATVSNAAESVASALFGATGAAASGVLASNMVSAGA
KAYITYSTSGSVTAGGDVTVTATDNAGVYANVKLVSSSITSNSGGMDVLN
KLNNNRATVALVDFYSSDGTQPIKFGDYVSLAADYDEEQGEAGTLYEYLG
SDATIDLSDTDYTNADDWKQFPLTNIFPQGLNVASESDSIAIGGLVVRND
VRSGAEASISKSNVIADSVTVEATETATITATADSTASSSGGSAFGEGTS
LAVNGVIATNLVLADASALLSNSDVTTTSDVTVAAVNESTIDATAKATTS
TGDTGVGVLLAFNTIGWQAQNILFAAIDALLGTSIGDEDSCRVEALVTNT
RLNVGGDLSVIADMTAQLNADISNDSTSAASALINASSLAVSALLASNMV
SSSSKAAIDYTSTKGSVTVDGVLTVASTDATGIDATSSMKAISSTTNDGG
ASLIGGMLNKALSLYEYSSHSGRQSIANGDMVRVASDHEAGGVAGAVYKY
IGSAQTLDLGSVDYSGAGWQRMILANTFPQFGNVTGSDSIGVGGQVVRND
VRSSSDAQINNATITLGGLLTVSSDEVSQISAYIESVASSSGGSMFGEGT
SLAVNGTIATNLVLSESSATVTNSAISGGALSGVEVLATNDAAIDATVSS
STSSGDTAVGVTLAFNTIGWQAQNILFQAIDALIGTDIGDEQPSTVEAYI
KDTTITVGQDITVSADNAVQLNATVSNAANAMASALFGASSAAGSIVLAS
NMVSAGAQSYISFSGTTQGQVTAGGDVSVTVTDDAGIYSNVKLESTASVS
NNGGVSVLNNSLRADEANKRADYLSSDGEQTLQFGDLIALTQDYDNGGIA
GGLYRYMGKDGDSVDLSTANYEDGGYWEQDNQSVGAVGEWLENLDFTNSD
APNVGGLIVRNDVRTGIEAYINASAVQAATVTVSAYENAIIKAINDSQVG
ADGGSSKGKGTSLAANGVIATNLVLSSADAVIKGSTVTTTGDVTVDATND
SFINATNLSATQSGDKAIGVTLAINTLGWEAQNILFQLLDTLLGADALGD
EDPVTTQAFIIDSTVQAGGALSVTATSTASIKATVSNETTSQADDIKGAS
GMAVGAILASNLVSSSANAEIISSTTAQYTITAGDDVTVVASDDAKIRAN
AKLSAVSSTVNDGGMSLIYQSINDIYPIAFTDRSGVQDLIFDEVDTPILV
RLDEADYKSFDQPEQVVAGDRVQLEFDCAGGASGDVFEYIGASPLEGDFA
LDEQNYSNTSLWNKIKGVTGAVYSYIGNNDSDVNLATEDYTDKTRWKPFV
SFSPSSLIPGLNLNITDSSSAGFGGLVVRNDVRSEVDAHIKRANLTAVGD
VLVQAIESAVILAQNDSVVKSSGGSMFGKGSSVAVNAMIVTNLVLSSADA
SITDSSITTTSGGDVSVLAENTSTINVKTVSSVESNGFGIGVNLAFNTVG
WEAQNFLFNTVDALFGTSIGDEVPADVKAWIQNTTINAAGGVHIEAISDA
NITAPIDSAARSISVTPAGGSKTVTVAAIIAMNKISTATTAFIAGASTVT
AALGDVVIHAEDTSLIDAVVHSSALSVAVGVKDGTAISISFAAARNEIRN
NVEAAFQNAGSAAAPVRVSGDVIVTAKKSATINADIKATAIAVAVSGKGG
LAVSGGGAIAFNTILGKNNAFIANSVVDVTGGDVTISTTDVSTIDALVRA
AAVSVAVGAKSSPAIAIGLSVARNLIGWDSESVSNKYTNTSKPSSLVKGD
TVKIVNGPLTGNVYKYVGENITDSTKVKLGAENYSDRSSWEQIGLASSAN
QVQAYIKDSSVTASGKLDLSATSTALIDAKVLTLAVAVAASGQSGVAVSV
GGVYTDNSIKTDVKAFIDGSGAGTSNIRAASLSLYASDGSSIFTTAAAAS
VAASLAGQSGIAVTIGLAIAINEITGSVDAYASHCDTLTTTSGGITATAI
SKGTPLFSIDLPSAGLSVAQLDDMAQHEGDSSDTKNVDEAVVDATADAAL
LTKLADALTAQGEDIADFESVRVDWLYTTTDASQKLKKGARVKLENGYRG
GGIGGVIYEYLGANDTTVDLSKADYSDTAKWKVVKPELKVARIDTVVGEN
SYDYTTSNGSTSISKGKQVKIGDTYKNGGEAGRVYVYTGEDTKTFNLSEL
DYQEESDWTLLDLPVLGTTWQIVTGDGTTYNLKLSQDGTKVEVSRINIRA
IAAAASLGVGVGGSTGVAVSGAGAVAINSVLSETAAHLDNSTIAAAGAVM
VSGTSNSMIDALVIAASAAVGAGGSTGVGASIGIAVARNFIGYTATANDG
AGGVRASITESSVNAAGAILVDAHANQTIDSLVFSGSVAIGAGGSTGLAA
SGSGVWSENRISMDIVASIDGDGSNGIRATALNLVAEDTSQISSFAAAVS
IAAGLGGSSGLSLSIGATFARNSISNSVNAFVSHVDQGVVTTVLGITITA
SEESNIEVIAAAASAAIGVGGSAGIALSGAGVDAKNVIQTGTLAYADSGS
LTSAADLNISATDESTIEALVAAISISVGVGGSAGVGASIGVSLARNTIG
WTSDPNTSYTYTTDSTTSSINKGNRVKVLHGVRENEVYEYIGDKAITGLC
TGDKPTLLNNLDYSNTKNWKRVDLVQVQAPVKAFTNDVSLTVDGDIAITA
KADQAIDAWVLAASAAIAVGGTVGVGLAGAGSSADNTIEVDVEAYMAGAG
TISANDITLLATDSSSIDSFVGAVAISVGVGGVVGVSVSIGISIADNRIS
TITKAYIADATTVTADAITITADEAATIDTFAVAASLAIGGGGVVGVSLA
GAGAIAENTINTITHAYILNSAVTATGDVTISATNDAEIDAYVGSVAGSL
AIGGVVGVAVGIGFSLAENFIGWNESGDAKAQVFAYVDGSSIDTIGDLSV
TAKNQAIIRSGVDAGAVAIGGGLVAGAASGAGTDSVNRIEVDVIARIANT
FGDGIIADNVLVQADDEAIIRARTLAASLAASFGAFGGSVSISVSLADNT
IADSSQALVDNADIMAGGDVTIKSRANADIAAISQAVSMAASVSLGFSLA
GGGAEATATIKTTTLALAERTTFTLDKGDLTINASNTTKADADVVAAAVS
LGLAAAAAAGSVTDIATDPITKASLGDSCVVLSGGDVSISATADNSAHAY
SAGLSVSTGISVGSTNALLAHDGEVSATVGDGTKVTADALRIKSNAESDF
LLTSEACSGALLAGITGSYSGLDDTIEVTSALGTDSEVKVNTFELNAKNN
HVFDSSSDSVTLGLGAGGAAVVENTLSSSSSALVGDNASVEAEAIIINAT
NKADKSKYANQNNVTAASASAASATALSSTTTFGTDTDPFTARVAVGDGA
ELIAIGNYLNPARLDIASFIDINGFDKVSVEGISLVGSMSVARSVLEAAT
LAEVDLGDARIENRSGDLYITTLTYNKINPATTLSVMSGLSGIAGAESKG
DINAINHIVTTGTTILGGTVYLKAGVDKGGDVNRIQASADATMTMASMFP
NIGVPLPDITVDETNLIELKGDSNIKAIGNVYLTAKQGIYGDKSGTETGS
QVSLSMVPYGQSIDRNGDYKNINKVAIATTAKVEAALNNQVFFWVKPMFV
SGKDGLTRQLASEKLGTIVTTEELTALNIETTIPYKYERLDLSAISSLGT
PEATTLATSLQDMFYVIQPVEMPAAKLTVANVQNMLVERQEALKKLMISQ
SNNAEAIARYQVQLDLVEAALTKLGLTETVITVTPNGEVSTTLVKKEYDQ
LFLDMPNVYAAPGSIFIDGTHTDGNTITTLLGTKNLVARAAALIDIFNGT
PINLRVQDAIIRDTTRVLPVNGELTTFLPGNVFLNNQNLTKKEAKVENTI
KIVQSSLPADLLGLGAFKQLSSGQQDLYLIGSVVNQDGNITIRNNEGSIN
VSGELRGKSIDIIAAGDFNLNVSGWQHIQDPRQYVDYMYIAKEDDLYNDE
GTPTSKEYTSLDGTELDDAVNHPDDTKSRIIALGMVNITAQYLNINGLIQ
SGVSELELNITKEFKPEDTVEFMDQSKTTAQTLDDIRVQDSLSITGVNFG
KLDIPVDGYYDGAKNRIVVEKLDLSPGQITLAGHLLSTGGGELRVSDGSP
KVTIKNESGIDLELQGISMDAESVGKITLFDTPMLTRTEYFVRNGKVEMV
FSRGTKEVDEKTGAASINYTQEGATVPYELTDEIIYQPRENQFYVWTEGQ
EKTQVEIRKYEENVFDFGFFEWDAIMADESYDFRSVEYLDEKPLLESEDI
FFPTDAMMPKGVTDTTYFFEAYFQATDEEIDVTNGVTQVLDISNYKIYRY
KDSASSESQNSVSMKLPEQTLTDTTKWEYVKTLVTKYDKNATESLTLTKD
ETVLYVPGNVVLSSGENAEGAEIIYPWSDQDGKTPDKFYLYIGGSTSKLP
SAIDLSDEDLWEELERKDSDHPAKWQDTKGDQFFSNFVNYTYSVRTWTTG
GGFLTKKTYHFETTTSTGLKDYYTYALRADYPIAISVEDSADAGIFVNTN
SNLILSGNLKVPDGSLHVINESGQVVVIPEDEIGVEGDPEGSYRIPNGEI
VLEATGSVTCGEGAAIFGDSPEVVAGGSVYLLLEGDKGELNVEAGGDITV
TVFSLDNKSSRLNAGDVVSDYGDVVIRAANGILGAAPFIQGNRIELDANK
GEILALGEESDFVTINSDILGTGGVAARATTGIALTEDVGDLKLIAPTAW
AGISVVATEGSVQLDAPHGSIRDAVYENSKFAATVVTAENGEEETEAERA
ENDRRRYALSQGLMGELFPHARLFEGGVTSSFETLNVSGINVTLTAGGAG
MVGNVSNVLTINRPDQFDAVTLEQKQALANAGASDIIGVRYTMYQYVGAD
AELDLTHIELLPNVTLVQDLSSGIIYRYVGAADKRASLEEITYATDSDWA
VTDINLKDFTANPTLHQYLSSDLNTTRFSDANVWREVTIDYSTNTDRKQS
QVVSLTTNSFVEVRYSKSEYGIYQYKGASGSVNLSSESFDNADRWVKREA
DHATDDGTIALKKGEWVENRFIAERLTLQVWDDVDIQATGLFKATGEHIA
VQATGNLTIDKVLAGGDVRIIGDGNIVLDKASEVLASDANSDVYLDAGLA
LHLLQGSGVTAGAEFEDVDGKPVPHVTGAHSTIYLISGKEMFIAGSVTAS
SEMTLKMGTSTMPYASYFDTIPGKELFTVAPTPALLDALEALTLPSAIST
AFTTNKITLSGTPTLRVIEEGKRWELTDGNGTQYIVYASDPQDLGVVEEI
QVLTPHYLHGQRGFGFLLTGTLTTLDGDADVSVSGKDDLIVRGNINLAGI
GSDLSLQSDKWSYWEGFAEVGGNIEILGGIERSGILLIDREGANDDDVSV
YVHTTSSLVTPYAGTAIAIRGSKDVVLDGAVVAGGTIGEHGVTFAEDGDA
TALITAGATLWMNNAVAASGDITINAGKVFLDTVSGLTSAGIVSGGIGST
ITVNATGDIQVLGHILSGGNTEQTFDDDGNLVSEVITWINDLSTVVLNAE
GQLYLGGMTETKSGGEIETGGYVRAAKSIALIGGTPTVGSIGVLVPGAAE
VLVHNTDGVISISAEGDAEVLGLLVAGGEVVEQRDSRGDYIGRVISTFNG
NSEIFIEAESQVRLGTDIRAGKTIDVRGGKKSADATSQYAADGLVLYGTA
QLATWAEQSNINLSGSGSVRILPAGWMREVEAEGFAEFADGTLSADVTLH
IVCDGVDKNVVITKQATLDNDSLTGLRLDVQAAVDAAFGAGKMAVYYRDG
RFLFTSAYEFSIKTDSVRAAALGFTQLTAGDAVAISHIALDAAAEGSTIN
IGSATESNAAIYLGAKVRAHSGVLFNGSSDVGGALELAITSEIETLSGTI
TINPGVNGVFAGSIIARGDNGGIVVNGGDSFTLKGSLAAETNILINAGKV
ITENTSTLKALGSNATIRIAGTNGVEINSTVGETESSPDLAKLFIESLEG
DVILKKESGKIITSAELQISGQNVQLLGWVTSNRNTTATYDNEVTLRVDN
TLTLDGTFELAGSLLLAGKDVTLSNTSITISGAGQHLRIEADNDVVIGNI
APQAGDTSSTAVDVSATAEVSIIAGGDVVVGFNAAIFSTADNSKVALSGE
NVMVAGTVRGGGYRNSANANTIEWSGKNALVTIDAAKLLYVGGTGLDEDG
AVVAQWGGNLESSGRMLISAGKNSSNVGVGIGTSSKVSVDATAGGKFTAV
TLGYLQLVSSGDVQIASLMESVDNGSTVSIKSDGLVLVDGLVRAEQSLTV
AGGSGTGGVGILVQPTVFDGATRLSGGTLDTADGGSINMMASGSILLQGT
IGQLVGATPAPTVASITAESTTGALSVGGSGRVDASATIALKGKSITVLA
GGRVVGWGNTSEVLLQSQTTNFIAAGGGVTAHALTHLVGSTVRIDGAVIA
SDESGKVLLNAATEATITGVVQSAGDIAINSGVNLRDWSLATLLAPITDA
NLAEGDIKVLASGSLAATGAITLQAGRDLVLLADATVAGTTTVNDPYVYT
APYEIKVVTGNIDVAVNTIQVPEEVWTPTTITVQTGTDQVKVGVEYHTME
VTLSQKGYYNPRAAEADRFREYFVEGVDYKNNAIDWSGLKTKLGITATVE
EPSADYTASTYKRFEQLDSNQRQIVLASLGYMPLFAFSYTNAFVHKTENG
NPSEAAWTPTWSGQPSVIYRVDVDGWRDRYIEMPKGAQDDVLRVVSQGVT
NYLTGDTTLDGNENGSGWASGSGTGELVAQYKEEANVAYDQKYSDFSLTD
NTPETGNRYTIVDEDTIPPDVDTDTYEYVDTDEKNDRWTVSYADGGLFTI
QEFGRGDVTLEQMPGWKVAQSGEIDAIDPKTDGPDGTDRNIYAPGTFVTA
LTSNTETLDLTNTTTIADQTIGTLTYTKMVDKWTSDKEEWLKTFDPEKYI
ADLNEVAAQLGFSITLTMANLNASLVKGWLDSHAVYFYEDGNYDGWQQGY
FPGSYTNTQNYSIDDDETGGDLDNDDFMSVKVPAGLRVTAYENNSYTGEK
VVITTNQPTLGALTLGDDNTWDNEITSIIVEWQDEDGNWSEDPVAYWATG
NANYVEFFLEDYYNSNDTNNSQTFWPGRYLNDSDSAWATTDLNEDDFTSV
TVPANMKVTAYWDDDFKGESIVYTSSSSNIGDHWNDEISSVKVEANVPVA
TTVTLDIKEDQYDYDYHWTSEWHEIRDERIQQRYNWVSQDHDIWGYRPVY
QTQDTWLKSITYHDITLWETQEVTGTQTVLASALGLTPSSVPYASFDDIS
ISASTITISVGRDSTLSGNISSTGTAQTDTLDVSAGRNLTVKGIVPTGSE
AGTLASVATLSSQTTVALHADTTLTLASSSLLQANQAGGSITLTSGKDSL
LSGQLSATQGSISSEAGGNIALQGKTTALNTITVSAGDVDQNPSQSGGIT
GTIRASLTVTAATGDISLSAGHYGGNITLTNSTVSAPDTLSFTADAGSVT
ATGNVTSDTLYGRAERGFDVSSSSTTTVNNVTTTKPVANATVNRLEVDVK
GVGNIIVNNSKAIELTDVQANDGSITILAYGSITVGDVQTLAISDRNDIT
LSALALGNAVSNLTVHTLKAGGSGDVILALNNGDFIQAGGVVVADELSLD
VRGAVSITTDVKSISLTTKEAGNVTITQSTAKALKLDTVRILDGSLMVSA
GGDVTLNDVRLLSNQDANDVTVMAADSIVLGYVSAGIYAATAGEVPLPDT
IKYPNAQSGISSHGDITLTATNGAIRSEVGAGALQLVADEVTLDAKTGIV
LDVAVNKLKDITTNAGDITINENDGFGERTIGLQVLKVAANTTDGDVRIT
AENALYVGTTSATTAAITGSTIRLTSTNDVLEVVAPTSGSTLNYTKGVAF
IAARDLKLYTFFNAPDFIEYRAGSYFNFTLPNAITAKSIILETGGVIDYE
GTLTAYDHLELISAADVFVSGNIYGTPDELIAVAKGRESYARTIQKLNAS
GNYVTVSSSDTGYVNFQVGSMDAAAYEIRALHDIFVSTTNNLTLNGFVGG
LSGFEKAVNVTIDTGSATLTVPSGIVSASGELTLRAGNINSSNGSIFIAN
HIDAQASGAISLNTLTATLDAASSGSGAIRINEADGLTVGEVIANSGDIE
IAAGDTFYVGLVRTIVDSDNDITLRSAGTLYVDNVEAGFAAGAQKTSSDV
TLDGKNGILERSSEEAGIDAFGFTVTLKGGGSYNVIKVPSDLPGSGIEVS
YISGTGGVTTGATTMSGGGIPSSVSGDYVLIAPTHSGSTNISVSGSLSVV
ELPTSAGNNVTLSAGNDLAVVVPLNVGIGTVTLSAGGGLSLGDKVTADTL
NVTTGGNLNLTSDVDHLIATINNGGDLVIQQTGDIVINSLSMSGGDVTIV
ATTGSVTINSFSGTGGKVTIIADKGIVLNNTAHLDSLTLNAKNGGVSGNI
NTSTLNLNATGTINLVNQGAVVVEQLATTTGAISLQSSGTMTVNSTVVAG
GANTIDLKTTAGDLNLNRALVSATGAITLQAAGDLAMSALADVTSTSGTV
TLNAGGVLTMTGDTWVKAGSGTIALQGGSTVTVGKLVTSSLTNLTITSTG
GAVHDGGDSPTDIYAPDATLVISAYNGIGGGVYGALETEVAAIEFTNSGS
GTTGGIFIEEVDDLAINTIVQQASGDISITTLNGAITTTAPTSGSGITAT
TGNITLYAGGTNGVIHHYADITATTAVGKTVALTAESGDIMLGAGIRVTG
AGDITLKALHGAIVNDHTTVGWRTVSSAFDAQIDWAMRLGKFVVTQATGE
IRTANIESYNVANHIANNQELRAENGNYLQTTGGRLTLKAQDEIGERYGS
FLFSPLALVVDAVELSMSSSERANVSVIATGSVKVVSDSGAGSLGGSTGV
SNLTGGQNISDAVDASGEDITIVANDVTFTDTLRSAGATITIKTLDPTRA
IQLGTLSGDVENTLYLSTEEIEKLQAGFDTIVFGSTEGSSVINIGDPDDD
DDIVEFHDDLLLMNPAQGGEIYFYGDLKAKSLTIKGSGHTTHIIDSALTS
DDFIDFNDSVIVEGTGSVTADTYIQIGATSSHSLNGDSDDTNNPDKLTLN
AGSYITVNGPVGNTDSLDGLYIIGGAGLDNIIGTDDDLAGATDVTFWGQV
IMDGDLIIKATGKVTFKDSVILNSGNLRITGATQIIFDDSVTVNGTGSSI
LLEGDEIDLPSGTSSVKGNGTLTIRPSHKTVNVEIADPAYPTDCLNLTLQ
EMKAIANTFSEVIIGWKDSGTSHTVNDVAGTVRIGADNGELGNPTFWNKT
SIYGSAITVEDYGVPTYTLLAYKDLTFDAISNITLKNQVKIYDGTTLHDL
YLYSENGKISEVDSGSDLKEPVYAKNLIATAQTGIDMRWIDVDTITATNE
GSGAIQLNVIAAGGDVTVLKMAQLSSNTSDTTSIALTTENGSITVSNTNI
TASNGVTPLNALGVWTAGNGSITIDANDTGTDTTLTVDKFVSGTTGLITL
EADGLVTLSALVSNSGAGGITIKSNNNNITQNANVTTVGGLITYNAGVTD
GSGSISMTSGTLTDAGAAGGVTYRAANDIALSIINAGGTVSLTADAGAIT
DNLDGETVNDLNIKGDTTALSLSATKGIGTSSEDIDTKVASVTATNTGVV
DSGIYIQERNALTVASGNISASGTGGNIVLDLLDGALTVNGTIGSTGTVG
NILLQTAESSGDVTDSSVTITKNITSSNGNISVLSSDGIVIDDSDIGTPR
LQVSAAGKTIDVQAADVVTMEAGAQMLTNAGNLRVKGGGTVTVGILDARI
VSDRSGSGSLTNQASWGSVSVVSAGGSILDNASDIAVNVFAKELRLTANG
AIGALGVGSSNALESEVATVTASAGGVINLLEATSLTLGSVTAVPVNRVT
TSGSAGSNDETDGQAQAGVVTTAGSAGTIVVVSGGAMTVSNVVTSDGSGN
VRLEATGVTGTLNVNAAVDGGSGNITIVSTGNQSYDANGDVSTTGGTIDV
QATGVGSTIGMNAGTEFKTNGGNIRVMSGTVDDSGVTLNAGGKITVGLLD
ARTTGDRGASVTNNQTSWGSVSVVSTGNSVEDNSADTAVNVYAKELRLTA
SVAIGALGATTSNALETEAATVTASAGTGGINLLESSALTIGSVTAVAAN
RVATTGVAGNGDQTDVAAQAGVVTTAASNGAIVVVAGGAMTVSNVVTANG
TGNVRLETTHATTGSMAINAALSSGTGHITVVAKTDITQSAAGDITTIGV
GTIDVEAGGFIQMATGAAGADTAVSGAKDIRYEAKGGNLTLGSFSTGTDA
ATGGTVVLIASGSIVDGDIGVDVTANKLYMQAGSTGAIAGGGDHLEIAVN
TLSLSAGSGGAFVKESNGLTVGAVALSTLQRVANTSVASTQSGSWEDLNT
TDGGALVLTVQAGGLTVNGGSTTATTGITASGSGNVLVDVTGTLNVNAKL
DGGSGNISIHSTGTQTYGADGDVATTNGTIDVQATAANSTIGMNGGTVFQ
TNGGNIRVMADSTLTVGVLDARVTADRPAAIDNQASWGSVSIISTSGSIY
DNDGDSSVNVYAKELKLTAAAADQAVGKGDQHFETEVAKVSANVAAGGLF
LTESTNIQIGTLSAINVQQVGVDGTTLTPTADSAQSTLTSAGDLVLVTTA
GSIETLAVGGAVSATGNMLLQAGGSGDITLRATVTNNDSGGNTTIDAADE
LFQNANIVANTSGTTIDLLAGRAITMGDSASTSTTNGDILLYAGTGDITI
ETLSSGTADIGITAKLGSIIDRDGTSSEDSENDITASSLILQAGNAIGGG
ANHIETTVTTLAANAGDGGLFITESNGVTVGSVTVTVNRVDDKADQSTPT
TDPSTQVIKEDLIATGTVTNTGNIVLVSTTGALTLGAGTSGSNALTAGSG
NILLDAQDGDLNINAKVESVGVLVDNVLVGSNISLHASGTVTQGASGAVV
TIGTGTVDVQAATITMTHGATTTTGSGNIRYVATSSLQLGALSTSGDVSL
SASKITDAFSSGSDTTNVTADELRLVATSATNSYGIGEYNDHLEISVNTL
AADSKGTGSYGGIFLTETDAIQIDLLNAINVNQVLVTGVLGAQTADTAQS
DLVSSSNLVLVAGGTITVNEGDNDTKAVEAVGNILLKTTGTASDIVLNAS
VISLNAPSGGNISLDAGQDIKQNAGGNITTQASPKTIDLVAGRHITMVGD
TSTTSNNGNILLYATSGNIELETLTAGTGSVSVTAAATPTEATPNVGKII
DIDGTSSEDSENDITASGVLLNAGNAIGDGSNHLEITVTTLTANTGADGL
FISAQEKVADGDIKVDTLTVNVNRVSTSASHATTLYAAQADLTSTGAGNI
VLRSTDGSIILNDGDINGVEADGTNGFAVKNTGGGNILLQTTTAADDITV
NADVVSSTGSISLLSGNDVTFIADADIRTEGTNSVGTIDVVAANGGDVVM
AANSTFASTNGAIRVVAADAIQLGIITTAATGTTGDGSGMVSLTATTGSI
VDAQNLGNLSNDTTVNVTASGLRLSAGSGVGETINHLETNVATLTARAAG
GGIFLYQYGSSDVTVGDVAVVVNRVKNDGDVTDSIQSDDAQSDVVTTAGD
GDIVLRTKNSKLTLNDGTAAVDLNISGVAVQAHGSGNILLETEQASTNVE
ATADVVSTSGNITVKAGGSIAFTDADIRTGTAGTIDVEAGVANTHNITMS
ASSLFTTNSGSGDIRLKAGNNIVVGDIATAADVSLIAAAGSITDADVVTT
TNDVNLNITSAGLRLWAGNGIGETVDHLETSVDTVSARATSGGIYLLETN
GITVGDVAVTVNRVKNDGDVTDSNQSDDLQSDVVTTAGGGSIVVVASAGN
IVLNDGTATVGLLNISGVAVQADTTGNIRLEAAAGSITVNANADVVSGTG
SISVISNVNVDFNSDGADIRTSTGGTIDVEATTGSITQSVTSLFTAGTTV
NDTGDIRLLAAQHVVVGDIATAGDVSITATAGSITDADVVTTTNDDNLNI
TSSGLRLWAGVGIGDAIDHLETSVDTVSARATSGGIYLKETSELNVGDVT
VTVNRVGIDATTTTANSSDPTQSDVAITSGNGNVVITTGGNLTLSEGSGD
TNDNPNPPLNYSGKALNAIGGGNVRLNVTGILTLESALDAGSGNVTLLAS
GLIKQEAAGDIFTTAGTIDVESTADAITMVDGAIAQTNGGNIRYQASGNV
TVGLLDARLAVDRGGSLTKQSDATTPWGSVSIISGASILDNSENTVDVYA
KELRLTATGAIGALGDGTSNALDTEVATVTASAGVGGINLLESTALTIGT
VTAVPVNRVATTGVAGNGNQTDALIQAGIVTTTGSDGSIVVVSGGAMTMS
NVVTSDGSGNVRLDVTGTLTHESAVDAGTGNVTLYATGLIEQKAAGDIFT
TAGTIDVESTAGAITMLDGAVAQTNGGNIRYKASGTVMAGVLDARTSADR
GGATIDDDKRDDQIKTTGGWGSVSITSTVGSILDNSEATADVYANELKLT
ATPAGAGAVGLYNQHLETEVAKVSANVGSAGMFITEATNLIVGQTALLSV
NRVEPNATTSITNSSDATQNNFVSKGALVLVTTAGSIETLAIGGAITATG
NMLVQAGGNTSDIKLGAVVTNTTLGGHISLNAGQDIRQNANIVANTTTKS
IDLVAGHDIKMADGTSTTSANGNILLYAGAGNITIETITAGNITNGYGNV
SVTAAATSGSSVGKILDEDAAGDNGDNPDITANNLILKAGYGIGLGSNHL
EATVTKLTANAGDGGFFVTAKERVTTGTSTDRGVTVESMTVNVNRVDAEA
NVPNTATGTATVTQEDLSVTSGGHLVLDVTSGALVLNAGSDAAYAVTAGS
GNTRISTQSGALTLNAKLDGGSGNISIISSGTFTQAETTGDIVTTGGTID
VSANAIDMKAGADTSAFGNIRYASASTITVGTMSSTGANVSLVAAGNITD
NESGTDIDISANNLRVEITGGTGGFGSGTNHLETTVTRLSGSVGSNGFFM
TETNDIELDSVAQISVNRVALTGVAGTGDVVDAAKSDLVSGGALVLQTFN
GSITTAVDNGDIQAAGYILLNASESATATVAGITLGGTVTTTSASNGSIS
LTAKDFVYQLATGAITAGGTGTIDVEVSTGTTSGVITMDDGAATASTSGN
IRYVATTTLSLGTIATLGNVSLQATSITDSADDDAVQLSSLPDIDVTASS
LRVQTTVNGFGEATKHIETTIGTLAATLGTIGNLFVTETNDITIDTVDTI
EVNRVTDAGSITNSIQTDNALSDIATGSGHVVIDATDITVKGGGDTTGIT
TTGAGNILLNARSGNITAQAIINGGTGNISLNAVGTNLNGNVVLWNTTSA
TNGTAFEGVLQTNNATIDVKAGDAIDMKNGSTILSKGGDIRFEAVNNINV
SYVDATTTTLAGDVALLSTSGSILDVDNNTTLDVYAAGLLMQAATGIGTS
TNHLDTTVTTLTASAGSGGMFISETDGVDVDTVTVVVNRVNDQAGTAVES
KTLSDLLTISNGNMVLVAGGTITLKEGDADNTGVSAAGNMLLKAKVDDID
IKSKVTSTDGNISLDAARDILQNANVEAQEITKSIDFVAGRDITMDNGTS
TTSANGNILLYAGTGNITIETITAGNSTNGYGNVSITAAAIPSGGNSDVG
KILDRDGTAAEDSEYDITANNLILKAGYAVGDGNNHIEETVTTLTANAGI
GGLYVTAKELVSGGNVTVDKLTVDVNRVGTDASVPTTATGTATVTQEDLI
ATGAGHIVLDVTSGDVVLNAGTSGTNAVTAVSGNIRLIAAAGALTLNAKL
DAGSGNVTLLASGLIEQKAAGDIFTTAGTIDVESTAGAITMNADAVTQTN
GGNIRYKANGTITVGLLDARVSGDRGGSLTTQSDATTPWGSVSIISGSSI
LDNSEATVDVYAKELKLTAIPAGTGAVGESTNHLETEVAKVSANVGSGGV
FITESTDIQIGRTAAVTVKRVDTDGTTPQALDQTDGVQDNLQSAGALVLV
TTAGNLETLATGGAVTATGNIFLQAKAKQTTTYDITIGAAVTSSNGSISL
DASNDIKQNSTITVSGGSGTVDLLAGHDIVMQQTTSSISTSASNGNILLT
ATSGSITIETINAGSGNVALYAANATNGFIYDGDDAGDSEVDITANGLIL
KAGNAIGSGTNHLETTVTTLTANAGVGGLYITAQEKVADSGITVDILTVN
VNRVDDKDATASTNNSAQVDLTSTNAGNIVLRSKDGSIILKDGDSNGFAV
KNTGSGNVLLQTTNSGSITAYADVVSTSGNISVLAAQSVTFTANSDIRTS
STSTTTGTIDVVAGSGGSITMSDSSLFTTSGTNGDIRLLASQNVIVGDIE
TTIADVSITATAGSITDSDALVGSANDNDLDITASGLRLNAGIGIGEVVD
HLETTVGTVSARATSGGIYLLESNGVIVGDVAVTTNRVGVTGATTTDNSS
DATQSDLRTTANNGNIVLVAGGTLILDDGTAADDDTAISANGSGNILLKT
TSGLLDINAAVKSGKGNITIWNTTGAIEQDAVTISTDGGTIDIEAMDATN
GSITMVAGSTIVSDGALATDGNIRIKSGADMSITGINAGSAHVNLLAGSF
IKDIGETTTDVVANHLRIEAGSWVGEASGTNFGLLDISVTRLSVRAGNSM
YLNELSDITIDTTDAITVKRVLADGSVLNSVETDAKQSDLVTTANDGNIV
LVAGGSITFNDGTANTNGAGVEGIAVSANGSGNILLKTTSGTLAINSAVK
SGEGNISIINTTGAITQGAVTISTDGGTIDIEATAGAITLVSGSRIISDG
VSTTDGNIRIKSGADMSITGINAGVANVSLFAGSFIKDIGEAIVDVLANH
LCIEAGTWVGEADGTNLGLLDLSVAVVSAKAATALFLKEANGVTVGTTSQ
IKVKRVGSNGLTTEDDTNGAAQSDLQTTNNGNIVLVATAGDITLQAGATQ
DDNFAVSANGIGNILVQTEAGSVIAQDHADIKSGSGSISVIGKTNVSFNS
DGADIRTSGGGTIDVLAETGKIEQSATSLFTTGTGNIRLLADDSIVVGDI
TTAGSVSLVATTGSITDADSADETIVDDDIQAVGLRLWAKSGIGTNSNHL
DTSVDNLSAYVDAGSLYVLESNGVTVQSVGVSVNRVVAAGTASVVAATTD
SAQSDLRTNSNSGNIVLRATAGDIELSDGIANSGTAGIAGTTVQANGNGN
ILIDAISGSLAVKSDLSSTTGHITLHANDSISLTSDVDVTTATSGTISLQ
AKHGEISMVSDATVMASNSSVRLAAHQDILLGDVAAQNVSLISAMGSIHS
AASNIQNIAATNLRIEAQQAIGKSDLHIKTAVDTLTAKANGTVTSGTAET
GIYLTEANSVTVDTVSVSVTEFSAIATTSIVKDSSQSDLVTGNNGNIVLV
ADGKITLHDGTDIASPFEDNTDGKAVKADGSGSILIDANSSNLLIYSDIE
SGTGHITVKAAIGVEIGSSSATDVDISTATMGTISVDAEGGELKMAGDAE
IKATSSSVRLNAATDVTVGNVVATNVSVVADNGSIINASGSSKNVTATNL
RLEAKQAIGAPTNHLTTDVTTLTLFAAGTVASGTPLSGSYISEVRDVKID
TVTVTVTEFTHVALTNDVIDAAQSDMVAGNNGNLVLIAGGTITVNDGSDN
DSLGVEAGGNIRLEATESNEAIESNIKLSSGIVSSGGNITLLAKDNIAMD
AAGDITTSSVGKTIDLQTDDAISMVNGAIIESNRGNVRLTALNDDITVGE
IKAGTANVAVDAQVGNIFAVDSSNKNILANDLILLAGNAIGENDNYLDVS
VTNIATQSGSGATYVENDGVNVNLGGLSVLVQRVMATGSTEDSSTSTQND
FKAGDDIYLVATSGNIVITANNENALTQAKNIVLIAEQGDITINCGGADQ
GFFASESIKLIAEAGKITINSTDANSAGLVATKNILIDAKETVEDTDATL
VVNAKITSKEGYISLLADDSITMTAFGDVTTEKSGNTIDIEANDSIAMSD
GALVATSNGTVRYQAFVGNITIGEINAGSGNVALLAGGSILDISNDTSSI
DITANELLLRAGAAIGTDGITVNHLETSVDSLSVKSTTGSAYVIENNSVE
VGVVTVTVSRVQEDDTVQALSADTLSGGESAGNLVIVTTAGTIETLAGGG
NITAAGNMLLDAKTNLMLGAAVSSTGGNVSMVVSGNFAQSAVGDVSAAGA
GTIDVRVSGTMTMTDGAEIKSDNGNIRLSVTGSLLLGALSTSGDVSISAS
TITDAGADASDTVNISADEVYLATTSTAVGAGIGSGSNHLELNASKLAAD
VNGTGTGGLYITENNGLQVGTLNAINVKNVASDGTSTASTSDAAQSNMSS
AGNLMIVTTAGNIETLATGGAINATGNILLDANGNGSDVVIGAAVSSTGG
NISMVSGGNFEQSAAGDISAAGAGTIDVRVSGMMTMNDGAEITSGSGNIC
LAVTNGLQLGALSTSGDVSISASTITDAGTGVSDTVNIGADEVYLSSTSS
ANGAGIGSGSNHLELNATKLTASVSGQGGMYITESNGLQVGTLSAINVKN
VAADGSSTASTTDAAKSSISSAGNLVIVTTAGNIETLASGGTITAAGNIL
LDANGNGSDVVIGAAVSSTGGNVSLVSGGNFEQSAVGDISAAGAGTIDVR
ISGAMMMTDGAEITSGSGNIRLAVANALQLGAFITSGDVSISASTITDAG
TDASDTVNISADEVYLATTSTAVGAGIGSGSNHLELNASKLAADVNGTGI
GGLFITESNGLQVGQLTAINVAQVANDGLSTVSTADAAQSNISNAGNLVI
VTNAGNIETLAVGGTLIAAGNMLLDAKTNLMLGAAVSSTGGNVSMVSGGN
FEQSAAGDVSAAGAGTIDVRVAGSMTMNDGVEITSGSGNIRLAVTSSLQL
GVLSTSGDVSISASTITDAGAGASDTVNISADEVYLATTSTAVGAGVGSG
SNHLELNANKLAASVSGQGGLFITESNGLQVGALTAINVNKVANDGSSTA
STADTAQSNMSSAGNLVIVTNAGNIETLAVGGTLTATGNMLLDAKTNLTL
GAAVSSTGGNVSMVSGGNFEQSAAGDVSAAGAGTVDVRVSGMMTMADGAE
ITSGSGNIRLAVTSSLQLGALSTSADVSISASTITDAGSSTSDTVNIIAD
EVYLSSTSSANGAGVGTGSNHLELNATKLAASVSGQGGMYIIESNGLQVG
TLTAINVNKVANDGSSTASTSDTAQSNISSDGNLVIVTSAGTIETLAVGG
TLTAVGNILLDANGNLTLGAAVSFTSGNVSLVSGGNFEQSAAGDVSAAGA
GTIDVRVSGMMTMTDGAEITSGSGNIRLAVTSSLQLGALSTSADVSISAS
TITDAGSSTSDTVNIIADEVYLSSTSSANGAGVGTGSNHLELNATKLAAS
VSGQGGLYITESDGLQVGALTAINVKKVASDGSSTASTSDTAQSNISSAG
NLVIVITVGNIETLAVGGTLTAAGNMLLDAKANLMLGAAVSSTAGNVSMV
VSGSMTQSAVGDISAAGAGTIDVRVSGTMTMNDGAEITSGSGNIRLTVTS
SLQLGALSTGGDVSISASTITDAGSGASDTVNISADEVYLSSTRSSNGAG
IGTGSNNLELNANKLAADVNGTGTGGLFITESDGLQVGTLNAINVKNVAN
DGSSTVSTADAAQSSVSSAGNLVIVTNAGSIETLATGGAINAAGNILLDA
NGSGSDVVIGADIKTPTGHITIKADDSIELASDVDITTAAAGTISVDAEG
GTLRMAGNSNISAVGSSMRLAATGTVTVGNTTAEFVSIVSRRGAIINAAG
STRNVTASDLRLQSYGSIGSANRHFTTQVVNLSIDPEEDGAGIYLSELDD
VVVTTVRVDVTEMTSFADTLGISDQSMADLVTSSNGTIVLVTIDGSITLT
DGDHNGVSISADGTGNVHLEANGADNNVIIEAAIQTDTGSITIVAAGDVE
QQANIVTNGNLVSVQAEQGSITMDQNVQTITNNGTIEYRSYEDVLLSLLH
AESGSVAVYAETGSIENNTTSNTTPNVTSETALFKAGADVGLREIQPVVI
SVERVAAEAVTGEMSLVNLGTVVIDVLEDADGNTVSGLSAGDGISLESLQ
GSIVVAAPVDTKGTADALLTFTNGQLIGKSAYFDDAGTFLKMQYKQFQFL
WNGEGATIRQELLNMVVGRQVDSDIARYRESASERQTVSPARSTMPMRSY
DPMESLRNVDVDVLEEQPGYVEVHNGYAFFRWAEVPGAQSYLLVLERDKL
EYASRWLEETAWAPFEELPEGIFEWSLYSWTTDGLQLVFGPMQFNV
>Cag_1180 hypothetical protein
MRIINIHSPDIKQYQEFYTMPIIARFYGIIIKMFFIQSEHQPPHFHAIYG
EYNAIFAIESLEMIEGDLPKRAYAFIAEWAQEHQQELLDMWNTQEFKQLP
GLE
>Cag_1466 conserved hypothetical protein
MATHHVLLYGKEGCCLCEKAFDALQRLQQSVAFTLETKDITDDPELFRSF
RYRIPIIMVDGEQACAVRVDEAKMRTLLA
>Cag_0025 Large-conductance mechanosensitive channel
MSVMKEFKEFAVKGNVVDMAVGIIIGGAFGAIVNNLVSQVILPPLGLLIG
GVDFSSLYIILKEGATQAAPYNSLAEATAAGAVTLNYGVFLNSVFSFVIM
AFAVFLLVKSINMLRRTEGSKSAVPAPSTTKECPYCLSSVPLKATRCPQC
TSELK
>Cag_0072 Tryptophanyl-tRNA synthetase, class Ib
MATARILSGMRPTGKLHLGHYTGALENWVAQQNQCSADGNRAYDTYFLIA
DYHTLTTSLSTDDVYAHSLDMLVDWLAAGIDPEKSPMFRQSQVKQHAELF
LLFSMLITSARLERNPTLKEQVRDLHMDSMSYGHLGYPVLQSADILLYKA
NVVPVGEDQIPHVEITREIARKFNNHFPHPLYGNVFAEPEPKITKFARLA
GLDGKAKMSKSLGNTIFLSDPPDEVLRKMRTAVTDTQKVRKNDAGRPEVC
TVFSYHKRFSTPEQCEEIAAGCQSGALGCVDCKKQCAANISAELAPLLER
RTYYEARMDEVKNILFEGEAKARTVAEQTMQEVRTAMKLGEANCSATFFN
TSCS
>Cag_0630 conserved hypothetical protein
MPLSIHTVANIISWVINPVFVAPAAYAAIVLLGYGNTADAFPLFVVLFAA
STLVPILLIVGLKRIGKISDYNITFREQRFLPLLVLVAVNLVGYEVLKQM
HPPRLLTGILLFNAVNTVFILLITLQWKISIHLFSYASAVGLLFVQFGNV
ALWLLLVVPLLMWSRIYLKAHTFMQTLAGTIAGFATIVIELKWWTAL
>Cag_0921 diguanylate cyclase (GGDEF domain)
MTLFLQRLTNAVKHISPLQALAVAGVVGMALVVAGSVFFVTTQHRLLREM
EHRQALYLALVEFRSTLLACEHSIGDESLPSHCTPTALQQLLPLMRAAHA
EALPHLQQLALSKRTFTKKEIATLKTELERAIPLAQQKAAQASVDLLGVN
QRFMLALVLLLLLLGLTAGLVLRHNYRQTIIPLAQLVAQLKLLNRNIPES
IRDTAAEMSRELRGTAAHSHDITQITESVIRLCGDIEAKNRKLDELYIRD
EKTQLFNYRHFKEHLILDIERAKRQLDSLSLAMLDIDFFKQYNDANGHLS
GDKALKRFAELISAECRVYDLPARFGGDEFALLFPRTSAADAADIAERLR
HLIESASFPHAECFPDGKLTISIGLATFPNDAQDWHSLLNNADRALYKAK
ALGRNSVVSFSEVKQGIA
>Cag_1728 pentapeptide repeat family protein
MRLTFPPAVSLQFFNMNNQSMISSFFQRMAFTALVASALPSSLFAYDRAH
VTLLQQGVAVWNNQRQATMGQTLDLSRAPLAKAQLGEANLAHVSLSSAFL
QAANLRGANLQAANLRWSVLDGADLRDAVLVGAHLFEASLVKADARGANF
KSATSLEQADLSGALVSNNTIVPSGERAHGQWALRHHATFVQEPERPIAS
IASAISFSPERTITSPPNSAPTTVSQSAVTAHPSNVVPSPQASAQAPITK
EYARATLNGVNWSNADLAGANFYKADMKGAQLQGANLQGAHCDRAFLLQA
NLQGANLTKALLFGATLDKADLRNANLTEASLFGANCEGADLRGAILTRA
NVTDAVLTNALISSTTVLPSGKAATRQWALMQQAIFSQD
>Cag_1229 homocitrate synthase
MPNSAMELRKSWIIDTTLRDGEQAPGVVFSAEEKRDIAAQLAAAGVSELE
VGYPAISGDELETIRSIVAMRLPLRVTSWARAKWDDIEAARQSGTEAVHI
SFPVSALYLQLMERSYEWVQEQLSELIGKAKDYFEFVSVGAQDATRADIE
LLSRFVCDASAAGAQRIRLADTVGIATPISVMHLIGELQRVTSVDLEFHA
HNDLGMATANAFTALAVGCQAVSVSVTGLGERAGNAALEELAIALKLSGE
FEATIKTEMLSSLCETVSKAAGRVIDERKAVIGKAVFQHESGIHCAALLK
HPLSYQPFLPEQIGGREHELVIGKHSGSAAIQHFFAERGIPLSRSEATQL
LAKVRQMATEKKGLLTAKELEELYTELFNIH
>Cag_1399 membrane protein, putative
MSKMTNIFNIKSDCESFDTIYKTVNLDSTFNGSRLSILICAIILASVGLN
MNSTAVIIGAMLISPLMGPINGIGYSIATYDFLLLRKAIKNLLFAIIASL
ITSSLYFAVSPVSTAHSELLARTSPTIYDVLIALFGGFAGTISLVTKLKG
NVVPGVAIATALMPPLCTAGYGLATFQFSFFFGALYLFTINLVFIALATI
LFSQLIPFPLKNIINDDKKKKINNFITSITLITLVPSIYFGYVLSEKEKF
NENAKRFIQSVTFFDGEYLLHHEINAGNRLITLIYNGQGLTEKEKAELQK
RANDFGLKNTTLTFQQGIKFSPISKEKSENEALKTEINRLAIALKQQTTK
QDSLQKQIYRGSIFLRELQPLFPQVTTCAFMESYRFSNKSNTLPVKTAYV
VITTNNSSLKKTDKEKITMWLQARTQNDSLRVIFE
>Cag_0274 hypothetical protein
MQRTIEDITSELIGLPKNERLEIVRFLLFLDNRSSDNNDTDSVWEHEIAD
RVLAVEDGTAIGIDYEEAMKKINAQFAS
>Cag_1307 conserved hypothetical protein
MTTPIQILEYVDQHGNCAFRDWFNYLDSVSAARVAMYLERVAQGNFSNVA
PIGAGLSEIKINVGPGYRVYFAKKGSTILILLGGSNKKDQSQAIEKAKQL
WEEYKALVKQEKNKL
>Cag_1633 conserved hypothetical protein
MTVESRKRQFIVEGKIKPSFCEGCGQLTAKIFVGEWMPSDKPKEEDVLAP
LTSKKIKEQKAQAQSPSAAENQYWVRCAECNQIYLLKEWQIQIDREVDIN
QLTPEECQVYSPHGIYAKGAAVYHQALGEVGIVREKQATGSGAFVIIVEF
AKSGRKQLLENVQLSSGNGQSSTELLKLKLRRQA
>Cag_1328 transcriptional regulator, MerR family
MYNKEQENKMAFESTKSYYSISEVTKIASVPAYLLRYWENFFTELNPSRD
TRGNRRYTNRDIAMVLHIKELVYEKGYKLGKASELIKGKPRDTEEDHKTA
EILRLQKQMGNDKNRFHTIEERRTLLLREIKLEIEEMLQMLG
>Cag_0278 hypothetical protein
MEMEKSEKNQYLEQLDAYVHALMQELRIMEQRKAILEPLLFDEDLKSSLN
MKFKDTDGAVAYNHFVPLLAQDLIRDISRLFLDEGKKAGSFTNLCRKISN
KKQLGWLRERYCESQVINPNELASEFHNIWDNVKKGKEKIMHDPNSEKLK
TFRDKYYAHLEMTPMGNEPGPFNIKALGLTYCDIFNFLDTHQNVIYNVAL
MITGTNYDNEEFLGIHRKSANEMWRLLAGE
>Cag_0963 polyprenyl synthetase
MNIKDVTASVADEIALFQEKYRVVLHSRNTLVDKVSRYVLKQQGKQIRPA
IVLLSANLCGGVNESSYRAAIMVELLHSATLIHDDVVDGAEMRRGLPSIN
ALWKNKISVLMGDYMLARGLLYSLDHNDYGFLHMVSEAVRRMSEGEILQI
QKTRSLDITEKDYLSVISDKTASLISSSCAMGAMSATTDESNIAALKNYG
EYLGLAFQIRDDLLDYTGDSKTTGKQMGIDIKDKKITLPLIYALSQSSAT
EQRHIRSILKSARKRAIKSPEVIDFVRTKGGLEYAATVAENFASNAVQSI
SHFPEGSAKTSLLRLVDFVMMRQY
>Cag_0143 Competence-damaged protein
MRAEIISVGDELLRGQRVNTNAAVIARMLSAIGVSVSHIVACSDDEADIM
ATCSAALGRAEVVLVTGGLGPTRDDRTKHAIQQLLGRGTVLDEASYRRIE
ERMAARGSAVTPLLREQAVVIEGSHVIINSRGTAAGMLLDCGEPFAHHHL
ILMPGVPVEMEAMMHEGVIPFLTSLSNSVICQTPLKIVGVGETAIAAMLV
EIEDAMPPATMLAYLPHTAGVDLMVSSRGNSREAVEAEHQQVVDAIMERV
GTLVYATREISLEEVIGEMLLRQTFTVAVAESCTGGLLASRFTDISGAST
YFQQGFVVYSNEAKERALGVPHETLVAHGAVSEEVAQGMALGCLEKSGAD
FALATTGIAGPTGGTPEKPLGTLCYAIAVKGGGVVVCRKVVMQGTREQRK
VRFSTAVLREFWMLLKEREASEE
>Cag_1217 conserved hypothetical protein
MPNVFYYPVSYPAWWLDYYYLATWALSLLFLGGVWAIFFRFGKFSYGIDL
GCFWKSALLVVCTTISLGGPMYYNTRFVGEHGQDGDSVRLADGKVVYSDR
NGNLRQLAIGDITTIYQESVTYNPPPKIFIVAKTAQGKDSLFVTTNLPNY
RQFIEELSKQSGVTATVR
>Cag_0116 Elongator protein 3/MiaB/NifB
MGNSVTQNGKSLNILLIAPKGKKDSKSNQKPLFNMAIGVLVSITPPQHHL
EIIDEHFGDEINYDGKYDLVGITSRTIDATRAYEIADTFRAKGKKVILGG
LHVSFNKEEARAHADCIVCGEAENLWSTLLDDAANNALKPDYDSKDFPSV
TEIVPIDYAKIAKVSKREKVDGTKSIPIYITRGCPFECSFCVTPNFTGKL
YRAQKPDDLKRQIEEAKRVFFKANSKTAKPWFMLCDENLGVSKKRLWETL
DLIKECDINFSVFFSINFLEDKQTVKKLVDAGCIMVLVGFESIKQSTLEA
YNKGHVNSADKFSRLIEECRQAGLNVQGNFLVNPALDSYEDMDDLAQFVS
KNNIFMPIFQIITPYPGTTMYWEYKNKGLITDEDWEKYNAMNLVIRSEKY
DPIEFQHKFMTTYYKAYSWKNIVNRVRVNPYKLLNLVTSMAFRKNIREQL
QTFRQDHHIKK
>Cag_0004 conserved hypothetical protein
MNNHIVITGATGVIGVELAQKLIKRGEKVVLLARSPNAAQQKIPGAAAYV
RWDSDMQEGEWKSTISGAKAVIHLAGKPLLESRWNEEHKQECYQSRIIGT
RHIVAAIAEAAEKPQVFISSSAIGYYGSFDKCSDTAPLTESGNKGSDFLA
HICIDWEEEARKAENLVPRLVFLRTGIVLSTRGGMLQKMMTPFQYFAGGP
IGTGLQCISWIHMDDEVNAIIASLDNSAYKGAINLVAPTPVSMKEFASKL
GAVMGRPSLLQVPEFAVKMLMGEGGEYAVRGQKVLPTFLEKQGFTFRYPD
LSNALGDLIKHGK
>Cag_1156 Ribosomal protein L25
METIVLGVEPRVIKKNEAEKLRKSGLVPAVVYHKGEETVAVSIQELALTK
LVHSAESHIIDLQFPDGKLKRSFIKEVQFHPVTDRIIHADFQLFSADEIV
EMVVPVSVSGESVGVEKGGKLQIIMHSLTLKGKPTDMPEHFVIDITALDL
AHSIHVREIPMTNYPGLTIMDEPDAPVITVLATRKEVEAAAEVAS
>Cag_0219 chlorosome envelope protein A
MSGGGVFTDILAAAGRIFEVMVEGHWETVGMLFDSLGKGTMRINRNAYGS
MGGGTSLRGSSPEVSGYAVPSKAVESKFAK
>Cag_0019 putative NAD+ kinase
MNFGIVVNISREQALHLARTLATWFDERHIGYMFETLSGSTLGLGPSAPI
EELNCHCDVFISLGGDGTLLFTSHHAVTKPVIGINVGYLGFLAEFTQSEM
FAAVEKVLSGNYSLHTRSQLEATAFMDGVSHQFRALNDAVLEKGTYPRIP
AFIIKLDGELLSAYRADGIIIATSTGSTAYSMSAGGPIIAPKSSVFVITP
ICPHMLTVRPIVISDDKVIEISVDAPDGEFPLNCDGSLKKMLAPHECITI
KKSPVAINLVANEKRNYGEILRTKLLWGREHDGCCS
>Cag_1199 conserved hypothetical protein
MIDPMYRQQVDLLLQILPLVAKEKVFALKGGTAINLFVRDMPRLSVDIDL
TYLPLDDRDTAMKGISEALNHIRQKINHAMPGIKAYLVQQSSGQEAKLTC
QSSSAQLKIEVNTIIRGHVFPPRIMDIAKSVEAEFQKFVTMPVVSHAELF
GGKICAALDRQHPRDIFDIHQLFAHEVFTDEIRLGFIAMLISHSRPIHEL
IRPNLLDQRTVFQHQFTGMTFTAFSYDDYESTRKRLVKEIHEHLSDTDKR
FLLSFKSGTPDWELLPMDNLRLMPAVQWKLANIVKLKAQNSAKHKAQLKA
LDNALKDC
>Cag_1582 heterodisulfide reductase, putative
MAQKAFTPDIKFIRELKEAGADTMKKCYQCATCSVVCPLSPNDRPFPRKE
MVMAQWGLKDELLKSADIWLCHNCNDCSKHCPRGARPGDVLAILRKSVIQ
ENAFPGFMGKLVGDPKNIWQALAIPVIFFLIVLAVTGHLNIPEGEVIFRK
FFPVHYIDMVFVPLSTLSMLLFAVSIKRFWGNMVAASGNVKPKQEFMPSL
IETLKEIMSHSKFRKCGENPERTGAHSLVFFGFIGLAITTAWAVFNLYVL
QWESPYPVDDPEILALFGGSTVVPWILRIVFKLLANASSLFLLIGGVMIV
QARLKERSFETISSSFDWTFITMVLLVGASGFLAQLMRVTHFPAVLAYGT
YFMHLVFVFYIIIYMPYSKLAHFVYRTVAITYTKMLKRDVEM
>Cag_0490 GTP-binding protein LepA
MGLPNTDSCRIRNFCIIAHIDHGKSTLADRLLEITRTLDRTQMGSAQVLD
DMDLERERGITIKSHAIQMRYNAADGLEYTLNLIDTPGHVDFSYEVSRSL
AACEGALLIVDATQGVEAQTIANLYLALDAGLDIIPVINKIDLPSSDVEG
VARQIIDLMGVKRDEILAVSAKAGIGISELMESIVHRIPPPAVKNNEPLR
ALIFDSVFDAYRGAVVYLRIVEGLLKRGDKVRFFASDKLFTADEIGIMTM
TRQPREQLASGNVGYLICSIKDVKDAKVGDTVTLADNPAVERLSGYKEVK
PMVFSGLYPINSNEFEDLRESLEKLALNDASLIYTPETSVALGFGFRCGF
LGLLHMEIIQERLEREYGVNIITTVPNVEYRVFLTNGEEVEVDNPSVMPE
AGRIKQVEEPYVSMQIITLADYIGNIMKLGMERRGEYKTTDYLDTTRVIM
HFEFPLAEVVFDFHDKLKSISKGYASMDYEYIGYRDSDLVKLDVMLNGDT
VDALSIIVHRSKAYEWGKKLCQKLKGIIPKQMYEVAIQAAIGSRIISRET
ISAMRKNVLAKCYGGDISRKRKLLEKQKEGKKRMKQVGRVEVPQEAFLAL
LNIDE
>Cag_1772 hypothetical protein
MLTNITIENFKKLERISFPLSQSVVIIGPNNSGKSTIFQALCLWEIGVKN
YIAAYQKNDLNRQGTITINRRDLLNSPIADARFLWKSKKVTQRNISGAGQ
KHVPLSIELEGDNNGVQWSCQAEFTFSNSESFSCKICTGLQQMVELYENE
HGLHFGFLQPMSGISTTEDKLTKGSIDRKLGEGKTAEVLRNICFEILNPE
TASKNRNNAENNWLQLCNVIKVMFGVILQKPEFIKATGLISLEYIENNIK
YDISSGGRGFQQTLLLFSYMFANPNTILLLDEPDAHLEVIRQREVFQKIN
DIATITNSQLLVASHSEVVLDEAAEASKVIALIENQTFEVNTSTNSKSIQ
YIKKALTEIGWEKYYLAKSKGHILYLEGSTDLQMLLAFATALNHNVAALL
RFANVSYTSDNVPNTAVANFVALKEIFPELKGLAIFDKIEKNLNDIKPLT
VVCWQKRELENYFARPYLLIKYAQSLHEKYEQFSLEQLEKAMKKAIEDFT
LRAYLNDLNHNWWNSAKLSSEWLDNIFPEFYKQLNVPLNFYKRDYYQLIA
LMERQDIADEIVDKLDLIYEILK
>Cag_1970 conserved hypothetical protein
MPTYQYRCSTCGHELEVMQKMSDAALTLCPSCQQEALQRVISADGGFMLK
GSGFYKTDYNKSAASPCSTGSCSTGSCPLAS
>Cag_1028 heterodisulfide reductase, subunit C/succinate dehydrogenase, subunit C
MSNIKISATDALKHRLEEATGNNYNCCYQCGKCTAGCPAGAFMDNPPARM
MRLVQSGNVNDALLSDSLWFCVGCMTCTARCPQNMEIAGTMDELRAMALE
QGVDSGSRSKKLVTAFHTSFLNNIRKHGRLEELSLVNSYKLRTRTFLQDA
KSGITMIRSGKVNPLHTLTGKGGIQQKEVMTKIFEASEKASHQPAKKRRP
VKKEFIAKTPLILKPGMTIGYYPGCSLSGTAREYDISVRKMCQLLGITLK
EIDDWNCCGATSAHATNHKLSLLLPARNQALADAQGMTAVLAPCAACQNR
QVVTRKALLESEELQGEVQALTGIAPTCKAEFIGVTQMLEAYDAEELKRR
VKKPLSNLSLACYYGCLLVRPMEDMSFDDPENPQKMEAIMQLLGAKTVEW
AFKIECCGGGLTLAQQPLIEELTHNITKNAAEGGADAFVVACPLCQANLD
MRQDSMRKRFENDVKEMPVYYISELVAIACGADPAAVAVGEHFVPALELL
SK
>Cag_1448 outer membrane protein OmpH
MKTSSFFAVSRRIATGVMLSLTLAAPQAFAAQESGKIGVIDSAKILQQLA
DTKQAESALQAAAAPMQKELDRMNQDYQQAVAAYRQKAATLAKTAREQKE
KELTTKGKAIEKYQQDNFGRGGALEKKQQELFSPVRQKVLTAVEAIAQKE
GISVVVEKNSAIYATTDADITYKVLNQLNVK
>Cag_1136 conserved hypothetical protein
MNNMQYNRRSIRLQGYDYSQSGAYFITICTQNRECLFGKIVDGNMILNDA
GEMIKNIWHKIPTYHPYSYLDAMCIMPNHFHAIIMTVGADSISAPIDSIS
APIDSISAPTIGAEMDSAPTLGNIVQTFKRYTTIEYIKMVKQNKLPSFNK
RIWQRNYYEHIIRNESDYTHIYDYIQNNPQQWEMDTLYPNTL
>Cag_1443 conserved hypothetical protein
MVKNKKNEVIGREITLYSDKSEDYISLTDMARYRDTERSDYILQNWMRTR
STIEFMGLWEQFNNPNFNSIEFDGIKNMAGSNSFSLTPKRWIAATNAVGV
VSKTGRYGGTFAHRDIAFEFATWISAEFKFYLIHEFQRIKEQETNRQKLD
WNLQRTLAKINYTIHTDAIKERLIPEKLTAKQTSLVYASEADLLNMALFG
TTAADWHTENPNAKGNLRDDATLEQLVVLSNLESINAVLIRQGLTQSERL
MQLNQIAITQMTSLVKNAHLKKMQ
>Cag_0388 Valyl-tRNA synthetase, class Ia
MSSENAWVNLEKTYNPHDVEERWQSEHWEKSGIFHAESSRVQKGERQPFT
VLMPPPNVTGSLTLGHVLNHTIQDIFIRYNRMVGREALWLPGTDHAGIAT
QTVVERKLKKENITRHDLGREKFLEHVWQWRNEYGDLILRQLRKLGISCD
WRRNLFTMDESASKAVMNAFVMLYRDGLIYRGSRIINWCPVSQTALSDEE
VIMKPRKDSLVYLRYALVNHPNEFITIATVRPETILADVAIAVNPNDPRY
SHLIGSYAVVPIANRHIPIIADDYVDIEFGTGALKITPAHDPNDYQVAKR
HNLPVMSVIGKDGCMCDEYGYAGMDRFAAREKMVADLEALGNLVKVEEYE
HNVGYSERADVVVEPYLSEQWFVKMEPLAKEALRVVNEGEITFHPQQWLN
TYRHWMENIQDWCISRQIWWGHRIPAWYDRDGKVWVAENYEEACALAGTT
ELRQDEDVLDTWFSSWLWPLTTLGWTDRQSDNENLRAFYPTNTLVTGPDI
IFFWVARMIMAGLYFKGSIPFRDVYFTSILRDLKGRKLSKSLGNSPDPLK
VMETYGTDALRFTIIYISPLGQDVMFGEEKCELGRNFATKIWNATRLLFM
QREKFFASPEEFAATFQAFDPHAVTFSSAETWLLQRYNAMLQRYHYACSQ
FRVNDMVKLLYDFFWGDYCDWYLEALKVRLATCNREEAQQSLALALKVLE
GTLQTLHPVMPFLTDEIWHHLLPRSPQESIALTAMPQVDERLVATDASSF
NMVQQLITEIRSLRSLFGVPHASQADILLRPASETNAAALEANMALIATL
GQCRPQMMMESEQPRHAASSVVDGNELFVLLEGLISFDKERARLQKEIAN
IEAYTLSVSKKLANQGFVANAPADVIAKEREKLCDAEDNLQKLKKNLLVL
SE
>Cag_0492 hypothetical protein
MPSRPSKRFVLNWLKALLLSPGMLFPATYLLLYFSTTLYLNTALKSEVIK
TLMPMGHISVRTVTTDMTLERITLHNVTLQPSAIGESKKPQHIRQLTIAC
PKLIPQLFTKQGRTQTICQVEQALSPIAQ
>Cag_0446 conserved hypothetical protein
MKKQATITTEELDDKFDAGEDISQYLDWSQSQRPLLDHKRINVDLPQWML
NSLDFEAKRVGVKRQAIVKMWLSERIKAEQVAAGNAVR
>Cag_1297 Tryptophan synthase, beta chain
MEPSLYSAPDAAGHYGYFGGKFIPETLIKNAADLEAEYLRAKQDPAFQQE
LTLLLRDYVGRPTPLYHATRLSQQVQGAQIYLKREDLCHTGAHKINNALG
QVLLARRMGKKRVIAETGAGQHGVATATVCALFNLECVVYMGEEDICRQA
PNVARMKLLGAEVRAVSSGSRTLKDATSEAIRDWMNNPEETFYIIGSVVG
MHPYPMIVRDFQSVIGKETRQQIIAQAGRLPDVITACVGGGSNAIGMFYE
FLKDAAMVELVGVEAAGEGLTGRHAASLTLGTPGVLHGALTRLLQNNDGQ
VLEAHSISAGLDYPGVGPEHCYLQELGLVSYTSVTDTEALEALATLAKTE
GIICALESAHAIHYAIKRAKQMAHDGILIVNLSGRGDKDMETIIRELGL
>Cag_0771 hypothetical protein
MKKRTLFFAALCTVGLTMPLSNAHAEWTLHINSRNEHPPTLVNNATIQPD
SRALTMSSQPPIAVAPPAPVFITPQPVAIAPPPPVVVAAPRPNYQVVVYE
NSYYNRRPDGWYRSYHPQGTWVRVQQRHLPPRFAVAPRAPEPRFAPPHRR
FDDRRGVELRVRY
>Cag_0636 NADH (or F420H2) dehydrogenase, subunit C
MEETKELSPSVQASAAAYTLIKEEFGDGVSEFDANETMPFFEVVDVERWT
EIALFMREHPKLRFNYMACLSGVDYPADNKLGIVCNLESLGEHNHRIAVK
VKCDREGGAIPSVAYVWHTANWHEREAFDMYGMTFTGHPDFKRILCPDDW
EGHPLRKDYKVQERYHGIKVPY
>Cag_0901 Ribose/galactose isomerase
MTMKIAVGSDHAGFEFKKTVVAWLEAHGYEVEDMGPHNEESVDYPDYAHH
VGRAVAGGHCDSGVLLCGTGIGVSVAANKIKGVRAALACNPEFATLARQH
NNANILCMPARFTNRETIEQCLHNWFEASFEGGRHERRVNKIEPCCKVNS
>Cag_1860 hypothetical protein
MHEERSKLDEYSLKHGPTIGKMYEGLASDVLGRAIPESLGLQLQHGIIHD
GKGAMSGEIDCMLVKGEGEKIPYTDSYKWHVKDVIAVIEVKKTLYSADLK
DAFGHLRGVADNYGSYVQSGEGSEKFDINPAKKAFSETTGLIPPDHSKVD
SLGIEMEMLYHTFVMEHISPIRIVLGYHGFKKESSLREALVSFINENQMT
QGFGVGSFPQIIVCGNYSLIKMNGYPYSAPMDSDFWNFYASSNANPILLI
LELIWTRLSYQIKVENLWGEDLSIENFTLFLSAKIKKVGDLTGWEYKYTP
ISEDLLKERAPEDEWSPTIVSSTQFVIFNRLCNEAEESIGDKSLREYVEK
EGENFDHFIESLTSTGLIALKDDKLQLTTYECQCVILPDGRFAVADNNSG
RLTRWVNKQIEKNKA
>Cag_1694 HRDC
MQIKLFTIPISDSGAPEEELNAFLKTHKIVSVDSELANNKDGAWWCFCVR
YLEQAMNALPERKVKVDYRQVLDDVTFQKFVKLREIRKRVASEEGLSAFI
VFTDEELAELAKLDEISVKSMLSIKGIGEKKIERFARYFITTPESDEAQG
EIG
>Cag_0825 conserved hypothetical protein
MTHTTPLLPLAPPINLVDLRYNELHNAITAFGEPPFRAKQIHEWLFSHHA
NSFAAMSSLPLRLREKLAERFTLQRPEVVEVQESCESGCLRPTRKILLKL
SDGALIECVLIPAEERMTACLSSQAGCPMQCTFCATGTMGLQRNLSAGEI
WEQLYALNGLALQEGKTITNVVFMGMGEPLLNTDNVLEAIATMSSRNYNL
SLSQRKITISTVGIVPEIERLSRSGLKTKLAVSLHSARQEVRQQLMPIAA
ERYPLPLLSKSLEAYSKATGEAITIVYMMLNGVNDSKEDAHLLARYCRHF
SCKINLIDYNPILTIRFGSVQESQKNEFQAYLMAQKFHVTVRKSYGASVN
AACGQLVTQQQRRTIK
>Cag_1558 conserved hypothetical protein
MASTQPPSNRRDDYDSPWKEAIELYFPEFMAWYFPKAYAAIDWSQPYHFL
DQELRSILPEAENGKRIVDKLVQVHLLDGNESCLYIQIEVQGNRETDFPR
RIFICNYRIFDKYGMPVASFVILTDTDSSWRPTTYSYEFADSKMTLEFDM
VKLLDFEPRMEELLASDNAFALVTAAHLLTQKTRENSLERLDAKTQLIRL
LYNKQWTKERVKELFRVIDWFMELPKELEQQLQTEIYNIEEEQKMKYISS
IERYAMEKGWSEGKELGVLEGMEKGKAEGLEEGLMQGRLDVARRLVASGM
SADEAAGIAGVDVYLL
>Cag_1890 conserved hypothetical protein
MTCTEARLLMSAAVDCEISFIEQEQLQQHLALCAPCCIEFETVKKTKLIV
RARLIQFKAPQSLVNSIMELTNADCHHE
>Cag_0774 Acetylglutamate kinase
MSSERKAPHAAIGQVLIEALPYIRQFEGKTFVIKYGGAAMKDDTLKNMFA
QNVTLLRKVGIRIVLVHGGGDAITRTAEKLGLQSRFVQGRRVTDKEMISV
IQMTLAGKVNQDIVQLISEHGGKAVGVSGLDADTIKAHPHPNAEQLGLVG
EVEQINTDYIDLLSQAGLIPVIAPVGFDCDGNLYNINADDAASSISIALK
AEKLIYVSDVAGIQVGDKILKTISKAEAADLIERGIISGGMIPKVVSAFN
TMDGGVGKVHLIDGKSTHSLLLEIFTHEGVGTQFIN
>Cag_0747 hypothetical protein
MQYCHVCGGTLFNVKAILWDELIKDWQLSPKEVNYINQQQGGRCSKCLCS
IRSIVLAKAITNSLGYKGVLINLLKSSFVNDLSILEINQAGNLSRYLRLF
KNYTFGAYPDVDMHNLPYKNNTFDLIVHSDTLEHVANPIHALSECYRVLK
PNGFLCYTVPIIIGRLSRNRSGLKKSYHGTPNAIPDDYVVCTEFGADAWT
FLLEAGFDSLSMYSFHYPAGIALAAKKREI
>Cag_0362 Histone-like DNA-binding protein
MSKAELVEKIAKQADLTKADAERALTAFVDVMTASLKAGDDVALVGFGTF
SVGDRAERQGRNPQTGETITIAAKKVVKFKPGKALKDEIGG
>Cag_1269 hypothetical protein
MGATHVTVTIRNPANPEKFWEGLFLVDSGAIDSLVPRDALESIGLKPKAQ
RSYELADGTEIKMDITTGDIEFMGEIVGGTIIFGASDTEPILGVTALESV
GIDIDPRNQQLKRMPSTRLKKLKPIASC
>Cag_1332 conserved hypothetical protein
MQKQHIIFLKLSSLATGCLLAFSLLAPPTSVDAAATRSYARDITSALASN
NVERLKSLQDKLTVPAEKLVVEALLTEDGAQAQELYQKQLSLYPDAALDQ
VSRSRLVAFAQAQERAVALTPSVAQSPKSSSADSAAPAVKPVVPPPAPIV
QVPVQPAATATVAQVLPPQQPQAQAKGFTLRFGSFKSEANAQTFASDVNR
YAPTEIVLLDGLYRVQLSARYATAEEARKVSRAIPMQSLTVSAQ
>Cag_0441 hypothetical protein
MSASLKSWATPLAVGTFAIIAVTGILLFFDIEIGFVEPVHKWASWAFVAG
VLLHVVSNRKALTHYFSSKPALAIIGTAMVVLVATLLPFGEKEEHENPAK
MAAYALSSSSLETVALVVKSTPDVLSTQLAAKGLVVSSPTATIEQIAKSN
GKEPKEVLANILSLSKGKAEHDGDDD
>Cag_1025 conserved hypothetical protein
MKPLPVGIQTFSEIIKQDYLYIDKTSLANELIKKHKYVFLSRPRRFGKSL
FLDTLKNIFEGKQELFKDLLIYNQWNWTVTYPVIKISFSGGIRDTESLRE
NLFYILKDNQERLNITCEEKSNANLCFAELIKKAFQHYQQKVVILIDEYD
KPILDNIENIPSALIIRDGMRDFYTKIKESDEYLRFVFLTGVSKFSKVSL
FSGLNNLEDISLNPDFGNVCGYTQNDVDTTFAPYFDGVDMEEVKRWYNGY
NFLGDKVYNPFDILLFIKNKYVFDSYWFETGTPKFLIDLIKKNNYFIPNF
LDIKVDKSLVNSFDIENINLQTILFQTGYLTIKQFLPSGMGIGYKLGFPN
KEVQISFNNYILQVLTSDSDKEPIRHELFDIMNNGKVANLEPVIKRLFAS
IAYNNFTNNYIESYEGFYASVLYAYFASLGFDMIAEDITNKGRVDLTLKT
LDKTYIFEFKVIAEEPLEQIKKMKYYEKYDGERYLIGIVFDPKARNVSRF
EWERV
>Cag_1381 alpha-amylase family protein
MSTSPNNAPCKATCTVQTPHLDLLLQTLESLTPKQPAQPAPPALSQQPYR
VPTLWQSDNVGVEVIEPAEYYARCIRSLLDSAHKAQPQDIDGDWKRHAVV
YNLFIRLTTAFDHNGDGELSNEPMECGFRETGTLLKAIALLPYIERLGAN
TLYLLPLTAIGEQNRKGSLGSPYAVKNPRTLDPMLAEPALGLSAEILLKA
FVEAAHLRNMRVVFEFVFRTTAVDSDWVREHPEWFYWLKESEANETFGAP
YFEQERLNAIYAAVDRNDMNNLPAPDEAYRAKFVAPPQQVVERDGALVGM
QKDGTTCRIASAFSDWPPDDRQPAWSDVTYLKMHQHEGFNYMAYNTIRMY
DSTLERPEYRTTNLWETLISIIPEFQDEYHIDGAMIDMGHALPAPLKQTI
VERARAKRPDFAFWDENFNPSVEIREHGFDAVFGSLPFVVHDIIFIRGLL
NHLNRIGVALPFFGTGENHNTPRVCFRYPKQAAGRSLATFIFTLSAILPS
LPFLQSGMELCEWHPVNLGLNFTDEDRATYPSETLPLFSPRAYDWEKSNN
LEPLNHYIKRLLTVRERYLDVVLCGDAGSIGVPYVSHPELFAVMRSAGGK
SLLFVGNSNLTESRTGMLEFSVEQASLVELISERPYTITNHRLEVSCTAG
ECLLFEIPSFS
>Cag_1810 Bacteriochlorophyll 4-vinyl reductase
MSEPSKIGPNSIIQTVTALEENYGKSKAETILRKIGQGYLIGNLPKEMVE
EIKFHTLVGALNKEIGSTATANILKESGERTARYLMRVRIPAPFQKLVKL
LPPRLAFRMLLFAISKNAWTFAGSGEFRYSMSTPPEISVKVTFPSQPVVG
NFYLGTFTALLKEMVNPKTSIKADIQKAGSDIQCTYRCEI
>Cag_0132 6,7-dimethyl-8-ribityllumazine synthase
MHIEQIEGTLSAAGMRIALVVSRFNDFIGQKLVEGAVDCIRRHGGSEEQM
AIYRCPGAFELPMVSKKAALSGKYDAVIALGVIIRGSTPHFDVIAAEATK
GIAQASLETMIPIAFGVLTTENLEQAIERAGTKAGNKGFDAAMTAIEMVN
LYRQM
>Cag_1039 Outer membrane protein and related peptidoglycan-associated (lipo)proteins-like
MKRLLRSLFIPATLLLGACCIEDDIVVAPAAVPAPVAVPAPKPTPAPAVV
VVPAPVAVVVVAPVPPPLPKPIVLPALVDVFFDFNKSELGTTRNEQLKRN
ANWIKAHPTSNIIIEGHCDERGTNEYNIALGERRANSAKDYMVTLGVDPA
RLTTVSYGEEKPVAIGSTEEAWAQNRRVHFIAE
>Cag_0805 Signal recognition particle protein
MAMFESLSDKLEATFKKLAGQATINEINIGVAMRDIKRALLAADVNYKVA
KKLIEDIREKSLGEEVIKSVSPAQMIVKIVYDELTELMGGEQKPLNLSPK
KLPAIIMVAGLQGSGKTTFCAKLALRLRKNGKNPMLIAADVYRPAAVDQL
KALGEQVEVPVFSVDEKDAMKAALQGLEAAKAAAKDVVIVDTAGRLQIDQ
AMMAEAEALKNALKPDELLFVVDSMMGQEAVNTAKAFNDRLDFDGVVLTK
LDGDARGGAALSIRQVVEKPIKFISIGEKVDDLDIFYPDRMAQRILGMGD
IISFVEKAQENLDLDKAIEMQKKLMKNEFDLNDFFDQLQQLKKMGSIQGL
IEMVPGLNKMVPKQELENLDFKPIEAMISSMTKEERSNPEMINGSRRQRI
ARGSGRKVQEVNLLLKQFGEMKKMMKAVSKLSKSGRKITPQNLALDKFLK
R
>Cag_1679 hypothetical protein
MRHISSALYRYAFHSAALCTMLYVAAPLHVVAAAESVGTRKANVQEAPFL
TTSLSNATPYVGEEITIHYTLYFQEIAPKIVRESQPSMQGLWAQASNPDR
FINSKSVQFKGKAYRSAVIKSYRVAPLQSGQLPIDGYTMHYVLSENEGEG
EKSITAPSVAITARPLPQPIPQGYSGAVGTFSIAQHSSAQQVRVGEPLTV
TLTINGTGNLFTVTPPTLSVPASFRADTPNKKIEVNSDANAKSGSITLTQ
TIWAEQAGSFTLPACSFIAFNPNSKSFATLRTEPLTIIVTPALQSGGQSA
SKGDSLSSSMAASEDGTAASFPVIPAVAGALIVIMAGVGAWLLRHRKGSA
QTSIEGSVPKSSVAITARSAKEQLFAQLHLVGIKHPNGLTRSELLQALCT
TRLSPEMQQQFMALLDALDALLYAPGAANPLLTPELAERFILFDKELNQT
YSAS
>Cag_1665 Fatty acid synthesis plsX protein
MLTIVVDAMGGDNAPACVVEGTVQALQESSNRFNVILIGQSEKVEPLLAA
HDCSALNLRFIHAPEVVTMHDVPATAVKTKQNSSLVQGLQLCRNKEADAF
VSAGNTGAQMAASLFVLGRLPGVMRPTIYAYFPRPTEGLTNIVDVGANVD
CKPEHLVQFAEMLTIYQRYAVGIEKPLTGLLNIGEEEGKGSDVLKQAYKL
LKEADSKGKVTFVGNIEGHDVLKGKASIVVCDGLVGNTLLKFGESIPEFL
GTLFKRSLTSVVESGEMSQDAAAVTTKLFKGLFTPFDVEQFGGVPFLGVD
GISIVGHGRSSSRAIKNMIYMAEHMIEKKVNEHIADVLAEQ
>Cag_0701 conserved hypothetical protein
MEKRDLKKELRHLYNPSPKAVEMVDVPTMNFLMVDGMGDPNTSQTYADAI
EALFSLSYTIKFMVKKGEMALDYGVMPLESLWWCEDMSTFSPDDKSNWQW
RAMIMQPSWITEDLVNQAIEEVKRKKGLAALPFVRFEAFEEGVAAQIMHI
GHFAEEAPTIEKLHHFIAANQKQRRGKHHEIYLSDIRKAKPEAWKTIIRQ
PLA
>Cag_1589 putative transcriptional regulator, AsnC family
MSCHLDVIDLKIIESLGENGRIRLSELAEVVGLSIPSVSERLDKLQKGGI
IKGFTLEVDERQLGFDIHAFVRVRIDSSKHYKSFMEHVLKEDEIMECFSV
TGEGSHILKVMTHNTASLEQFLSRIQSWPGVLGTNTSFVLSQIKKNRKIC
SDIVRRNLQDRQVLTTFDEKKSKR
>Cag_0871 Secretion protein HlyD
MKAIEKKTVRLLAVIVPLFLLFAFVAFRSGPLAPVPVTVTRVQQQALTPA
LSGIGTVEARYAYRIGPVASGRVLRLLVDVGDTVRAGQVVGEIDPVDLEE
RLLARRAAVLRAEAVVRSAAAKLHDAEARQRFAVGQEQRYTELLAVRAAS
SEQVEAKRQEAEVARATALSSQAALVAAREELAAAKADYQGLEEQRRNLL
LIAPADGLVTNRLVEAGSTVVAGQSVLEIINPQSVWVAARFDQLQSAGLR
AGLPATVRLRSQAGAPLAASVERIEPLADRITEELLAKVVFNVLPNPLPP
IGELCEVSVGLPSLPRTPVVPNASVHRSESHGGKLGVWVVEGSSLRFVEV
RIGATDQMGNVQILSGLKGGEQVVVYSKSALNEGKRITIVKPTSKQGAQG
VQGGLQ
>Cag_0464 Di-trans-poly-cis-decaprenylcistransferase
MALKKVTNVTTSLPSRWFSSDTNRTDKEQQAALQKAGQLPQHIAIIMDGN
GRWAKMKGKSRMEGHIAGVSSVRDIVEACHQLGIGYLTLFTFSTENWKRP
EQEISALMQLLIRVIGRESRKLHENNIRLQVIGEMHRLPQKVERVLRDCM
ELTKENNGLMLNLALSYSGQWDMLQACRAIAADVQAGTLQLEAIDEACMA
SYLSTAEMPDPELLIRTSGEFRLSNFLLWQSAYSEIYYTSIYWPDFRREQ
LYEALYDFQQRERRFGQTSEQVQTRMPASAQTQHH
>Cag_1552 hypothetical protein
MPISLVYLSAALLLLSLHPLLKAYVDIIYMVVWGTFAWGVYANVLKKKLL
LPLAFLPFVVLFNPIDPPALPLYADIAMKIGGATLLLFTRKHIAV
>Cag_0052 Phospho-N-acetylmuramoyl-pentapeptide-transferase
MLYYFLKYINDIYHFPALRVIDYLTFRASAAAITALLITLLIGPKLIAYL
KQRIIEPIKEEAPPEHRKKKQLPTMGGLLIIFAFELSVFLWAKVDSPHVW
LVMVAVLWMGAVGFLDDYLKVVKKVKGGLAAHYKLIGQVLLGLLVGLYAW
FDPSMAVLRDTTTVPFFKNLTINYGIFYVPVVIFIITAISNAVNLTDGLD
GLAAGTSAIAFIGLAGFAYLAGNAVYATYLNIPFIQGGGEVAIVSMALVM
ACVGFLWFNSNPAEIIMGDTGSLALGSAMAVIALLIKQELLLPLMAGVFV
LETFSVSLQVLYFKYTRKRTGEGKRIFLMAPLHHHFQLKGWAEQKIVIRF
WIMAILFFLASLMTLKLR
>Cag_1344 TPR repeat
MIHMPDFFDDDRFEFSSNNGELPPDLDGLDSIFDSEELVERIMQYMEDGF
PLEALAVARRLEQIAPYNSETWFYLGNCLTMNAFFDEALEAFHKALLLSP
TDSEMQLNLALGYFNNSMYEEALEQIERVMVDFAFEKEYHYYRGIILQRL
DRYDEAEKAFLMALELDNEFADAWYEIAYCHDVCGRLEESTTTYNTALDH
DPYNINAWYNNGLVLSKMKHYDEALFCYDMALAIADDFSSAWYNRANVLA
ITGRIQEAAESYEQTLELEPEDINALYNLGIAYEELERYPDAMECYRRCI
TIVPEFGDAWFALACCHEVLEEFDEAYSATLEALKTSADCVEFLLLKAEI
EYTLNKAEESIHTYERIIELEPDNPQIWVDFAIVLREAGMVNASIEALHC
SLKLQPMSADAHFEIAAAYFALGDKLSTLKALSKAFKIDPDKKELFQSTF
PELYQQDSVRRMLGILEMPNE
>Cag_0631 ferrochelatase, putative
MVDGIVKRNKYLVLVTTYGEVEPVTIRGLWPSSRKILETVTRQIAKIPPA
LMYMVADYRSLKHYVDWKLHGYHSSLNAINHAQAQLIASALTKHDAEVLK
ACRVDVQDANYFQPPYFEEVVRKAHGEYDGVIVVPMFPIESAFSCGIGCQ
MAIDEYGASIFHHVAILSGLWSDPHLHQLYATYLYERLSVEPNFPRQGKL
GLLLVVHGTLVRDRKGRPPSVFTGLEATMQFFQAMKQVMQQHPDCMFADV
RQGCLNHSRGGEWTSDTIEKALESFKRDGYDGVVMFPYGFFADNSETDYD
AFNRLKRAAFPYSLYVPCVNEYPPFAEWVAKRVLQRLQALHGMQAAQIL
>Cag_0469 polysaccharide biosynthesis protein
MAALSKLKLLFKDTVIYGASTILARSLNYVLVPVYANTLSTFENGIQTII
YANIALANVLFSYGLETSYLKVAADTHREGSEGEKPLFSTAVLTLLATST
LFALLIVLLAPWIGALVGLDSGAAPFVRYAALILWLDTMLVIPFAELRLR
RKALHFATARLLGVVAVVLCALLFIVVMKVGLSGVFLAEAAGSVVSLLVV
LPLFRGFRFGFSGQQLREMLRIGLPYVPTGIAGLLIHLIDRNILIRIAPS
DIERLYGAGYQPSDIVGIYGRIAAFGVVLQLVIQVFRFAWQPFFLQHGKE
PDAQQLFHHVLSISTLLTMVLALSATFFVPDLVRYHYGGAFYLLPPPYWI
GLSILPAIFLSYVFDMVSTNLSAGLLLTGNTRYLPVVTFAGAVVTALSCW
WLVPLYGMDGAAYAIVAGTVVMSVVMGYYSLRFFPVSHDWAKLLLLLLTG
IGMVLVQRQSEALLSAPAQVIGVKVGLVLLYLALALFLFRNKATRLLKQV
RHRNSSPHVVS
>Cag_0882 sulfide-quinone reductase, putative
MATVVVLGAGVAGHTCASFLKKKLGKRHEVIVVTPNAYYQWIPSNIWVGV
GQMTIDDVRFELRKVYNRWGIILKQAKAIEIHPEGNRDINRGFVTIEYTD
STRAGEIEYVEYDYLVNATGPKLNFEATPGLGPDKHTFSVCTYSHAAHAW
ENLQEVMKKMQAGQKQRILIGTGHAMATCQGAAFEYILNVAYEISKRGLS
KMAQITWISNEYEVGDFGMAGAFILRGGYVTPTKVFTESILAEYGIKWIR
RAGVHHVEPGKVFYETLEGEETSIEFDFAMLIPAFSGVGMLAFDKNDNEI
TDKLFAPNKFMKVDADYTPKSFDQWDAEDWPSVYQNPLYDNIFAPGIAFA
PPHPISKPMSSPNGRQIFPAPPRTGMPAGVMGKIVALNIAERINGAPDFR
HNASMSKMGAACIVSAGFGSFDGLGASMTLFPVVPDWNKYPDWGRDMNYS
LGEAGLAGHWLKFILHYLFFHKAKGYPFWWLIPE
>Cag_0290 conserved hypothetical protein
MKSPFPLFFPLLQKKRDCSKQAQCSKRRLIMALLAVSALAPANLYAASKA
KRQPPLVAHVYPDTLALPYNRYALKPFMRPARKSVAVALSGGGANALAQI
GVLKAFEEAHIPVDAIAGTSMGAIIGGLYSCGYSAAELEQLALTMPWSSI
LALQEDYSRSSLFVEQQRIRDRATIALRFDGLKLLLPQSLNSAQAFTRTM
DMLVLHALYHPHSNFSSLPIAFRAVTTDLVSGERVTLESGSLSEAMRASS
TVPILFEPIHRAEQQLVDGGLVANLPVDELAHFGADCKIAIDTHGSMYAT
GKELDLPWKAADQAMTILITLQYPAQRAQASLVIEPETGKHKATDFKNIP
QLIAAGYVAGKQQVPTLQRLLAITSPSNSSAPQTSSVPPSSIVPSVATPP
PISPILTANKKEMRNFSLATYTKRWSISPTSTELERLVGEKVASALELHA
LLRDLLATDYFARVSAEVHQEDRTVTVKLEALPSVTVVTVQGELADELSS
AELNECFAPLMGRLYTNHQATAALEALVRRLRAKGYSLAAIEQVHVENER
LTITFSSGKAAMLTISLNKGRTLLTPIQRELKLDATKPLRLRAAEESVKN
LYETGVFNRVSLFAEPITQTEAIAPISSTTPNQTIHLSLEEKPASVLRLG
LRYDETNNAQLLLDVRNENVGGTTNTMGGWVKAGRKGYLANMELNMPRIG
ATHLIFATRLFFDSYLFDYTNSDGSLAPYNIQKYGITSSFGTRLRKNGHF
LTDVSYYNSQAFTDEAHRPLFSTTNNNVLTIGTHLTIDSRNNALMPTRGS
YSYLTYAFTPLSLDDGLRYWQFSGTHQVNLPLGRETTLQLSAMTGVSSKA
LPLSEQYFLGGIGNSYSARFIGLQPHALATNNVATAGVQLSYEPSFPILF
PTTLQLHYNAGRGWNAMENVRLDGALQGALQAVGASMVWKTPLGPTRFTL
AKVLVNNDDNSLMLPHRDDDPVFYFSIGHDF
>Cag_1377 D-3-phosphoglycerate dehydrogenase
MIALAKKENENVMKVLITDSVHPQCGRLLLQHGFEVTEKPSLSPKELHAI
IADYNILIVRSATSLPAEVLAKATQLELIGRAGTGVDNIDLEAATRQGIV
VMSTPGGNAVSAAEHTCAMLLAAARHIPQAMADLKQGNWNKHLYAGIELE
GKTLSLIGLGRVGREVAMRMQAFGMRTIAYDPAIADEDAALLDIELLPLH
ENLLRADVITIHSALDESTYNLLGKETLSLTKPGVIIVNCARGGIINEVA
LAEALASGHVAAAALDVFTKEPIAATHPLLQFPQVIATPHISASTAEGQE
KVAIQMAEQIIAWKRDGILEGAINGTIVELAQLPGAHAYLHLAEKLGATL
AQCTPLDGKHITMKTSGEFLYTFHEALSAAVLKGFLAKRHAKSCNYLNAF
LVAQEYGITLQQKFEPNTHDFNNLLRVELASGSINRMIGGTVFGEKEVRI
VMVDQFLLEFKPEGSILLYTNNDQPDVIARVTQLLLAHHCAMAYLALSLD
DLHNSAMSALVVNGKVTNDLLEAIKQLEGVTSLSLLEL
>Cag_0596 acetyltransferase, GNAT family
MKHMNPKIVIRNEANADIRAISEVTAAAFQTLEISNHTEQFIIEALRATK
ALTVSLVAEIDGLVIGHIAFSPVTISDGTRNWYGLGPVSVLPAYQQQGIG
KALIWEGLLRLKDMNAQGCCLVGHPDYYIKFGFKNLPGLVHEGVPLEVFF
ALPFDGHIPQGTVTFHEGFKADG
>Cag_1058 nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase
MKEKLETLLAAIQPADQTLRQAIQAHLDDLTKPQGSLGRLEEIAQQYILA
TGQIHPSLPKKKICCFAGDHGVAAEGVSAFPAEVTPQMVYNMLHGGAAIN
VLSRHVGAELSVVDVGVNHDFADAPNLVQCKVKRGSANMAVGAAMSETET
LQAILAGAELAFQAADAGYGLLGTGEMGIANTTPATALYAVLLNLPVEAI
TGRGTGIDDARLHHKIAVIERAIAVNAERCATPFGTLAALGGFEIAAITG
FILGAAARRIPVVVDGFISSSGALVAMKMVPSVVDYLFFSHLSAEQGHGA
VMEALGVRPMLSLDLRLGEGTGAALAMQLVDAALKIYNEMATFSGAQVSE
KIEG
>Cag_0022 patatin family protein
MKKILSIDGGGIRGLIPALVLAEIEAQSGKAIGATFDLIAGTSTGGLLAL
GFAKNDGNGKAQYSANNLADIYLSRGNEIFSKSFLKSVASVEGLRDELYS
ANGIEHVLDDYFGDDPLSSCITKSLVTCYDIQNREPLFLKSWREEYQSVL
MKHAARATSAAPTYFEPALIPIGGATKALVDGAVYINTPSVSAYAEALKL
FEDEQDFFVLSLGTGELIRPISYDKSKNWGKAEWVVPLLSCMFDGMADAA
NYQMKMLLDDKYVRLQTNLSVASDDLDNVTANNLENLILESQKLIRTHRQ
VIDMVCSLL
>Cag_1642 oxidoreductase, short-chain dehydrogenase/reductase family
MQNLWNDAELQGFVSNVCHEPDDHPELAALVYASRLLGRERSLVMHGGGN
TSVKCGLTDMVGNHAEVLLIKASGIDLSNVTCRDYTPLRLGPLSKLVELC
SSNDPIHAERVERFSTKEFKHLLMLNMFSLTDHMAEKRLTPSIETLLHAF
LPHRYILHTHSFALLTMSNQPNGEALCRETLGEAFGSVPYIKPGLGLARA
AAGVYEAHPAIEGLVLQKHGLVTFGETAQEAYNRMIDAVTKLEERIALAG
RKPFTTVPLPEEIAKVEDVAPIIRGACAEEKEVGRRDYQHLILDFRTSDE
ILTYVNSADVVRMSQKGSMTPDFIIRTKNKPLVVPAPDAADLNGFKAAVD
EAVQRYRDAYIAYFNAQQQASGMEVTMLDPMPRVVLVPGLGLFGLGKSAA
AAAVNADIATCTATAILDAESVGSFESISEREAFDIEYWDMEQAKINKVY
HGTFAGKVVMVTGGASGIGLATAKAFRQRGAELVVLDLSQEALDKAAEEI
GGNPLTLTCNVTSRADIRAAYDAVCKRYGGVDVIVSNVGAAIQGRIGDVS
DELLRKSFEINFFSHHYIAQEAVRVMRLQGTGGVLLFNVSKQAVNPGPDF
GPYGLPKAATMFLVRQYALDHGRDGIRANGINADRIRTGLLTEEMIKSRS
AARGLSEHEYMAGNLLQLEVYAEDVAEAFVHLAQEIRTNAAIITVDGGNI
AATLR
>Cag_1768 TPR repeat
MLMLAGCSSSSSTVSTQKIQAPLPKPLPETVAYELATASLLMAQGEYQQA
LERYRALLTTESNNAALHHALAKAYTANGEFVAARQHSQQSVTLEGTNVW
YLRLLIALTHNESDYAQAVALSKKLVTLEPDNREALTMLAYEHLAARQPN
EALEVFQRLLQLDPANAEVLLSSAEVALELGRRSDALRFFNQLLHYGIES
DSIHFFIGDLQQQQGLHEAALASYRNALKLNPHLLPAWYRRLELVALSPN
LSQSSKPTLFAEELQHFYKQSGTTLEQQLGLLQLFTNRATRNPAFISATQ
SMIKALQQRYSSHSLVRFTVQIAQGRLFVAQGQHAQAITLLRQALRSPHA
TRQPNVALDAESTLALAYERSGKVTESIRLYEKMLRRTPNNALLANNLAY
LLATQHRELPRALELAKKAVAAEPNNPIYLDTLGWVHFAMQQYEPARELL
EKALQGEPNEPEVIEHLIAVYEKLGNQSKVQELQERLRRVCL
>Cag_1480 Acetyltransferase (isoleucine patch superfamily)-like
MNPWKNLQKRYFPALCYWLGNHFFMNWTPYPVRHWFLRKYCNVKIGKDSS
ICMGCFITGQKIEIGLNTVINRFTYLDGRVALRIGNNVNISHYTLIQTLT
HDPQSSNFTCQEKPVTIGDNVWIGARAIICPGVAIGEGAVIAAGAVVIKD
VPPYTIVGGNPARYIKTRTNDLHYKTRYFPLLDTDIQ
>Cag_0304 conserved hypothetical protein
MNALVEDIKKLPVVERIELVEEIWNSIPQCSLELSAEECTELHRRYAAHQ
AHPSTAITWEEVRSKMLSTSQR
>Cag_0029 DNA gyrase, B subunit
MPPAAYGATNIQVLDGIEHVRMRPAMYIGDIHSRGLHHLVYEIVDNSIDE
TLGGFNDYIFVALNADGSITVIDHGRGIPVDMHPEKQKSALELVMTVIGA
GGKFDKGAYKVSGGLHGVGASVVNALSEWCEVEVYRDGKAYYQRYERGVP
QGDVKVIGDSDQRGTKTTFMPDGTIFKTTEFRKEIIIDRMRELAFLNKNL
RIIVQDTNGEQEEFHFEGGICEFVRFTDQNRLNLLREPIYLYGERDGTVV
EIALQYNDSYQENVFSYVNNINTHEGGTHVTGFRKALTRTLNSYAQKNDL
LKNLKLTLTGDDFKEGLTAVISVKVAEPQFEGQTKTKLGNSETQSIVETV
VNDQLAEFAESNPNTLKLIIEKVKGAAMSREAARKAKELTRRKSVLESSG
LPGKLADCSINDPEHCELYIVEGDSAGGSAKQGRDRSFQAILPLKGKILN
VEKARLHKMLENEEIKTIILALGTSFGDEEFAVEKLRYGKIIIMTDADVD
GAHIRTLLLTFFFRHMRPVIEAGRVYIAQPPLYLVKSGKDQHYAWDDDER
NSIVDNMKKMQKSKANIHIQRYKGLGEMNPEQLWSTTMDPAHRSLLLVSV
ENAMEADQVFSTLMGDKVEPRREFIEKNARYVRRLDV
>Cag_0222 dihydroorotate dehydrogenase, electron transfer subunit
MSVEEHYHAAECTCASHLCDVPATLVRKQQIAADVWIITFEAAAIAQMAK
PGNFVTIKVNSTTQPLLRRPFSIHNVQGSLLDVMVKNVGSASNILCSMSD
GSTCFMLGALGNSFSFDSPSFDVGVLVSGGIGTAPMLFLQQQLAAQGKEW
VHLLGGRTKHDVLAYNLSNCRVATDDGSEGFCGTTVDLLRHELPNLRALG
RLQLFACGPNGMLRALATLCGEEQLACELSLEAVMGCGIGICYGCTVEVN
DLAGGRRMLLLCQEGAVINGDLLI
>Cag_1192 transketolase-like
MEENKSGEVAERYVSRGNKATRVGFGNALLEIGKEESNVVAMCADLTGSL
NMNLFRDAFPERFIQAGIAEANMISMAAGLAASGKIPVASTFAVFATGRV
YDQIRQSVCYSNMNVKICASHAGLTLGEDGATHQILEDIGLMRGLPRMTV
VVPCDYSETMRATKAIIKHQGPVYLRFGRPNVPDFTRDDDGFEIGKSIEM
HPGKDVTVIACGIMVWKALEAARILEKEGVGVRVINMHTIKPIDTLAVVC
AANDTGAIVTAEEHQIYNGLGDAVANVCARNIPVPIEMVGVEDQFGESGK
PDDLLEKYKLTTADILEKIYLALRRKS
>Cag_0371 putative ABC transporter, integral membrane protein
MIKTLLSIAMRHLVGRRRQTLTTIAGVAISTMVLITTNSLTRGLLDSFVE
TIVNVAPHIIVKGEKINPMPINLFSGKQNNAIAIVEDNIQKQEREEVQNY
RQIIALLDTPAYASRVTAISPYVQSQVMAVKGSRNEALIIKGVDINQEDK
ITGIRKKLVAGDVAQFEKNATALLVGRTVARDMNIELNDQVVIIPASGKS
WQCKVAGIFFSGVNAVDNSVLVSLKFGQIIEGLPDNKVSGIALTVKDPFN
NKPLATELEQLSGYRCLTWQEENANILSLFSRIGYIVFSLVAFVGVVSGF
GVANILVTTIFEKSRDIAIMKSLGFSARQLVGMFVVEGFLVGFAGALAGG
VLALGAITIFASIPVESSQGPITKTGFSMSYNPLYFFIIIGITVLISTIA
ALLPSSRAARLEPVSVLRDSSV
>Cag_1965 Fe-S-cluster-containing hydrogenase components 1-like
MNYGFVIDARKCMGCHGCTVACKSEHQVPLGVNRTWVKYVEKGMFPESRR
YFTVLRCNHCAEPPCVDICPVEALHKREDGIVDFDKRRCIGCKACAQACP
YGALYIDPETHTSAKCNYCAHRKEVGMKPACVVICPQQAIVSGDFDDPNS
AVSILLAKNQTAVRKPEKGTSPKLFYINGDGASLDPLEVSAVNGYLWSEQ
LRGVGHFAGKSYEKPAKATVPQSGKHGRPKARKVYDVPSKGVVWGWEVPA
YVWSKAVSSGLFLVLVMMQLFVASLLSESMQWASWAASLFFLALTGGFLV
KDLDRPERFSFVMLRPQFDSWLVKGGLTITGFGAFLALWGAGSFLQLPLL
TSIAQVGGTLFAVITAIYTAFLFGSAKGRDFWQSPMLLLHMLLNSLLAGG
SAMLLLGVVTASSNDLFSLLQPSLAAGFAFHLIIMALELFGKHPSASAER
AAETILHGELKHPFWIGSVLIGNLMPFVLCLVAPSSFGLTIAALMALCGV
FYTEKVWVRAPQTVPLS
>Cag_0743 metalloprotease secretion protein
MAEKITSPLDKGAANAPDAKKYRDTRSPIRLGIWILLVGFGSFLLWASFA
PLDEGVPCQGLVGIATKRKVVEHLRGGTVQAVHVREGEIVQEGQVLISLD
SQTARARFDEIHQHYIGMRATADRLQSEMRGAGSIAFHPDLLRESDKSLV
RKNIENQKALFASRRQTLQILTEQLIAIKSLVSEGFAPLSQQRDLELKIA
EFKSSTASQLAQVQLEVEADAEKSRALAAELADTEIRSPASGQVVGLQVQ
TVGAVIQPGQKIMDIVPLDESLLIDAKIAPHLIDSVHQGLAVDINFSAFA
HAPQLVVEGNVESVSKDIVTDPPSSGTQPGASYYLARIAVTKEGLKQLGK
REMQPGMPATVVIKTGERSLLTYLLDPLMKRIHVSMKEE
>Cag_1363 serine/threonine protein kinase
MRKRLFIKKQKFDKWELKRFLGGGGNGEVWECCDEEGNKGAIKLLKHVKS
KSYARFCDETKIMEQNFDIEGIIPILDKFLPEKLDGSIPYYVMPMAESAE
KVFKAKNIVSKIDSIIEICKTLAKLHERGIAHRDIKPPNLLVFNSRLALA
DFGLVDYPDKKDISLQNEEIGAKWTMAPEMRRESSKADSLKSDVYSLAKT
IWIILTENPKGFDGQYSIDSIIELKRFYNKTYTTPIDNLLTKCTDNDPNQ
RPTVNEIILELENWKVLNKDFHERNQEQWFEIQTKLFPMTFPKRVIWENI
EDIVKILKVVCTYDNLNHMLYPNGGGMDLEDVRLSHEKSCIELDCQLINI
VKPKRLLFESFGYTAEWNYFRLELYELEPSGAYENDEYYENIQYYEEYDG
YVSPENTMQLLRWFRGSFVIFNKRSVYNRISSTYDGRHNKMNTEEFRDYI
QEMVSHTIEMNKKKSAMATIESKRRKTR
>Cag_0613 SapC-related protein
MYKKIVPLQTALHSNLKLKADNTFAFASGIHVAAIMANEFYRAASTYPIV
FIEDAGKDGFMPVVLLGLTENENLFVNAEGRWEATYVPAILRRYPFALAK
TGEEGQYTVCIDEESGFFTEEEGQALFNEDGQPGEVLEKVKTYLAQLQQM
EQLTRQFCEVLKEHNLLAPLNMRVQKDNAVQNITGAYAVNDERLSHLSDE
TFLELRKKNFLPLIYSHLVSVTQIERLVQLRDKTTVIENNHTQE
>Cag_1161 Tyrosyl-tRNA synthetase, class Ib
MNFPSIQEQLDLITRNTVEIISVDELEQKLVSSQKSGKPLNIKLGADPSR
PDLHLGHSVVLRKLRDFQDLGHHAILIIGNFTAMIGDPSGKSKTRPQLSA
EEAAENGRSYFEQAIKILDAANTTLCSNADWLGAMTFADVIRLSSHYTVA
RMLERNDFEQRYRAQEPISIHEFLYPLAQGMDSVHLKNDVELGGTDQKFN
LLVGRDLQREYGIAPQVCITMPLLVGTDGKDKMSKSLDNAISFTDTPNDM
YGRALSIPDTLIETYSRLLLSQFGAALNEILSQIETNPRLAKRAMARHIV
ADYYSPEAASAAEEHFDKLFVHKQAPDNLELLELEATSMPIIDLLTLLGA
APSKSEARRMIQAKSVSIDDEKIEDFNAVIALEGEAKIVRAGKRKFFKIR
SK
>Cag_1886 von Willebrand factor, type A
MSEWLASLPTLTFAAPWWLVLLPLVAIVLWLKERWQRQVAAISFPDVQRF
ERAKLVAPRWMVRMPQWFRWAALAVGMLLLAEPHLTLRSTTAAARGIDMV
LAIDISESMMQSQTDTQSRFEIARQAARNVVEQRSNDRIGLVVFRGEAYT
LSPLTRDHTVLSLLLDNLSSRIIQDDGTAIGSALLVALNRLQASESELQM
VILLTDGENNAGEVSPLTAAALAARRGVRFYVLNVAFESVKDENAPRSAL
YAAELQEVARRTGGSYFTVNNKTELETTIASIAARAKNGQGNMVVVQHNA
VTQPLLLLLLSLLGLELLVSATRLLKIPS
>Cag_0951 conserved hypothetical protein
MPQSQQLTNEQQAALQQALIYRFLGLVFAYPNDAFLPTLQNALQKISDNA
ARFQPLLDAFAAEPQEQLQAEYTRLFLNGYPNTPCPPYESVYREERMMGE
SSLAVQKLYQQWEIAIDANLSDHLATELEFLAFLSAATTLTEVATDALAT
REYFLEEHVRQWLPQFCRDLQKEATVEAYRLLSQLLANVLLH
>Cag_0266 Protein of unknown function DUF132
MSKYQIVVDTNVFVTALRSQYGASYKLFSLIDKDIYQLNISVPLVLEYEA
VAKRMIDKILLNEEEVDNILNFVIQNSNRWEIYYLWRPQLKDPCDDMVLE
LAITAACNYIVTYNINDFKGIEGFGIEAITPKAFLKLIGEL
>Cag_1387 ATP-dependent endonuclease of the OLD family-like
MNSILYGVGNKFIQTNTFERNDLHNLDYTNQIRIRIELQGSDFTCPQYWD
RQSNSYRTTKSITGTYEITTEIDDSELKSGMQPSMFGMNKHYNIFYINFH
NIKDEIKTQRTSWGNLTSFLAKHIKSIVDTDTSMAAKKEDYENEVELATD
KVLKNSQLSAFIDKIKENYSTNLRNNSCEVKFGLPDYEDIFLQMIFKVGL
NGDNANLIPIDHFGDGYISMFVMAVIQAIAESNTDDRCLFLFEEPESFLH
ENHQEYFYKTVLCNLAEKGHQVIYTTHSDRMVDIFNTKSIIRIELEEQDK
QTVVKYNNVGEFSPTMPTNSNGQEIISFANFNSYIKSVEPNLNKILFSRK
VVLVEGPNDILAYKIAIEREVEKAHGDKKYAETYLSFLNIAFVVHHGKAT
AYLLIELCKHFGLDYFVINDWDFETDFVTDLANFQDENTLKQDNLYLKDG
ADDRSSNSKAMITTNWKLLNNSGIDKIHFNIPKLERVLGYQSDDKDSLGI
LNTVQKLIYYTETFLPTKLKEFLELDKLTNLTENVVETANSEVDTDELPF
>Cag_1738 hypothetical protein
MIDKNELLKRISAIEQSEESVISIYSSHIQHVLRYSNINKESQAKIIEML
KQLDSDLEEHKIVTKQLVDAIAKSEKSIF
>Cag_0738 VCBS
MNYYTSTDLLLNSVSVVVTERREVAFVDTSVFNWQTLVADMRLGVEVVLL
DASQNALEQMASWSLTHSGYDSLHVLSHGSAGALQLGTVRLSADTLPDYS
AVLVQIGSALTADGDILLYGCNVAAGDAGQHFIAALAEVTGADVAASDDL
TGAADKGGDWELEANVGEVESESVSNDVFFDVLATITGDVTAPTIDGLSS
TPADNATGIAVDANIVIDFSENVAFGTSGTITVRNVTDNTTAGTFTINAN
HTATGNSGFGTATISGDKLTINPTNNLTAGKQYSVQFTAGSIVDTATSPN
SLAAISNDTTYNFATNIEEPATGALGGGEGIYIFAQSFTANKDGVINSIA
VAADFESNSQASTTASSTLKIYAGEGTSGTPFYTQVVGVIPDTVTTIDND
TAGGFNHVLTLTTFTLTTPVSVTNGGKYTFEFTPAGANALLYDNIGDYAG
GDLYLDGSQQSGLDLVFKVVLGDAAADTTAPTASVTAATVNNTTSVTTVQ
STETGTVYLVKQGSAVTNKASLDTLASGNNANSATVTTASSNTTISTSGL
TDGTYYVYAVDAAGNVSVASTNAVTLDSTAPTFDVAPATDDVTTTTLDLS
ASINEAGTIYYIVVQNNATAPTSAQVKAGVNYTGGTVVADGSQAVSSGDF
SHTFDNVAGLSEGTAYDVYVVGEDSANNLMTSPTKLDVTTSSFTAAVDAN
GVLSFGGTATGNILLSVTNAGVITATRGGQPFTAPFTLSNIHSVNIPTGT
TVSVSGSYDISTFNQDPAPANTTLLNFGQVFRKVTMGSSGTPVVYYLPAT
VDDLGQTDVFVLKNGTDAYKNSNLKTGFAFYTSDSINAITSNDTKDITAS
SIDAEGALTIYGMTSGEGRDLLNIKSFTEADFIDALKGNVGSVEVKISDT
AYKNVFANKSFVTVKITVDLGSGNSKVIELVDLTATDVVVAESTTVAQAL
YAALGNHSLTTSYQTVLDDNISDVMRLMADSLDFNDAPGITVTQTSGTTV
VTEGGATDTYSIVLDSAPTGNVVVTLDDTNQQIDLDKTTLTFTSSNWNTP
QIVTVSADNDTVGEGKHYGVIKHTVTSADTNYSNKTIGDIRVTITDNDLA
TGTPTFTSQASNFGISMGSYASPTLVDIDADGDLDAFVGNWDGNVLYFKN
QGEDVSHPQFVTTASNPLQGVNVGQSAQPTFADIDGDGDLDAFVGSYENG
ILFFRNTGNASMAAFAVSVDATQFGLTNVGAYVAPTFADIDGDGDFDAFV
GNKDGNTLFFRNTGNVTSAAFVSSANFGITDIGSYAAPTFADIDGDGDLD
VFVGSYKSGILFFRNTGNATSAAFVSSANFGITNVGSYSAPTFADIDSDG
DLDAFVGYSGGNTLFFLNAPTVTLSANPSSVAEAAGTSVITATLSAAATT
ATTVTIGRKSNSTAALNDDFTLSAATITIAAGSTTGTATLTAVQDVVDDD
SETAIIEITAVSGGATESGTQSVTVTITDDDAPAADTTAPTIDGVNSTPA
DNATSVAVDANIVIDFSENVAFGTSGTITVHNITDDNTLGTFTVSNGSAT
STIGTATISADKLTLNPTNNLLAGKQYSVLFTAGSIVDTAATPNSLAAIT
DDDTYNFTVAPAANPGITVTQTSGTTAVTEGGATDTYSIVLDSAPTANVV
VTLDDTNQQIDLDKATLTFTSSNWDTPQIVTVSADNDTVGEGKHYGVIKH
TVTSADANYSTKTIGDIRVTITDNDLSTATAVFTSQASNFGISDIGSAAS
PTLVDIDSDGDLDAFVGNFDGFTRFFRNTANATSAAFVSSGNFGITDVGF
YASPRLVDIDGDGDLDVFVGNTDGNTLFFRNTGNATSAAFVSASNFGITD
VGSSASPRLVDIDNDGDLDVFVGNSDGNTLFFRNTGNATSAAFVSASNFG
ITDVGGYAAPTFADIDGDGDLDAFVGNYEGNTLFFRNTGNATSAAFVSSG
NFGITDVGYDASPTLADIDADGDLDVFVGNYDGVTLFFLNTPAPAGDTTA
PTFDVAPATNNVATTSFDLSASIDEAGTIYYVVVADGATAPTAAQVFDPT
TYTGEIASSSSAVATTPFTSSFSAVTGLTASTAYDVYFVAADDESTPNKM
ATATKVDVATSAAPTVTLSATPSSVAEAAGTSVITATLSAIATTDTTVTI
GGSADSTATLTDDFTLSSTTITIAAGETTGTATLTAVQDVVDDDSETAII
ELTAVSGGDGATESGTQSVTVTITDDDTVPTLSVTDNTTLFTVGGAAVAL
APDAAIENPDNLAITEARIFITGALATDLLSFTDIDDITGSYDANVGMMV
LSGTGTTEEWQAAIRSVTYSSSDESPDMDPRDVTISVKTGVLGSEILANG
ASSGTPATAPVSLNQDEASIAKLTGGGFVVVWYAPDSEGNGVYGQLFNAD
GEHVGSEFLINQKEITDDSSDQDFPVVTGLTNGNFVVAWNSDEQDTPEEE
DYDREVVARIFNASGTAVTDEFTVNTWKGNNSGSDNQWEPAITALANGKF
VIVWESDEQDGSTEDNIYGQVFNADGTKSGSEFLVNTTTADEQDTPEITA
LSDGGFVVVWQSITVDGDYYQICGQRYNADGTTNGNEFVVIDTTTDDVAV
EVPFVSSLPAGGFIVAWTQGDPLVDSEVYFKRYANDGTPANAIQVGADLQ
GAQSEVSISYLNNGGFVIVIESTDNGVSGGVFAQPYNANNEPVGTVFQIN
TTTQYDQDSPVVAPTTEGGFIIAWEGEEQSGDGVDSDNDVYIQRFGSSIS
VSTATVALSILAPGITVTQSSGTAVTEGGATDTYSVALTMQPTADVTVTL
DDTNGQVSFDQESLTFTSSNWNIPQIVTVTADNDTVGEGTHYGVIKHTLE
SSDAAYDGIEGDNVRIIITDDDLPLNTDPTFTQQISNFGISDVGDYASPT
FADIDGDGDLDAFFGNEYGDVLFFKNEGEDVSHPLFVTTASNPFEGIDVG
IYASPTFADIDGDGDLDAFVGGNYYDNDSSESVSKVLYFRNTGSAESPTF
AAAEDAATLGLSNVGYRAKPTFLDIDGDGDLDAFIGKNDGVTAFFRNTGS
AESPTFAAAEDAATLGLSDVGYHASPTFADIDGDGDLDAFVGNQDGVTAF
FRNTGSAESPTFAAAEDAVTLGLSEVNKYASPTFADIDGDGDFDAFVGEK
YGEVLFFLNAPLASPGISITQSNDTTAVTEGGVTDTFEVVLTSEPSADVV
VTLDFDDDQLSLDHTTVTFTSSNWDEPQTVTVTAIDDTDDEGVDISPIQI
TVSSSDGDYSGISVTDLDITITDNVTFVPPQTNPFGLSDVGISAKPTFVD
IDHDGDLDAFVGNRDGKTLFFENEGNATSAAFAASVNASTIGISDVGNFA
ALTFGDIDGDGDLDLIVGNDDGTLSYFKNVGEDINPSFSMVTFSSPFATI
DVGSGSAPTLVDIDADGDLDLFVSDLYGKTFFYENVGEDASHPQFTSSVN
ASTFGIKDVGSCATPIFADLDGDGDYDAFIGKSNGSTVYFENVGDATEAY
FVTAGTNPFSLRNVGYSAAPTFADIDGDGDLDAFVGNHDGNVLYFESHDG
DVTPPVFDANFTVSGVTTSSFKLSASIDEAGTIYYVVVAHGSVEPSAEQV
QEGKNVFGIPVHLSNSADAATGEFTTEFTLSGLTAGTTYDVYVVAEDVAG
NLMEAATLVEVTVPTTTASQYLPEFEFSTVNPFGLTNSGYYAAPIFADIN
GDGDLDAFVGDVFGNVHFFKNTGSATSAAFTEVSTNFFGLQNTHHAVPTF
GDVDGDGDLDAIVGDSDGNQLYFENVGTTSSASYTAPITNPFNFSDVGYY
AAADLADIDGDGDLDLFVGTYDGDLLFFKNVGTAVCEDATPCENPCENET
PTFCENDSNTPLFESALTNPFGLANTGHHVAPTLADIDRDGDLDLLLGNC
AGNLFFYQNTGNASNPEFLFAATQIPAESLCAPILSTTLFGLSNKGTFAK
PTLADLDGDGDVDAFVGTSSGDIYYFENVAPVPAGVTITQTDGSTAVSED
GVTTDTYTVVLDSAPTADVTITLSTSNGQVRFGSAGDDTITLTFTTTDWN
VAQTVTVVANDDDVLEGAHVEFITHTVSSDDECYDGFEVNPLEVSIADDA
TDKSDIRLEFILHSTNPYGLTAVDAHAKPTFVDIDNDGDLDAFVGSMLSE
VTYFRNDGNASSASFVTVSGVLSTNAGWSAAPTFADIDGDGDLDAFVGNY
EGSILFYQNWEQEKYASQPIFVSVETNPFGLTRADNIFSAPAFADIDNDG
DLDLFVGNYKGDMLFFENIGTVSSASFAAPLTNPFGLRNIQASSYPCECN
PQNGAQFATPTFVDVDGDGDLDLFVGNANGDTLFFHNIGGEDSPLFALPS
TNPFDLTNVGGYAAPAIVDIDKDGDLEALIGNADGNIVLFEQNIRPTLTD
VDTLSVATEDTEFTITFASLTASADEADVDGDVVGFVVKEVSSGTLMIGE
DAASATAWNLATNNTIDADHHAYWTPAQDANGSALNAFTVVALDDDDDES
ATPVQVTVAVTAINDAPTFMVGSGVVTTSFGSANDGAYELTIDGNGRIFV
VGYTGAYNDWYNFALACYKSDGTLDNDFGTNGIVTTAINDYDWAESVAIQ
NDGKIVVAGLTWNNDANPDNYDFALIRYNSNGSLDSSFGEDGIVVTAISD
EWDDEIYDVTLQADGKIVVAGSVGNYYEDDWFNFALARYNSNGTLDTSFD
GDGVVTTELWDFEEAFDVTVQADGKILAAGYTWDDVEGAYEFALVRYNNN
GSLDATFGEDGVVASDITDYWDEGRSVTVQADGKIIVAGFIGEDDDWYNF
ALTRYNSNGTLDTTFDDDGIVVTVINDWELAYSVTLQNDGKILVAGKTYN
YDTQSYEFALVRYNNNGSLDTTFDDDGIVTTSINGWDYAYSVTAQNDGKI
LVAGETYNYDTKSYEFALVRYNNDGSLDKRFGIQANTLDESPTYIENGSA
IILDSNVQIFDAELSVLDNGVGNFAGATLTIARNGGADAEDIFSGAGIIS
GEDSGSIIVDTTDIGSYTFAGGELKITFDEDATQELVNEALQSIAYANSR
ESLGEDETDIVTLDWTFNDGNTGDQGFGEDLSGSGSTVVTLVGVNDVPTL
STVDTFTGATEDTKLTITFSDLTEHADEADVDLYGTVDGFVVKAVTSGTL
MIGEDANSATAWNLATNNTIDADHHAYWTPAQDANGSALNAFTVVALDND
DDESATPVQVTVAVTAINDAPTFVVGSGVVTTNIAEIDGSKSFDFSSGII
ALQDGDFLVGGTSMFITIGSALLRYNTDGSLDNAFGNNGIVTTPIPISIQ
SPFLLPNAITTVNDGYIVSGTTYYGSGDSDFVLVRYDVDGDVNTSFGESG
IVTIARTTSNNIFGNGHYGIAVDGDNRILVAGLNFDSVSSTNDILLSRYD
EIGTLDTTFGDNGVVAFNIGAISPFSYTNVVVVEDGYLVIGTAYNGNSGS
DVVLIRYNESGTLDTGFGDNGVVDFGSNNQEWGAIATSVFVDSAHESILI
VGAKGFNESETDFVLARFNTQTGALDTSFGTGGVVTTNIYSYSDAGTSYN
SIDIATSVTIDSQNNILVAGYSLDPVFSIATISIVRYNEQGELDETFGIN
HNGIVTTELNVPAELLYEMLFFFGASLNVTTQLDGKILVSTTNFDLATNN
ADIELLRYNSDGSLDTTFGIPTNTLDESPIYIENGSAIILDSNVQIFDAE
LSVLDNGVGNFAGATLTIARDGGADAEDIFSGAGIISGEDSGSIIVDTTY
IGSYTLSDGELQITFGEDATQELVNDALQSIAYANSRESLGEDETDTVTL
DWTFNDGNFADEQGLGEDLSGSGSTVVTLVGVNDAPTITSGIDDVSFTED
VSASAQDLTEGGTLSFDDVDTNDVIDVKYSVKNGATWSGATASVAMPSGL
AAQLEAGFAISGEDEAAPGSVSWSYGVTDANLDFIAEGEQVTLSYTVTVT
DNHGATATDDVVITINGTNDAPTITSGIDDVSFTEVSGDSSAQDLTEGGT
LSFNDLDTNDVIDVTYSVKSGAAWSGATTGVAMPSGLAAQLEAGFAISGE
DEAAPGSVSWSYGVSDANLDFIAEGEQVTLSYTVTVTDNHGATATDDVVV
TINGTNDAPTITSGIDDVSFTEDVSASAQDLTEGGTLSFNDLDTNDVIDV
TYSVKSGAAWSGATTGFAMPSGLAAQLEAGFAISGEDEAAPGSVSWSYGV
SDANLDFIAEGEQVTLSYTVTVTDNHGATATDDVVVTINGTNDAPTITSG
IDDVSFTEDVSASAQDLTEGGTLSFDDLDTNDVIDVKYSVKNGATWSGAT
TGVAMPSGLAAQLEAGFAISGEDEAAPGSVSWSYDVSDANLDFIAEGEQV
TLSYTVTVTDNHGATATDDVVITINGTNDAPTITTGIADHSFTEVSGDSS
AQDLVTGGTLSFNDLDTNDVIDVTYSVKNAAVWSGATGSVAMPSGLAAQL
GAGFAISATDVAAPGSVSWSYGVTDANLDFIAEGEQVTLSYTVTVTDNHG
ATATDDVVITINGTNDAPTITSGIDDVSFTEDVSASAQDLTEGGTLSFND
LDTNDVIDVKYSVKNGATWSGATASVAMPSGLAAQLEASFAISGEDEAAP
GSVSWSYGVSDANLDFIAEGEQVTLSYTVTVTDNHGATATDDVVITINGT
NDAPTITSGIDDVSFTEDVSASAQDLTEGGTLSFNDLDTNDVIDVKYSVK
SGAAWSGATASVAMPSGLAAQLEAGFAISGEDEAAPGSVSWSYGVSDANL
DFIAEGEQVTLSYTVTVTDNHGATATDDVVITINGTNDAPTITSGIDDVS
FTEDVSASAQDLMEGGTLSFDDLDTNDVIDVKYSVKNGATWSGATTGVAM
PSGLAAQLEAGFAISGEDEAAPGSVSWSYDVSDANLDFIAEGEQVTLSYT
VTVTDNHGATATDDVVITINGTNDAPTITSGIDDVSFTEDVSASAQDLTE
GGTLSFNDLDTNDVIDVKYSVKSGAAWNGGTIDSTLKAALEAGFTISATD
VAAPGSVSWSYGVTDANLDFIAESEQVTLSYTVTVTDNHGATATDDVVIT
INGTNDAPTITSGIDDVSFTEDVSASAQDLMEGGTLSFDDLDTNDVIDVK
YSVKNGATWSGATTGVAMPSGLAAQLEAGFAISGEDEAAPGSVSWSYDVS
DANLDFIAEGEQVTLSYTVTVTDNHGATATDDVVITINGTNDAPTITTGI
ADHSFTEVSGDSSAQDLVTGGTLSFNDLDTNDVIDVTYSVKNAAVWSGAT
GSVAMPSGLAAQLGAGFAISATDVAAPGSVSWSYGVTDANLDFIAEGEQV
TLSYTVTVTDNHGATATDDVVITINGTNDLPSISALDVVGAVTEDSNTVS
DNPQTVGVENGSYLTESGSVMFSEVDDTDILTSTVALQGTPVASSGASVS
AGLGTALSEAVTIAQTGDNDGSIAWSFALDNSLVQYLAKDETVTATYRIT
VTDDSGAENNSQTQDVTVTITGTNDIPTIIIGSTDAVGAVTEDAAATTLS
DSGTITFNDVDLIDVHNASVVASNSNTLGGTLTFGSVTESASTESGSVSW
TYAVANSAVQYLAKNETATESFTVTVSDGQGGSVTETVAVTVTGTNDIPT
ITGTNTGDVTEDSNVTSDDYITTSGKLTISDTDQNQSFFTPHASYQAQYG
TFTLDANGNWTYSANNEQAAIQNLGAGQSLTDSFTAVSKDGSEQQTVTVT
IHGTTNASVSVGDATVNEAVETAIFTIYRLDDTYGDVYVNYATQDGTATA
GSDYVATNGTVHFADGETEKSVTVAITNDSLFEGSESFNVVLSNPVPSAV
TVSKVSGVVTIEDNDTPPTVSVSSVTVGESSPYAVVAVTLSNPTTQAVSF
TPSLHNGVENEQSKAATIGQDTTPIDNTTGVLQYYNGTAWTNVSEAVTIN
AGATSVLLRVGIYNGTLYEGSESFTIATGEITGTVTNNASLAGTVTIIDD
GSSSNGFTPTNTTGQATSKPEFANDDRPTISIHNLTVSEAQSHAIVTVSL
SNASTQAISFTPSLVSGTATVGTDTGVLEYFNVNKVNGAGWDTVNGAITI
DAGKFSVQLRTSLVHDQEFTEGAERFTITTGAITGTVANSTGVTSTVTIT
DVTPLSAPIITDVTEETSGDPTPDDLLTGDTTQVVQLTGEAGCTVTLYKV
GQVEPIKIFAPQQDSLATTYTLDLTDISLSHGDYVVQLSKNDYESKVSNS
FTIDSTPGLFDIIERREVVMLTDTDAVTTGTVAGMDQNRQQAKWDSVNSQ
WIDSDGEIIHFSFGTSSSLNIESTTDGFKLTLVNGSTLQLNTQTGEYTYN
PAEGAVLDKFTIYASDGTYNSSLTLTFDAKDTLDRDGISAVVENKLATLA
NPTSDVLGDLNNDGIADEHQNAVATLAWITSANFEDAKSAGDTGDFSQIK
PESVISLQVVEAAANTADGTTTKETVDATSQLTDVKVLDDTKVEALTGGS
KPIGAEWDPIQFTVESLQSTGLVDIYPDYIFPLRTDKQIRLLIDISRAGQ
VEGSFVGYEKYVSTDTINAAKEWATDTNKVVDQDKLLKDLDGNLITTAGW
YDFMQRSTNPDGTKPDGARFIVDPITKIITAIELILTDNAFGDSDMTEGR
ITDPGVPITLNSVERSTVDPAIVDFYGFITQTSPLQQELKRWYNPITGDY
FYGVDASQVPYNCYESPTTGYGYVLGANNATGIYKVNLYLNSEGDTQLVG
ESRANELGLLANGYRNLGAVFASAPHLDGTNPNPIFAGIDEATNVSVSDD
IIIPFDERITQGAGDVNTNIQLINKTTGLPVAAKISFVGDKLVINPDADL
DANTGYYATIANSAVLDYGGNAYAGTNTVTNEDYDFTTGADPYAGVNDDD
LSTGEILGGVAALGLLAWLVL
>Cag_0347 Enolase
MPIISKVVARQILDSRGNPTVEVDVYTESSFGRAAVPSGASTGVHEAVEL
RDGDASVYLGKGVLKAVENVNTVISDALEGMLVTEQEEIDEMLLALDGTP
NKSNLGANALLGVSLACARAGAEYTGLPLYRYIGGTMANTLPVPMMNVLN
GGAHADNTVDFQEFMIMPAGFTSFSDALRAGAEIFHSLKALLKSKGLSTA
VGDEGGFAPNLRSNEEAIELVIEAIGKAGYKVGSPTDKGGLGDAQVMIAL
DPASSEFYDAAKKRYVFKKSSKQELTSTEMAEYWERWVNTYPIISIEDGM
AEDDWEGWKLLTDKVGSRVQLVGDDLFVTNSIRLADGINRKVANSILIKV
NQIGTLTETLQAINLARLNGYTSVISHRSGETEDSTIAQIAVATNAGQIK
TGSMSRSDRMAKYNELLRIEEELGSQARYPGKAAFRV
>Cag_0900 polyphosphate kinase
MPKEKSVQRGSLDPTLFVNRELSWMYFNQRVLDEAITPGLHPLLERVKFI
AIFTSNLDEYFMIRVAGIEEQYQAGIQERTIDGFSPAEQLEQLRAMVIEQ
LTMRNHCFYQDIVPALKHEGIEFVRVDDLQPSEQKALSHYFRREIFPVLT
PLAFDTGHPFPFMSNLSLNLALELEDEESGLLKFARVKVPSILPRLLRLN
HIKGFTANDGVIRFVWLEDVIQHNLAHLFPDMKIVQSHLFRLIRDADIEI
EEDEAGDLLQTIEQGLLHSSRYGKVVRIDVSPTMPNSMRELLIKNLEIAE
RNVYEIEGALGLGCLMELLKVERPNLKDEPFVPFNRLEHGQQGDIFTVIR
NRDLLLTHPYDSFQPVVDFLHQAALDPNVLSIKQVLYRVGSNSPIVEALM
RAAEEGKQVAVLVELKARFDEENNIVWAKALEDVGVHVVYGLPRLKTHAK
VTMVVRREQHRLKRYLHLGTGNYNPVTSKIYTDYSFFTCNEQLANEVAEL
FNALTGYSRHSGYKKLIVSPINTRKRMLEMIERECDQVSAGNEGRIIMKM
NSLVDSRIIRALYKASSKGVKIDLLIRGICSLKPNIIGISDTIRVISIIG
RFLEHSRAYYFYNGGKEELYLGSADMMPRNLDNRVEVIFPITDKSLVRQV
KGDLELMLTDTTKAWQMLPNGTYKRIANDENNINSQTFFMQRALQKKNSP
KFKLNTL
>Cag_1258 Cysteine synthase K/M/A
MNSPRHTTIKKKRMTMANIAHTLTDLIGNTPLLELGNFTKTNNANATILA
KLEYFNPGGSIKDRIGYAMLVDAEQRGELKSDTTIIEPTSGNTGIGLALT
AAAKGYRLILTMPETMSVERRNLLKALGAELVLTPGSEGMNGAIKKAEEL
HQQNPNSFMPQQFKNPANPAIHRATTAEEIWRDTDGNIDALVCGVGTGGT
ITGVGEVLKARNPHLHVVAVEPADSPVLSGGKAGAHKIQGIGAGFVPDIL
NTSIVDEIITVTNDNAFATSRQLARTEGLIVGISSGAAAWAALQLAQRSA
FNGKRIVVVLPDTGERYLSTPLFQFEE
>Cag_0507 hypothetical protein
MSREVKIHLISESKSLVNSFFKKTIHNLPNNDSLISSGITITSIKTSDLK
FFPQNSALENTVLHFWDLSIDSSIPQSIYPLFMTPNSVYLLLLDNFNQNE
KFWLKLIKTHGKSSPIMILKDKSKGVFSIEEKSLNLEFPLIDNQFINVNF
DDDDDKGIVDFMENFAQLLMFKQNKCIIELQNSWLLIKEAIFKETEQVKF
ISKKKYKQVCYDKGVYNQTDSELLFEYLKNLSIILFFKEIPFADIYIINF
SDNSSNLCWLIDGINRILTSKKINNGCLYWRDLDFMLEDDEEKNIYDTKD
LYYILELLILLNVCYEIDKGCYLFPNKMPSDFVLSLPTNRSQTCFIMQYN
YLPFDIISRLMIMMKKDIIDDQYWVYGILLKSHNFSKLHASINPEAVNDV
TALIIADPDNKQIRITVYGPDRYRRHYFQVIWNHLHDINKKYDDLEVKEL
VPLPDRPDKLINYQDLLGYELHNIKKYPVLQSYRNYLVSDLLDTVIDNKK
VNKKEIVINNIVNNQNSESQVEYEKLQKDVLKLINAISEKISTLPNNEDE
KKLKSILNNTSNDLENIESPTYKKQLRQFIEMLQSNPHITNLVKFIANGP
DKVNAIIDLYNHIM
>Cag_0762 hypothetical protein
MTTDQLPTTQQPDDYDSPWKEAIEHYFPEFMAFYFPNAYTAIDWSTPYHF
LDQELRTIVPQSAQGKRVVDKLVKVQLLDGKERWLYIHIEVQGRREANFP
RRVFICNYRIFDQYGVPVASFVILTDTHYNWRPTSYSYEFAGCKHTLEFP
IVKLLDYEPRMEELLASDNAFGLITAAHLLTQKTKNQSKPRYEAKKLLMQ
LLLQRQWDQERIEELLRVIDWFLRLPKALRKKLKTEIHNMEEAQKMKYIT
SFERDAMEEGIEKGKELGVLEGIEMGKAEGLEEGLMKGRLEVAQRLVAGG
MSKAEAASFAGVSVDLL
>Cag_1100 hypothetical protein
MALIRECQPKTIQEWEEWYFKNATTAGKNNFKITRESLQELGERLYEKIT
EVVIPEWQEAFNALTIEDCYNYIFNLTINRTFDGYLREKSVVNDGLAKEF
PQIRFDESPSELDHAGDIDYLGFVSENKAFGIQIKPVTAQSNFGNYSVSE
RMKASFHSFKEEFGGNVFIVFSLDGEIANTQVIEQIQMEIERLQSEQ
>Cag_1243 hypothetical protein
MANMQLPPNQRDNYDSPWKEAIEHYFPEFMAFYFPNAHKAIDWAKGYHIL
NEELRSLIPDAEISNRVVDKLVQVHLLDGNESWLYIHIEVQSFWEADFPK
RVFIYFYRIFDKYGKAVANFVVLADMNPHWLPTSYNMETIGSKLTLDFSV
VKLLDFEPHMQELLASNNVFGLITAAHLLTQRTHRNYEARLEAKKLLIQL
LLQRQWEQERIEQLIRVIDWFLSLPKELRQKLKTEIYQMEEEQKMKYITS
FERDAKEEGILEGREIGVLEGMENGKLEVAMRLLGIGMSVEQVAELTGVT
ANVLTAKLKS
>Cag_1878 alanine dehydrogenase
MNIVLGIPKERAQDERRVALSPAGVQILGEHGIQVIIESNAGMYCNFTDL
AYSDAGASIATSPEELYEQSNVIVKVSPPTLEELPLFQHGQLLLSALHLG
TITPHLIELLIEKNITALAFEFIETRDGELPIVRTLSEIAGSLAIQTAAK
YLETNYGGSGILLGGIAGVPPAHVTIIGAGTVGLFAAQDALGLGAQVTVL
DKEINRLRRFEAFFNRHLVTAIANEHYIAELAKKSDVVIGALSPKQKVIK
PLVSEQVVQSMKTGSVIIDVSIDQGACFATSRHTSHSNPIYVKHGVTHYC
VPNIPSAVAKTASFALTNTLLPFLLKLSGHETMSAILWKSHSLRRGTYIF
RGYVTQKVVAELTGAPFREIEMLLAAS
>Cag_1948 DsrM protein
MTKKVLMPLLMVFVLCLIPYIGVKYAGLTTLFAVIIPLASLSILLVGFAL
RLVDWLRRPVPFRIPTTCGQEKSFNWVKHDQLDNPHNWWQVVLRVLGEVL
LFRSLFRNKKAELHDGQKLTYGSSKWLWFAGLLFHWSMLIIVLRHTKFFF
VTEPAFATFLDHADRFFEITLPAFFATDALILVGLSILFIRRLWDSKLRF
ISLQTDFFPLFLLLGIVLVGISMRYIAKVNVMPVHDQMIAMMDGNFSVMG
DIDPLFYIHLFLVSVLVAYFPFSKLMHMGAIFLSPTRNLGNTSRVKRHVN
PWNPEMKIRTYAEYEDDYREKMKKAGLPVEKQ
>Cag_0526 conserved hypothetical protein
METTNSSNQKTKEPGFGIWLGITLLWGSVFFWSSVLALQFVTGWMGEGMF
QPAGSGLMRVYGVHVMVLVLFALLAMIFKRMVDPGATRQATRRQEIDAGK
GERIFISLLGSIATSFFFTLLTALTFALAAGAVGVPVALTLPVVFVAGLF
NIVAGLAASLLVGILFIVAKVGKK
>Cag_2027 putative phosphohistidine phosphatase, SixA
MKTLYLIRHAKASSGNNFGGDFERPLHATGIQGARFMGELLKTNGVLPDA
VITSSALRTRSTATILCEILGFPAERIEERMEIYEGGAMRLLSIIQHISE
SCSTAMLIGHNPTITELTSVLSGRAQGGMATGGVAHLQFQVEHWSEVVAG
CGTLVAYQFPQAYQ
>Cag_0438 conserved hypothetical protein
MSIGIKSRFFNNARKMAVESIRNPEKMRRLIASALELTTKAGRNAKLQAL
SNKVQTLIRFVQASISREYNVMPWRSLILSVAALIYFVNAFDAIFDFIPL
LGFVDDAAVLTAVLTSINNDLAKFIEWENSVKPQRTNVVDAEFEEVKEGS
LQ
>Cag_0907 Ketopantoate hydroxymethyltransferase
MPHRDQPKTAHVTTRRLLDMKQQGEKISMLTAYDYTMARILDRAGVDVLL
VGDSASNVFSGHSTTLPITIEEMIYHAKAVVRGVHDETGRAMVVIDMPFM
SYQLSSEEALRNAGKIMKEHECDAVKMEGGHVIVESVKRITDVGIPVMGH
LGLMPQSIYKYGSYRVRAQEEQEAEQLLRDAELLEKAGAFAIVLEKIPSA
LAAQVTASLTIPTIGIGAGVACDGQVLVINDILGLNREFHPRFVRQYVDL
NSMIEQVARQYVTDVRGCDFPSSNESY
>Cag_1067 hypothetical protein
MVKKQKYMPEISRFLGIIISMYFDEHNPPHIHVQYNEYRAAMDIYDFNII
AGSLPAKVRGLVAEWMELHSEELLKMWETKEFHRITPLV
>Cag_1628 C-5 cytosine-specific DNA methylase
MKMQNNISAIDLFCGIGGLTYGLKKSGIQVKAGIDIDESCRYSFEENCGT
KFINKDIQKLQKEELNSIYGNAEIKILVGCAPCQPFSSYTYKKDKNKDKK
WQLLYDFSRLIKETKPAIISMENVPTLLNFKKAPVFYDFIQELTANSYKV
WFNIVYSPDYGIPQKRRRLVLLASKLGDIELLPPTHNPDNYITVKDAIGN
LEAIKSGETSQNDFIHKAAQLSEINLSRIKQSIPGGSWKKDWDDELKLVC
HTKEKGKTYVSVYGRMMWNEPSPTMTTFCTGIGNGRFGHPEQNRAISLRE
AAILQSFPADYKFAENEATLKFGKTSKHIGNAVPPKLGEIIGKSILQHLE
KYNYGKENK
>Cag_0948 hypothetical protein
MKYFIEKASQHDYADILDIMQYWNMHHIPSVEMEELDLSCFFVARISNII
GGAGGYKVLSQKTGKTTLLGIRPEFLGMGIGKSLQEAMLVAMFNAGVKHV
ITNTDRTETILWYKKHYGYYEIGQLKKQCDFSLSDVDSWTTLEMNLEEFI
QKKLQR
>Cag_1439 Coenzyme A biosynthesis protein
MKQKAIYPGTFDPFTNGHLDVLERALTIFEEVIVVIAENSQKRALFSIEE
RKMMTEQIVSNVDGARVEVLAHGLLANYARDVGARAIVRGVRQVKDFEYE
FQMSLLNRQLCPEITTVFLMPNVKYTYVASSIIKEVAMLGGDVSNFVHPI
VLSMINDKRALCER
>Cag_0594 conserved hypothetical protein
MNEFGITDSHLHIIRSIFKQYQAINKVLIYGSRAKGNYSERSDVDLVICD
TTFDRKTIGKILLAINNSDFPYTVDLQIMENIKNKNLQEHIKRVGKEFYT
KM
>Cag_1006 Protein of unknown function DUF83
MYPESDFIAISALQHFAFCPRQCALIHLEQIWSENMYTAEGRELHERVDE
GKTSYKSGVRITRSEPLRNATLGIAGVADVIEWHKQPNGKELPFPVEYKR
GKPKKHNADKIQLCAQALCLEEMLGIHIPSGALFYGETMHRLEVEFTPPL
REQTRGAAEGIHELFERGLTPPPDYSAKCKQCSLLEVCQPNLLAQHNTAR
NYLASLVQTLSAEDA
>Cag_1953 conserved hypothetical protein
MIEQLIQQVEAAIVAAEKWAETGWSATFGVRNNEVSSLKQAEALPRNAVY
RLEAINYWKQVLLAGEDTARFGRKALEALKDNNLAVANDALYFCQFVEKP
FCDFTNTWLPLYEAIQKRAA
>Cag_0246 peptidyl-prolyl cis-trans isomerase, FKBP-type
MATVKQGDTVKVHYAGKLDDGTLFDTSAGREPLQFTVGGGQVIPGFDNAM
IEMAIGDKKEVVIAVEEAYGPHSDELVTAVPRERFPADLELEIGQQLQVG
LENGQQAIVMVVDITDEAVTLDANHPLAGQELTFEIELVEIV
>Cag_0473 sigma-24 (FecI-like)
MTRNNNKEAAFKRLVEQHMEMVVNTCYRFVMNREDAEDIAQEVFIEMHRS
LDSFREESKLSTWIYRIAVTKSLDHLRYMKRKKRFSSLKRLIGMEGEEDP
TENLPDVSQATPHEVMEQAHRAELLQRALDSLPEAQKAALLLSKQDGYGN
QEIAEILGTSISAVESLIHRAKKTLHKKLYDKYSKE
>Cag_0462 conserved hypothetical protein
MQKIQVDILGLSTSPHANGAYALILYEVDGKRKLPIIIGGFEAQAIALKL
ENIKPPRPFTHDLFKSVADVFDLHVSEVIIDELHHETFYAKVVVEMDGEV
HEVDARPSDAIAIAVRFRAPIYVTDDIMEEAGIQEEQTVPRSAAGPVAAV
LSSPTSATAQHLRAEQRKATLKELQAHLEEAINNEAYEEAARLRDEIARL
KP
>Cag_0149 NusG antitermination factor
MAEIITPQNWYAVYVRSRYEKKVYQALLEREVNSFLPLYETIRQWSDRKK
KVSEPLFRGYVFVNIAMQQESIKVLDTEGVVKFIGIGRKPSVIAEREIEW
LKKLVREPEAISRTVASLPAGQKVRVIAGPFKDMEGVVVKEGRESQIVIY
FDSIMQGVEVSIYPDLLQPIGKERQPLPTHSSEDELVSVEKHLLRVS
>Cag_0791 Cell wall-associated hydrolases (invasion-associated proteins)-like
MSQKTICPSTHWMGCPYSLWQRFSRAVAALATLSTLSCIAPSSIVLANPI
APPQSTAVDVVAENPSTPIALDPPTPSKLEQLMGNMGNYFGIRYRFGGQT
PAGFDCSGFVRYMFEKVYNIKLPHSSREMSSLGDRISREELKPGDLVFFH
SGKNRINHVGIYIGNDAFIHSSLSKGITEDKLQHRYYDKRYAGAVRILPD
ITLPFSTPRQEDAQLEIIKPS
>Cag_1787 carbamoyl-phosphate synthase, large subunit
MSTITPDQTVLTLAKKLSAEQLFAAKQNGFSDLQLATIFKTSDTVIRELR
RHYGIASVFKTVDTCAAEFDAKTPYHYSTYEEENESVCSDRKKVIILGGG
PNRIGQGIEFDYCCVQAVFALREAGYETIMVNCNPETVSTDYDIADKLYF
EPLTFEDTIRIIEHEKPLGVIVSFGGQTPLKLSTRLHEAGVKILGTSSKG
IDLAEDRKKFGALLVELGIPHPAYGTAISLEEAKAITQRIGYPALVRPSY
VLGGRAMKIVYNDDSLKEYIDQALFISEKYPLLIDRFLETAVEFDIDALA
DSTDCVISGIMQHVEAAGIHSGDSTSILPYHNISKQAIAAMKEYTRMLAK
SLNVIGLMNVQYAVQNDTVYVIEVNPRASRTVPFVGKATAIPVVKIATRV
MLGEKLCDLRNEYNLKDCDELGMKHMAIKEPVFPFSKFVKSGVYLGPEMR
STGEAMSLANDFPEAFAKAYQAANMQLPLSGAVFISVNDQDKNHRMLAIA
RSLYDMDFDLVATAGTWQFLTDNGIECKKVYKVGEEGRPNIFDSIKHGKV
DFVINTPRGEKALHDEEAIGAASVLSNVPFVTTIEAAEASVQAIGCIRHQ
EFGVKSLQEYAAYRDTATATC
>Cag_1549 N-(5'-phosphoribosyl)-anthranilate isomerase
MTHIQPKIKICGITRLEDALAATFAGADALGFNFSHTSARYIAPNNAAAI
IKQLPPFVQTVGIFVEQSPSEINAIAQTCNLHYAQLHNDLYGVKEALAIT
TLPVIKVFRPNENFDVQEVKAFIGESHVTTYLFDAYRPDAHGGTGERIEA
TLAERIFQAMGNECYAILAGGLTPNNVAEAIRRIRPYGVDTASGVEKAPG
IKDVAKMRAFVTAAQNA
>Cag_0162 Beta-hydroxyacyl-(acyl-carrier-protein) dehydratase FabZ
MLIHQRTIKSEVSLCGTGLHTGEQCTITFKPAPVNYGYRFIRNDVENSPE
IPALIDNVTDVLRGTTIGVNGVKVHTTEHVLAALYGLQIDNCRIEMSGPE
PPVMDGSSHPFAEVLLQAGFVEQNEPKNYLVIDETIEYHDSKNSVDIVAL
PLDGFRATVMVDYKNPALGSQHSGLFDLDKEFFAEFAPCRTFCFLSEVEA
LANQGIIKGGDVDNALVIVDKRMEKIELTDLGKKLGINGENLSLGSNGIL
NNRQLRFNNEPARHKLLDMLGDIALLGMPIKAQLLAARPGHASNVEFVKQ
LKKYADRNQLARKYQHEKKAGVIFDINAIQNILPHRYPFLLIDKITEFKL
DEKIVSIKNVTMNEPFFQGHFPGNPIMPGVLILEAMAQTGGIMMLNGNDN
IKESVVYFMGIDKARFRKPVLPGDTLVIEAVMTNKRRTVCQFDAKAYVQG
ELVCEASLMATVVAKNK
>Cag_0424 drug:proton antiporter
MALASSPIVSFTFLRLLSFRFLIVLSYQMLAIVAGWHIYELTHNALALGF
IGLAEVIPYFASALFAGHAVDHYSRRLFGVMASVMVMASALMLTAVSAGM
VVGNPVWWLYGAIAFNGLARAFISPSYSAMFALVLPREAYAKASGIGSSV
FQLGLVTGPALGGLLAGWFGNTAGYAVAAVLAFGAALALFSVRVKEPPSA
ESMPIFASIASGIRFVFGNQIILGAQSLDMFAVLFGGAVALLPAFIKDVF
HFGPEAFGLLRAMPAIGAVITGLYLARHPLNHHAGRWLLGAVAGFGVCII
GFALSTTIWMAGLLLLLSGICDGVSVVMRTAIMQLLTPDDMRGRVAAING
IFIGSSNELGAFESGVAAHVMGLVPSVIFGGFMTLGVVAVTAKLAPKLRR
LDLQQLY
>Cag_1604 conserved hypothetical protein
MALRKNIGESYTPTYFLASLGNGGLAVTFFMFLMFMIPHKGRPMPVFEDI
VAALQSTLPIQFLTIVSLVGIIWFSAQHYRMLIWNIRQYLAFKHTPAFNR
FQTTDAQVQLMAIPLTYAMAINVMFILGAVFVPQLWSVVEYLFPMAMGAF
FIVGIYSISIFYTFFSRVIAHGGFDCEKNNSLSQMLSIFTFSMVAVGFAA
PGAMSHNVIVSGVGIIMATFFLALVTTLGVIKIVLGFRSMLAHGINYEAS
VSLWIVIPILTLVGITIYRIAMGLVHNFDAVIHPWAHVIMFTALCGIQIF
FGLLGYGVMKELGYFNEFIHGESKSAVSFAAICPGVAFVVLGNFFINRGL
VAAGLIEMFSVAYFVLYIPLLAIQAQTIIVLMRLTRKLLKA
>Cag_0965 Dephospho-CoA kinase
MESKLPLLVGVTGGLGSGKSMVCRYLASMGCALFEADVVAKELQVRDSKV
IEGITALFGKEVYSYNPKGELQLNRKDIAQVVFSNQEKLGALNRLIHPRV
AVAFQQACDNAARSNVAILVKEAAILFESGAHAGLDVVVVVQAATELRVE
RAVQKGLGTREEILRRLAVQWAPEKLAALADVVIDNNGTPEALYEKTKQL
YEQLLQQAMLRR
>Cag_1159 L-aspartate oxidase
MSEAITTDVLVIGSGIGGLYFALHAAEYASVIIITKKESFTSNTNWAQGG
IAATIDSTDSPDLHIADTLDAGAGLCNREMVSLMVHEGPKHISHLQELGV
NFTTADQQHLHLGKEGGHSRHRIVHAQDLTGREVEMALLERINHHPNIIL
LEHHYAIELLTEHHLGIKTNDIKCYGAYALDSVNNKTKKILARITMLATG
GLGQIYPYTTNPDIATGDGVAMGYRAGAEIANMEFIQFHPTALYHPACST
FLISEAVRGFGGILKLKDGTEFMHKYDPRQNLAPRDIVARAIDSEIKKSG
EQCVYLDVSHLEAQATKEHFPNIYETCLGYGIDMTREMIPVVPAAHYACG
GIKTDHQGRTTIERLYACGEASCTGVHGANRLASNSLLEALVFAWRAAED
IRTTLSNYHNGTHFPEWDDSGTVNPEEWILVSHNKREAQQVMNDYVGIVR
SDLRLQRARRRIDFLKEETEAYFKKTKVTTQILELRNIIKVASLVIEGAL
KRRESRGLHYTTDYPQKDNRHFLNDTLLRSF
>Cag_1285 hypothetical protein
MVFEISFHLFIISKLMKLTNYQNNQISIAMKKLFAFLFLLSSVSFVGCAK
KAEEAPVEEPAAVEAPAAPAAEAPAAEAPAAPAAEAPAAEAPAAPAK
>Cag_1778 DNA polymerase III, delta subunit, putative
MSWSSIIGQQQQLRVLQHALETGRFAHAYLFMGAEGCGKEAVAFEIAALL
NCRNASASPQVGACNTCPDCEKVHALNHPNVEYIFPVEAVLLEGGGDLAK
KENKRFTEAKERYDALIERKKENPYFAPAMERSMGILTEQILSLQQKALF
MPSVGSKKIFIISQAERLHPSAANKLLKLLEEPPEHVLFILISSRPEALL
PTIRSRCQAVKFSRITTMQLREWLAQHRPDIVEPERSFVVNFSRGNLRLA
WDLLSNRSSDMAEAPALQLRNQALDYLRYVLTPNRFHEAIVACEQYAKSL
SRRELTLFLAALLLFFQDACHRRINPSVADLNNPDLSDNVNRFAKNFPNT
NYFALSQAIEDAISSLERNVAPLLVMATLTTELRQQLQRRG
>Cag_0730 Argininosuccinate lyase
MSSKNKELLWQSRFSLPFDRDALRFSSSVHVDKALYREDIQGSIAHVTML
AEQQIVSHEEAQAIIAGLQEIELELRDGNIVPHWEDEDIHTVIENRLKEK
IGAVAGKLHSGRSRNDQVATDTRLYLRRNIGSLQAALSELLTVLLQKAEH
YCDTIIFGYTHLQRAQPISAGHYYMAYFTMFLRDRDRLQDLLKRVNISPL
GAAAFAGSTLPLNPARSAELLGFEGIFSNSIDAVSDRDILIEFLSDCAMV
MMHLSRFAEDVILWSSYEFGYLEISDAFATGSSLMPQKKNADIAELVRGK
AGRVYGNLMAMLTIMKGLPLSYNRDMQEDKQPLFDSTETAINSVTIFSKM
LENTRLKEERLAKLTSEDLSLATEIAEYLVQKHIPFRDAHRITGKIVSWS
IESGVALPHIPLEQFKSFAAEFDEGIYACLKADASVRAKKTHGSCSFESV
KQQINAAKERI
>Cag_0582 hypothetical protein
MKSGSVYQVHDRVRFRVRERKPEAVVMRDGRYFKLIIDGFDEPLICVQIV
EPGRRSSSGATTSNVIHSYIDGDFEGWEGETIFKLDNGQIWQQSSYAYMY
HYAYHPEVMIINDGGTWKMKVEDVDEMIEVTRLK
>Cag_0018 prephenate dehydrogenase
MEPSSVSTIAIVGLGLIGMSLVRAFHHSPFMQEQQVRLIGYDPHFSESDC
QCALELGLHSFESNPETLYRAEIVILAAPVEVNIALLESVRNCVASHTLV
TDVSSTKRDIALRAKQLQLPFVGMHPMAGKEEKGYQASHEELLHGKRMIF
CDDDNLLATPQGEFLQQAIASIGCTTLFMTSEEHDAVVARVSHLPQLLST
LLMEHCGDAMQASGPGFATLTRLSGSSWEIWHDIVATNQMNIATELTRFS
SKLLELSQEIQMGNFEQVAERFNHANQLYQALKAMNQP
>Cag_0700 thiolredoxin peroxidase
MSVLVGRKAPEFDVAAVVNGSQFVDSCKLSDFKGKYVVLFFYPLDFTFVC
PTELHAFQEKIEEFKKRNVEVLGCSIDSKFSHFAWLRTPRSQGGIEGVTY
TLLSDINKTVAADYDVLLEDEGVALRGLFLIDRDGVVQHQVINNLSLGRN
VDEVLRLIDALQFTEEFGEVCPANWNKGDKAMKPTQGGLEEFYKEG
>Cag_1815 Light-independent protochlorophyllide reductase, N subunit
MMQGIGGDIQDSQLIREDNVTHSFCGLACVGWMYQKIKDSFFLILGTHTC
AHFLQNALGMMIFAKPRFGIALIEEGDLAKNEPTLQEVVSEIKADHHPSV
IFLLSSCTPEVMKVDFKGLADHLSTPETPVLFVPASGLVYNFTQAEDSIL
SALVPYCPQAPAGEKKVVFLGSVNDSTAADIRADAEELGIPVGGFLPESR
FDKMPAIGPDTVLAPIQPYLSRVALKLSRERGSTVLHSLFPYGPDGSRVF
WEDLAREFGITVDLREREAAAWEKIRKQTDKLKGKKVFLTADTMLELPLA
RFLRHAGADIMECSSAYIIKKFHAKELEALQGVRVVEQPNFHRQLEDIRR
ERPDLIVTSLMTANPFVGNGFVVKWSMEFMLMPIHSWSGVVSLANLFVAP
LQRRSMLPAFDESVWLEGAMPSAEVHA
>Cag_0344 photosystem P840 reaction center iron-sulfur protein
MADPVEKTATPAAPAAKASPAAPKAAAPKAGAPAAKEGKPKPKESPKDKS
GKPRFEPLNINLGRTGLPQESALPIKKAAPKPAPAAAGAKPAPGAKPVPG
AKPAAPAAKAAPASPSAAPAGGQVVKKAAPKPKAHYFIIENLCVGCGLCL
DKCPPKVNAIGYKFYGDVQEGGFRCYIDQDACISCSACFSTDECPSGALI
EVQPDGEVLDFTYTPPDRLDFDLRFLHRFHREVR
>Cag_1076 conserved hypothetical protein
MSTITRPMTTMILKQRFPFRFGTSSYIIPADIIPNVEYLKDKVDDIELVL
FESDEFSNLPSAEDIQTLKQLAEEWALTYCVHLPLDVYLGHTDRAERERS
VGKCLRIVELTRTLPTSGYVVHFEAGNGVDINGFNDADQQQFTDSLRDSL
AMLLAGANVPAAHFCVENLNYPYELVWAIVQEFGLSVTLDVGHLEYYGFP
TADYLKRYLSKAKVLHVHGTVDGKDHNSLCYMKPATLAILMQALAASPNP
QRVFTMEIFSEEDFLSSCKVMEGYVFLPPT
>Cag_2009 conserved hypothetical protein
MKQRKEPKPFGLLISGVYRSLGLEEPYQQFKALQVWREVVGEAIAEVTTL
ERFTAGQLYIKVNNAAWRLELNFRKRDIIQRLNKELGSPLVQEIIFR
>Cag_0501 periplasmic phosphate binding protein
MRITQFWKHATMALAFVGLASGSLEAREQIRIVGSSTVFPFASYVAEEFG
KTTGNQTPVIESTGSGGGHKLFGESDAITTPDITNSSRRMKKAEFDRAKQ
NGIQAIHEVVIGYDGIVIANAKKATTLQLTRRDLFFALAEEVPMKGQLVK
NPYTKWSQIRKGLPNQKILVYGPPTSSGTRDAFDEMVMEASSKSITEYGA
LAGKYKKIRQDGVFVPSGENDNLIVQRIVKDKAAVGVFGYSFLEENADRI
KGATIDGVAPVPANITSGKYPVSRDLYFYVKGSHIAQVKGLKEYVDLFVG
EKMIGDYGYLKKIGLIPLPKKEREAIRANWNARKMLTGTSLD
>Cag_0604 conserved hypothetical protein
MLAKIKLTPKQLLKPLLQIVVTAIALYVVFQKIDIAQLTTLIRTAHPLYL
LCALLFFNLSKVINAFRLNRMFKAIGIELSTTYSLKLYYLGMFYNLFLPG
GVGGDGYKVYILQKNYGIRMLNVFHAVLWDRIGGIFALTVLALALLLPSN
FATLYPMLIPWAWGGLLLLYPIAWLVNKLFYKQFLHLFAVTSFDSMLVQV
TQTIAAWFILEAMNLPAHHIDYLAIFLLSSVATILPITVGGAGAREITFL
YLLNVVELNTNPGVALSLIFFVISAISSLAGILSRIKHEKADVVQ
>Cag_0816 Thiamine monophosphate synthase
MINPILPPLRHIPTSPFLCALTDATLPPLPFVKAVLAGGATMIQLRVKEA
TTRELIAWGVEIRTLCHAAGACFIVNDRVDVALACGADGVHLGQQDMPCS
LARKLFGKDMLIGVSASTVAEALQAEQDGADYIGFGHIYPTNSKEKPHSP
VGLSTLQQVAKAISLPIVAIGGISGENASAVIAAGASGIAVIAALSKVEN
PTVAATKLCAAMQGVAL
>Cag_1234 universal stress protein family
MFKIHVILCPIDFSDASRKAVQYAREFALSMNAKVQLLAIVEPHPVSVDM
NLNYIPVEQDIEQAILRDTEAIAEDLRAANVQVTCSVELGTPADVILEYI
QEMDVNMVIMGSHGKTGLSRLLMGSVAESVMRKAQCPVLIVKAEEREFIG
EE
>Cag_0551 conserved hypothetical protein
MALHGEEQRPTSPTRLPKRHPARRGYHLDMTPMVDVAFLLLTFFMLATTF
TPFYSMELSQPQKQRRLAVQAEEVLTMQVANNGIVHYRLGSNALSSMPLY
QETATTQEPSATPMLNPALHHFLRSLKAEQPNITVVLSMNNQARYRDLVA
VVDLLNSLQITRFSLGEIDGEEKQKAGVK
>Cag_0197 conserved hypothetical protein
MTHPPFHRLIAFLLFLGAEVLYLATMAPTFSFWDCGEFIATAVTLGIPHP
PGAPLYLLLGRLFAMIPFVSDIGARVNVISTLASSATVMLTYLITVRFIT
LYRKHPINEWSRSEQIAAYGGAAVGALALAFSDSFWFNAVEAEVYALSSL
FTALVVWLMLRWHEEAPKAGNERWLLLVMYIIGLSIGVHLLSLLAVFAVA
LAYYFKKHQVTLVSFGWLVVVSLALFFLIYVVIIKGLPVLFQVASWWGLL
AGLLLLCAAIWYSQRHRKAMMNTLLMSLLLLVIGYTSYGMIYVRAQANPP
INENNPSTTESFYAYLNRDQYGDMPLFPRRWSPEPIHQYFYEQYSSDFDY
FTRYQMQKMYLRYFGWQFIGREHDMEGAGVDWSVLWGLPFLVGLVGAISH
FRRDWQMGVVVTALFVLTGAALVIYLNQTEPQPRERDYSYAGSFFAFALW
IGIGVESLWQWLAGRMKASSEKLPVVALSVVGLALLLVDGRMLMANYRTH
DRSGNYVSWDWAWNMLQSCERNAILFTNGDNDTFPLWYLQEVEGIRRDVR
VVNLSLANTGWYLEQLKNSSPRGAKPVNFSMSDGELATISYMPIDSVDAI
LPSSTARRSLLRDTWRSGNNLPSAPLDTMVWPLKPGLTYDGQGYLRPQDL
AVYDIVMSNFEDRPIYFALTVDPESMIGLDTFLRLDGLVCKVVPVKSSDP
MSYTDPLILYQRLMQVYRYRNLANKHVYLEETSLRLSSNYTPLFVRLALE
LATQPEETLAVTDGNGVPRLVRRGALALQVLDSSERFMPLSRYPVNPELA
ASIIALYVQLGEKQKSSPYISYLEALSHTNSPLMEPRLFLILARAYYSLG
REAEAKAIVQQLARELDQPELLKTFETTKK
>Cag_0330 aldehyde dehydrogenase family protein
MITTINPATEERLATYPTMNAEELAGVLEATQRAAAVWRKLSFEERTAPM
RKVATLMREQKERHATLMSLEMGKPFSQALVEVEKSAWVCDFYADHAANY
LTAEEHDLGDGVRGMVKFEPLGVIFGVMPWNFPFWQVFRFAAPTLMAGNG
VVVKHSPNVTGSAIAIEELFREAGFPTNLYRTVHIALDEVDALSGFIIEH
PAIQGISITGSTGAGRAVAAKAGKAIKPSVLELGGSDPYLVLDDADINRA
VNLCAAARLLNSGQTCIAAKRFLVHHSVMAQFRELFLQRLQNAVMGDPFD
QTVEIGPMARLDLRDQLHDQVMRSIAAGAELLCGGVIPDRAGFFYPPTLL
AGVTKDMPAYSEEFFGPVVTLIEVADDAEAIHIANDTSFGLGAAVFSQNI
ERALRIADQLETGNCFINSGVKSDPRMPFGGIKESGYGRELAAYGIRAFV
NIKTICV
>Cag_0361 Elongation factor P
MPSISTVSKGSIIRFKGEPHSIESLIHRTPGNLRAFYQANMKNLKTGRNV
EYRFSSSETVDVIVTERKQYQYLYRDGSDFVMMDNNTFEQINVPEVALGE
GANFMKDNINVTIVFSDDGSILQVELPTFVEVEVTDTNPASKDDRATSGT
KPAIVETGAEVNVPMFIQIGSVIRVDTRTGEYIERVKK
>Cag_2018 conserved hypothetical protein
MEKKKVLVAGASGYLGRYVVKAFAEQGYSVRALVRSPKKLAEEGANLEPA
IAGLIDEVILADATNTALFKDACKGVDVVFSCMGLTKPEPNITNEQVDYL
GNKALLDDALQHGVKKFIYISVFNADKMMDVAVVKAHELFVQALQSSTMP
YTVIRPTGFFSDMGMFFSMARSGHMFLLGDGTNHVNPIHGADLAQVCVNA
VEKNEHEINVGGPDTYTFYETMTLAFTVLGKNPWITSVPMWIGDAALFVT
GLFSQQLAGMMAFAVTVSKIDSVAPAHGTHHLVDFYRALAAKQA
>Cag_1766 peptide ABC transporter, permease protein
MLLYILKRLFVAVPLIFGVLTLTFFIIRLAPGDPAAFFIQPGVSPEVAGQ
IRAQYGLDEPLVVQYFKWLGNVLHGDFGRSFSRAQQPVFEVIASALPVTV
TIALLTMIANFVFGIVIGVISAVRQNSLLDRTLTVGALFFYSMPEFWFAL
MMIILFALQFPLFPVSGLNEVGAEGYGFFGFWLDRLWHLVLPVTVLSING
ASGIARYVRGSMLEVIRQDYIRTARAKGLPERVVIVRHALRNALLPIITL
MGSSLPFIFSGALFIEVIFAFPGMGRVTVEAIFARDYPLIIANTFISGTL
IVLGNLLADVLYAVADPRIKL
>Cag_0722 HDIG
MKALPAITPFAIPDILCRIGDIADREDAACYLVGGYVRDLMMKRPCTDVD
IMITGDPVPFAHMVQEELQGRNFVLFERFRTAQLELDDPNLGTFKVELVG
ARKESYNSDSRKPITLVGTLEDDLMRRDFTINALAMRLNREERYTVTDLF
NGMEDMEAKILRTPLEPLQTFSDDPLRMMRAARFAAQLDFTLLPNALEAI
RAMHERIHIVSMERVSHEFLKIMRTHRPSVGLLILHETGLLKELLPEVSA
MAGIEQVDGMGHKDTLFHTMQVVDTVATYSDNLWLRIAALLHDVAKPVTK
RFSKQSGWTFHGHDAVGIKMVAKIFRTMRWSHEHLEYVQKLVRLHHRPIP
LSKEEITDSAVRRLMFDAGEDLRDLMTLCRSDVTSKNPRKVARIMSNFNH
VEAKIAEVGEKDQLAKWRPPISGQDIMEALGLAEGKAVGKIKKQMENAVI
DGIIAYDREAALAYVRKVYEEMKGET
>Cag_1450 Aspartate decarboxylase
MYLHLLKSKIHNAIVTSGDLEYEGSITIDSELLDLAGMIPNEKVLVVNNN
NGERFETYIIKGDHGSREIQLNGAAARCALPGDEIIIMTFAVMEEAEARN
YKPMVLIVDRHNNPKSRHLVGEQSEPLSN
>Cag_1047 hypothetical protein
MNCPYYNRPPTSSLPFPTAVETTHALSLQSIKKFPTLTLYNLHAFFHVNL
LTIMATNSSSPLVRRELFIFDASVSNLSTLSSALSANSSYFVLDSTRDGL
VQIADLLAGQTDIDSLHIFSHGSAGSLQLGNSSLSLVNLNNYELPLSVIG
SSLSSSGDILLYGCNVGAGDEGLAFVDKLAKMTGADVAASDDLTGATALG
GDCELEVESGVIDEASFYYAPEYAGLLGAVGPEFHVDTSDIQVWSYEPSV
AALANGGFVVTWISETLETLSSDTHTDIHGQLYNSEGAMVGSEFQVNTYT
QYGQYTPSITALADGGFVVTWISETLETLSSDTHTDIHGQLYNSEGAMVG
SEFQVNTYTQYGQYTPSITALADGGFVIIWRCVNNDDYNCNYIHGQRYNA
DGIMVGSEFQVNTYTQIGAYEPSVAALADGGFVVTWESGIVTTWKSGYQD
TSNSDIYGQIFNVDGAMVGSEFRINTYTKGFQGCPSVTSLTN
>Cag_0309 conserved hypothetical protein
MADKAYITAKVFKWARESAKMTEEIAASKVAVSIDKFKDWENGEDFPTIR
QAQTLAKAYRRPFALFFLPDVPTDFQPLQDFRKTGSKELSTSSIFIIREI
QQKQAWISEVNEDNNENRVPFIGRFNIKDNPVLVAKDILATLNINPLNYK
SNNPIIEWIDKAESNGIFISRTSFIHSRLKLDSNEIQGFAIADDFAPFIF
INSDDWNAPQLFTLVHELSHLWIAETGISNDVEPSIKNVGDYNPIELFCN
EVAANVLMPKEFIDSLDSKAFDNAKEVFKNAKMIGVSSFALLVRALNLNI
ISLSTYKQLKQLADIEYNEFLKREEAKKIKQKENEKPGGPNYFLLQLNRN
SRLFTQTVLDAFRGGVIEPSLASNLLNVQVNKFPKLEAQMYR
>Cag_0175 Ribulose-phosphate 3-epimerase
MPTILAPSVLSADFTRLADSLTIANESGAEWMHCDIMDGNFVPNISFGPF
IVKAVASCTSMIIDTHLMITDPDRFIEEFVKAGSHQITVHQEACPHLHRT
IQLIKSFGVKAGVSINPGTPVSALEAVLTDLDLVLLMSVNPGFGGQKFIP
SAVGKVQQLHEMRMKLNPEMVIAVDGGITEENAQSVVSAGADALIAGTAF
FKAPDPAKTAATIRAMSR
>Cag_1071 ATP-binding region, ATPase-like
MSSTENNGTAVPLQEFEYKAEMKQLLDLIIHSLYTHPEIYLRELISNASD
ALSKARFNALTDQDMLDKDAELAIRLTLNAEEKSVVIEDSGIGMTEEELI
ANLGTVAKSGTLGFMQSLKEQQQQLDGNLIGQFGVGFYSVFMVTEEVTVE
TRSFHSNAQGYRWRSSGQGTYTIEQIDKATRGTRISFKLKEEHQEFAEEY
RVEQIIKKYSNFVDFPIYLGERQLNSMTALWQRPKSELKDEEVNEFYKFV
ASDFNDPLDYLSISVEGMVSFKALLFLPKEAPPELMYRQSELENRGPQLY
VKKVLIQNECRDLLPEYLRFIAGVVDTEDLSLNVSREMVQSSPVMAKIRQ
ILTGKILGWFEMLAKEQPEKFKTFYKAFGPIIKIGLNTDFTNREKLIELL
RFESTKTEEGEFVTLREYVERMGSEQKEIYYHSGNNRAQLLAHPNIEYFR
EHGIEVLLLSDPVDMFVIPSIHEFDKKPLKSIEKADCDFSQQSSNKAEPV
APNLLNPVLQAFKEALSNEVEDVVESRRLVSSPVTLVSGKDAIDSQLERM
MKMMNTPMPPAKRILEVNSSHPIVRNIAGMIMADANNPLIKTVARQLYEG
ALFLEGSLEDATSYVTRMNELIEAATLTR
>Cag_0293 putative cytoplasmic protein
MSKNKTSHVSTNKEPANIRSSAAEYLTYIAAIGERATSVEMRYEAENIWL
TQKMMATLYDVSVPAINQHLKRIFDDNELTREATVKQYLIVQTEGNRQVE
RMVDHYNLQAIIAVSFKIENERAVQFRKWANQIVKDYTIQGWVMDVERLK
HGGTILNNEFFERQLEQIREIRLSERKFYQKITDIYATALDYDPSATASK
RFFAAVQNKIHYAIHGLTAAEVIVNRADHRKNNMGLTHWEGAPSGKIHKY
DVSIAKNYLSEFEIAQMERIVSAYLDMAELQTMRKIPMTMEDWEKRLAGF
LTLWDREILQDAGKVSAELAKVHAESEFEKYRIIQDRLYESDFDRLLKQI
EHCNTTQAEKQ
>Cag_1908 ExbD/TolR family protein
MMTNGGKSRLMADINVTPFVDVMLVLLIIFMVTAPMMTHGVKVDTPQTTH
EKIDVDPRSVMVSLDGSGNLFVNDAKIPRSEIRERLPQLLNVKEVHEVYL
KADKSLPYGVVMEVMASIRDAGIEKIGMVTEPSVPAPQAGE
>Cag_1877 conserved hypothetical protein
MEYQPSGMRALDSVERAKLGMKVFNLPFDEAEGVIDDYVSGGNYDPASVE
LFKDQLDTQRHIQEKAYELFDTGAQILRLVVGAVLKNMPSPLDDDKSTSR
E
>Cag_1424 Negative regulator of class I heat shock protein
MDYRELTARERQILGIIIQSYVVSAAPVGSKYIARHYNLGLSDATIRNVM
AELEELGFISQPHTSAGRVPTDKGYRYYVDLIMTVKTLDEQEKQRFEQHI
TPLERKGTSADVLLSAVKVLGTISRQLSVVLSPTLSNALFEKLDMVLLSS
TRMMVIISIQSLFVKTIVMELHMQVSRQMLDEVVDVLNERLSGLTLSEIR
RSINQRLADCSCDNELKNLIVRSAGTLFDEMPVFERLYISGTEYLVEQPE
FQQPEKVRDLITMLEDKFSVATLVEQHHANNPDVTITIGKEHGKRQAEDL
TVLSAPYYVGDMVGTVGILGPKRMDYEHAVRILHYMAGSLSSTLSIQN
>Cag_1427 hypothetical protein
MPLFAFTKYAQVREFRKKSLHFTLGMVLWILNIGFWMGATARVARTGLSV
ETPKLGVFTTYGMVGRRRGEIHFRPHERADIESAPTRDGSCVGTLNPKET
LRSLFSDREKPHSPARHH
>Cag_1441 conserved hypothetical protein
MVAFIHQKTNWPYFTWNNDEIVNALSEARNLQGRVIGKMESLGFDLRNEA
LLDTLTLDVLKSSEIEGEYLNPEQVRSSIARRLGMEIAGSVESDRNVDGV
VEMMLDATQNCFKPLTVERLFDWHAALFPTGRSGMLKITVGDWRKDTTGP
MQVVSGALGKEKVHFQAPDSIVVEKEMNQFLEWINNNVKIDLVIKAAIAH
LWFVTIHPFEDGNGRITRALTDMLLAQSDNSNQRFYSMSAQIRIERKQYY
DILEKTQKGNLDITEWIQWFLNCLINALKSTDATLFNVLLKANFWSKHSK
TLINERQKKLLNKLLDGFDGKITSSKWAKIAKCSKDTAIRDINDLIEKNI
LQKEAGGGRSTNYELKI
>Cag_0902 putative transcriptional regulator, XRE family
MLQSQFIEAHLETDCPVFYDDDFVADVLPIMQRQHVSCAPVLSGGKPERL
VTLPDLLAAEQTTDSDTLRLKELPLPQASGVDAGEHLFDIFRRLPHFPCD
VVPVADDKGMFAGVIDKQQVIEQVARIFHVGDDSLTLELEVPKSGVKLSE
IIALLERNEATILSFGMYTATSDNHESIILSFRLQTHDFFRLVQNLEHYG
YQVHYTSQMFNAEDEVLREKAREILYLIDL
>Cag_0389 Peptidase S14, ClpP
MANINFGFEHHAKKLYSGAIEQGISNSLVPMVIETSGRGERAFDIFSRLL
RERIIFLGTGIDEHVAGLIMAQLIFLESEDPERDIYIYINSPGGSVSAGL
GIYDTMQYIRPEISTVCVGMAASMGAFLLASGNKGKRASLPHSRIMIHQP
SGGAQGQETDIVIQAREIEKIRRLLEELLAKHTGQPVEKVREDSERDRWM
NPQEALEYGLIDAIFEKRPTPEKKD
>Cag_1845 ribosomal protein S3
MGQKVNPTGFRLGIIRDWTSRWYDDSPVISEKIKQDHIIRNYVLARLKKE
RAGIARINIERTTKHVKINIYAARPGAVVGRKGEEINNLSQELSRIIGKE
VKIDVVEVIKPEIEAQLIGENIAYQLENRVSFRRAMKQAIQQAMRAGAEG
VRIRCAGRLGGAEIARAEQYKEGKIPLHTIRANVDYASVTAHTIAGTIGI
KVWVYKGEVLVQRIDAIEEDEMKKMKDRRNDGGAKGRDSRDNRSKRRSRS
KRS
>Cag_1676 ABC 3 transport family protein
MMHEIFAFDFMRNALMAALLSSIACGIIGSYIVVKRIGFISGGIAHTAFG
GIGLGYYLGINPLLGVVPFTLFAALAVGLLGRKARVAEDTAIGTLWAMGM
ASGILFIGLRPGYAPDLFSYLFGNILTVPTADLWMIALLDSVIVSVVWLF
NKEFLAISFDEEYAQVSGVRVTLFYLLLLCLIALTVVMMVRVVGIVLVIA
LLTIPAALARFFSRSVTGMMVRAMLFAMLFSVVGLWLSYQLDIASGATII
LVAGTTFFAVQAWRIVASMVSPSSR
>Cag_1348 TPR repeat
MEQTAKPSAEEQLLYIVIKYKKALIGALVVIVTLGAGAFFGTRYQEQREQ
EAALQLSRVSPALEQGNFTLAINGTKQTAGLQKIANEYSGRFIGTPSGNM
AKLLLANAWYSFGKYETALQHFNEVTIAHEDLAAAALAGSGDCYLNLNKL
KEAAEAYQKAANKTDNRILKAQYLTHEATAYHYAKDFPKATELYKTVIAS
YPGTTAASIAQHGLWQLSGSL
>Cag_0765 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_0529 hypothetical protein
MILKAFNSAHFITALRGFNSVYYLGVKLAQLQATPSSGWDSSKTIDDLLT
TLKAEGFTPETHYMRYGYRENLAPNAFFNAAEYIQAKANQLVTVDHRYAS
VEAAKAAFLAAWDGDVYQHYLRYGAAENVNPSNAFDESAYYALKLAALRA
DPLTSAEWTPKSVADLQRYFKNAGFTALTHYEAYGKAEGIVVTPVLSSLT
PSLFNPTEYTQAKANQLFLQHAYDSVDAAKTAFLKAWNQNVYQHYLQYGA
AENVNPSNAFDESAYYALKLAALRADPLTTVEWTSKSVADLQRYFKNAGF
TALTHYEAYGKAEGIVVTPVPVGEKVADTLFAVTIDGAATPTVTITSSSS
ALKAGETATITFTFSADPGASFVATDIVTTGGTLGDLSGTGRVRTATFTP
TASLKFGSASITIAVRNYTDAAGNTGSAGTTPTITIDTLAPTVAITSSTS
ALKAGETATITFTFSEDPGTSFVATDIVTTGGTLEDLSGTGRVRTAKFTP
TANLNFGSASITIAVRNYSDTIGNTGGAGTTPKITIDTLAPTVAITSSTS
ALKAGETATITFTFSEDPGTSFVATDIVTTGGTLEDLSGTGRVRTAKFTP
TANLNFGSASITIAVRNYSDTIGNTGGAGTTPKITIDTLAPMVVITSSAS
ALKAGETATITFTFSEDPGTSFVATDIVTSGGTLGTLSGTGLVRTAMFTP
TANLANGSASITVAAGNYAGPAGNTGSAGTTPVVTIDTLAPTLSSSIPAD
NAMAVLVGANIVLNFSESVTAVAGKNIVLHNVTDSTTTTIAANDAQISIV
AGVVTINPTADFLNGKNYYVTVDAGAFIDGAGNDYAGIADATLLNFTITP
DVTAPTLSSSIPADNAVAVAVGANIVLNFSESVTAVAGKNVVLHNVTDST
ITTIAANDAQVSIVADVVIINPTADFLNGKDYYVTVDAGAFIDGAGNGYA
GITDATLLNFTITPDVTAPTLSSSIPADNAVAVAVGANIVLNFSESVTAV
AGKNVVLHNVTDSTTTTIAANDAQVSIVAGVVIINPTADLLNGKDYYVTV
DAGAFIDGAGNGYAGIADATLLNFTITSDVTAPTLSSSIPADNALAVAVD
ANIVLNFSESVTAVAGKSVVLHNVTDSSTTTIAANDAQVSIVAGVVIINP
TADFLNGKDYYVTVDAGAFIDGAGNSYAGIADAATLNFTTTPDVTAPTLS
SSIPADNAVAVAVGANIVLNFSESVTAVAGKNVVLHNLTDSTTTTIAAND
AQISIVGSVVTINPTADFLNGKNYYVTVDAGAFIDGAGNGYAGIADAVTF
NFTTTPDVTAPTLSSSVPADNATAVALGTNIVLNFNESITAVAGKSVVLH
NVTDSTTTTIAANDAQISIIGSVVTINPTADFLNGKDYYVTVDAGAFIDG
AGNSYAGIADAATLNFTTTPDVTAPTLSSSIPADNAVAVAVGANIVLNFN
ESVTAVAGKSVVLHNVTDSTITTIAANDAQISIVGSVVTINPTADFLNGK
DYYVTVDAGAFIDGAGNGYAGITSATALNFTTTPDVTAPTLSSSVPADNA
LAVALGANIVLNFSESVTAVAGKNIVLHNVTDSTITTIAANDAQISIVGS
VVTINPTTDFLNGKDYFVTVDAGAFIDGAGNGYAGITSATALNFTTTPDV
TAPTLSSSVPADNAVAVSVGANIVLNFNESVTAVAGKNIVLHNVTDSTTT
TIAANDAQISIIGSVVTINPTANFLNGKDYYVTVDAGAFIDGAGNGYAGI
TSATALNFTTTPDVTAPTLSSSVPADNALAVAVGANVVLNFNESVIAVAG
KNVVLHNVTDSTTTTITANDAQISIVGSVVTINPTANFLNGKDYYVTVDA
GAFIDGAGNGYAGIADAVTLNFTTTPDVTAPTLSSSVPADNALAVAVGAN
VVLNFNESVTAVAGKNVVLHNVTDSTITTIAANDAQVSIVAGVVTINPTA
DLLNGKDYYVTVDTGAFIDGAGNGYAGIADPTALNFTITPDVTAPTLSST
VPADNATAVALGANIVLNFSESVTAVAGKNVVLHNVTDSTTTTIATNDAQ
VSIVAGVVTINPTADFLNGKDYYVTVDAGAFIDGAGNGYAGIADTVTLNF
TTTPDVTAPTLSSSIPADNAAAVALGANIVLNFNESVTAVAGKNIVLHNV
TDSTTTTIAANDAQISIIGSVVTINPTANFLNGKDYYVTVDAGAFIDGAG
NGYAGITSATALNFTTTPDVTAPTLSSTVPADNAAAVALGANIVLNFNES
VIAVAGKNVVLHNVTDSAITTIAANDAQISIVGSVVTINPTANFLNGKDY
YVTVDAGAFIDGASNGYAGIADTVTLNFTTTPDVTAPTLASSIPADNAMA
VLVEANIVLNFSESVTAVAGKNIVLYNMTDSAITTIAANDAQISIIGSVV
TINPTADFLNGKDYYVTVDAGAFIDGAGNSYAGIADAATLNFTTFLVVPP
PDLIPPTLSSSVPADNAMAVLVGANIVLNFNESVTTVAGKNVVLHNVTDS
TITTIAANDAQISIVGSVVTVNPTTDFLNGKSYYVTVDAGAFIDGAGNSY
AGIADPTALNFTITPDVTAPTLSSTVPADNAVAVAVGANIVLNFNESVTA
VAGKNIVLHNVTDSAITTIAANDAQISIVAGVVTINPTADFLNGKDYYVS
VDAGAFIDGAGNGYAGIADTVTLNFTTTPDVTAPTLASSVPADNAAAVAM
GANIVLNFNESVTAVAGKNIVLHNVTDSTITTIAANDAQISIVAGVVTIN
PTADFLNGKDYYVTVDAGAFIDGAGNAYAGIADPTALNFTTTPDVTAPTL
ASSVPTDNAAAVAVGANIVLNFNESVTAVAGKNIVLHNVTDSTTTTIAAN
DAQISIVAGVVTINPTADFLNGKDYYVTVDAGAFIDGAGNGYTGIANAAT
LNFTTTPDVTAPTLSSSIPADNAVAVAVGANIVLNFSESVTAVAGKNIVL
HNVTDSAITTIAANDAQVSIIAGVVTINPAADFLNGKNYYVTVDAGAFID
GAGNGYAGIADPTALNFTTTPDVTAPTLASSVPTDNAAAVAVGANIVLNF
NESVTAVAGKNVVLHNVTDSTTTTIAANDAQISIVGSVVTINPTADFLNG
KDYYVTVDAGAFIDGAGNSYAGIADVATLNFTTTPDVTAPTLSSSVPADN
AAAVAVGANIVLNFNESVTAVAGKNVVLHNVTDSTITTIAANDAQISIVA
GVVTINPTADLLNGKDYYVTVDAGAFIDGAGNAYAGIADPTALNFTTTPD
VTAPTLSSTVPADNAAAVALGANIVLNFNESVTAVAGKNVVLHNVTDSAI
TTIAANDAQVSIIAGVVTINPAADFLNGKNYYVTVDAGAFIDGAGNGYAG
IADTVTLNFTTTPDVTAPMLSSSVPADNAAAVALGANIVLNFSESVTAVA
GKNIVLHNVTDSTTTTITANDAQVSIVAGIVTINPTTDFLNGKDYYVTVD
AGAFIDGAGNGYAGIADTVTLNFTTTPDVTAPMLSSSVPADNAAAVALGA
NIVLNFSESVTAVAGKNIVLHNVTDSTTTTIAANDAQISIVAGVVTINPT
TDFLNGKNYYVTVDSGAFIDGAGNGYTGITDPTALNFTTTPDVTAPTLSS
SVPADNAAAVAVGANIVLNFNESVTAVAGKNIVLHNVTDSTITTIAANDA
QISIVAGVVTINPTADFLNGKDYYVTVDAGAFIDGAGNGYAGIADAATLN
FTTTPDVTAPTLSSSVPADNALSVALGANIVLNFNESVTAVAGKNIVLHN
VTDSTTTTIAANDAKVSIVGGVVTINPTADFLNGKNYYVTVDAGAFIDGA
GNGYAGIADAATLNFTTTPDVTAPMLSSSVPADNAAAVAVGANIVLNFNE
SVTAVAGKNIVLHNVTDSTTNTIAANDTQVSIVAGVVTINPTADFLNGKN
YYVTVDAGAFIDGAGNGYAGMADTTLLNFTTTPDVTAPTLSSSVPADNAT
AVALGANIVLNFSESVTAVAGKSVVLHNVTDSITTTIAANDAQVSIVAGV
VIINPTADFLNGKDYYVTVDAGAFIDGAGNNYAGIADAATLNFTTTPDVT
APTLSSSIPADNALSVAVGANIVLNFNESVTAVAGKNVVLHNVTDSTTTT
IAANDAQISIMGSLVTINPTADFLNGKDYYVTVDAGAFIDGAGNGYAGIA
DPTLLNFITAPDVTAPTLTSSVPADNATAVSVEDNIVLNFSENVLANTGY
IVLKATADNAIIESFNTATGQGNHGGTVTVTGVSVTVDPMAYLTANTGYY
VTVDSTAVKDVVGNNYAGIVSSTELNFTTPTPTSYNLTTFADIAPAFVGT
VGDDIFNGTYGDGAGPYTLDATDVLNGGTGVDTLSITTGAEASTPPDSLW
ANKTNFEKVEFHSTGAGAQSITTGVNFNTAFAGHVDLIVETYNGATTIEM
QAFDGTSTLVATTTLDGAQTITTSNTHAAIVKAINSAAGAQTISGQFLTE
VQATINGAGAQTIGNALGGGSHLINVTATVLGAGDQTITTTSTGNATVNA
TCTTGTQRIVTGVGNDSVTAHSTTASNNVITTDAGNDTIIAGQGNDSITG
GLGSDSMTGGGGTDTFVFGANGSIVGASMDIITDFNNAGADILTFGGNTT
VLAADASVLVAGTNVQTSDGGLITFDVSDNTLAFKIAAVEADAQLDVAGS
VAMFVDSGNTYLYYAGIAAGNLDDQVIQLTGITTFITITGGPTTTII
>Cag_0403 conserved hypothetical protein
MGKSEQRYGKQLISVLSAQLTQKYGSGFSVTNLKYFRTFYVTYPDRFDTI
GYPMGSQLPQEEKSRPLGDQLPEAEKSYPIGSGFSPQLTWSHYRALMRVQ
NEKAREFYENEAIDGGWDKRTLERQIHTQYYERCLHSQQPEKIIAEGRKL
QKEVPLATDILKNPYVLEFLGYPNFAELRESDVERAIITHLQRFLLELGN
GFAFVARQKHIRIDEDDRFIDLVFYHCRLKFYLLIDLKLGKLTHADVGQM
DGYVRMFDGLFTALDDNPTIGLILCTEKCDTVARYSVLNDRKQIFASKYL
PSLPSEEQLQIEIEKERRLIEAALEEQKACKHE
>Cag_1752 conserved hypothetical protein
MNIEVRYHKLAENELHDAAKYYESRCSGLGRAFLTEITQAINQISAFPES
APMILDIVRQKVIHRFPYSIMYVNDDDGVMILAIANHHRRPFYWGNRISN
FHE
>Cag_1173 Protein of unknown function UPF0054
MSLELCNSTRQAIPNKRLLQAIRMVVQGEGYEIATITGVYCGNRMSQRIN
RDYLNHDYPTDTITFCYSEGKAIEGEFYISLDVIRCNARHFNVTFEEELL
RVTIHSALHLTGMNDYLPEERVAMQAKEDYYLQLLKTQKSISPQKSTDNA
IFSCNS
>Cag_1983 Alpha amylase, catalytic subdomain
MYQPEPLWYKDAIIYEAHVKTFYDSDNDGIGDFQGLRQKLGYLQSLGITA
IWLLPFYPSPLRDDGYDIADYMTVNPDYGTMDDFRAFLEEAHSLGLKVIT
ELVVNHTSDQHAWFQRARHAPKDSPERNFYVWSDDPNKYSETRIIFQDFE
ASNWTYDSVAGQYYWHRFYHHQPDLNFENPAVHAALLHVLDFWLGMGVDG
LRLDAVPYLYEEEGTNCENLPRTYQYLRDLRSYIDEKYPNRMLLAEANQW
PEDSAAYLGNGDMCHMNFHFPLMPRMYMALATEDRFPILDILEQTPEIPE
SCQWASFLRNHDELTLEMVTDEERDYMRRVYANDPRARINLGIRRRLAPL
MSNDRRKIELMNIMLLSLPGTPVLYYGDEIGMGDNFYLGDRDGVRTPMQW
NADRNAGFSRANPQRLQLPVIIDPEYHYEAVNVEVQESNIHSLLWWMRQT
ISTAHRHKAFSRGTIEFLPVKNSKVLSFIRQYEDETMLCVINLSKNAQAV
TVDLSRFNGYTPEEVFSLNRFPKIRSTPYMLALGAYGYFWLKLIKEEKEV
DRHALLDGSVVSVNRWQSLFIGKNREKLETAVFSSYYMAARWFGGKARTI
IRISITDTIPIANVANTKLLVTEVRYSSGENENYQLPVTFVPLANLQPSD
EYFSKQVIARITVGDEEGYLCDATFTPAFLQELYSVATAKGSWQGKQGVV
NGSSAPKLAAFLANVADAAPELMGAEQSNTSIRYADNLCLKLYRRIESGV
SPEVEMCSALSERTSFTNLPTYLGTVNYSRSRSSRCSIGILQTYVPNQGD
AWQLSLDQARRYFDAIHSALPNALAMPALPALSGNPAPLPELMQELIGGH
YLGMIEKLAERTAEMHLALATLESDPAFAPEAFTSLYQRSIYQAMCEQVK
RSVILIRELLPSLNGEQQTLATQFVQKQKQILQQFDPIRTEKIEALKIRI
HGDYHLGQVLFTGKDFTIIDFEGEPARPLSERKIKRSVFRDVAGMLRSFD
YAAFSALRQIAPTLRPDELPMLDAWAERWSFYVGQHFINRYFEATNGSSI
VPVEAPQREHLLRGYLMNKAIYELNYELNNRPDWAAIPLRGILKLIEQ
>Cag_1962 conserved hypothetical protein
MSEQSHADKVQLAYAAVLGKTSTIGIGLIVVGYALYVMQILPATASPEVV
ASHWHLRASELHQAINVPNGWDWLGNMGYGDVLSFASLAYLATVTTICLI
TVIPVLLKENDKIYAVITTLQVLVLLFAAAGIVSGGH
>Cag_1166 rod shape-determining protein MreC, putative
MQKFFTLLIKYNAYLLLALYCSIALLVIKLQEEEIFVELRNNGLEFSAAV
NEQFMSIAYFLHLSHENSRLMRLNTDLLNKILHYDNILLEERRYKALFSD
STFNASPYIKAQVVDRKFSATDNMLIIDSGWRHGVKKDMAVLVPQGLVGR
VVAVSENYAKVMPVIHPDFRVCVVADSSNCSGILVWNGGREWIANVDHIP
ISSRLRLNEQFRTADFSTFALRGIPVGRIVRIVPDKLFYTVDVRLAVDFS
ALHYVLVAPQKVEAEKVRIVSDSSAALLRTPQL
>Cag_0899 5-formyltetrahydrofolate cyclo-ligase, putative
MTPQPPSKQALRLAMREKRLALAESERMAMSVAIAEQVVALPMVQEARQV
HCYLSIAALGEVSTEPLLQALHAMGKKLLVPVVQGKELLSVCYEPSMPLR
IGRFGQPEPERAIPADAAQLDVVVLPLLAFDEKGQRLGYGKGYYDRMLER
FTLHHVTPYCIGLAFSLQQVAALPADFWDKPLDAVVHEKGLLSF
>Cag_0593 Nucleotidyltransferase substrate binding protein, HI0074
MQQDIRWKQRLQNYSRAIKLLQEVPELDREKLSFLEKEGIIQRFEYTLEL
AWKTLKDKMEEDGIILDKISPKMVLKEAYKAKYIDNIELWIEMVNDRNLL
SHTYDFETFEEIIIDIQYRYTQLLSDLYINLIESQL
>Cag_1408 Membrane-fusion protein-like
MDEHMNANLDASLSALRALRDFKGREGEFWLSMATHISRLFQAERVVLLR
RAEVGWNPLSFWPLASARSAIPPSAQLAELATTAKQTAVAYAPLPEQLGT
TLQHAVSTAAEQSSEKKADNISGNTLLAFHLHTESADGEIMALLWRAHDS
AALRNGDLLKRELLVDLPLQYRQSNTTRPLTSATPESLDVVLSMNEHTTF
TGAAMSLCNELAFRLSCSRVSIGWKDGEYIRLQAVSHTEKFDRKMSVARA
LEVVMEECFDQDEELLVPEVAGTSTTIIREHRAFVAKQGVGAILSLPLRL
GNEVVAVLSCERDKPFSADDIRSLRIICDQVTRRLGDLKHFDRWFGAVAL
DKVRNWASSLIGTDKTVHKIYAVVGSILLLFLLFGKMEYKVEAPFILRTH
DLALLSAPFDGYIERVSRKPGDLVTTGDPLILLDTRQLLLEESRSAADVL
RYQQEEKKAMAQNALAEMKVAEALRRQADSRYQMIRYNLQHADIRAPFSG
IVVEGDLEKLLGAPVRKGDVLLKVAKLEKLYIEIKVAERDIQEFKVGQEG
EVAFISQPSKKYTVVVDRIEPMAVTEQKGNVFLVLGHITEARDAWWRPGM
SGLAKVSVGERHILWIWLHRTLDFFSMKLWW
>Cag_1774 LemA family protein
MRYMKSTHYALVKIFMALLLISSLSGCGYNSIQQNEEAVNRAWGDLESQL
QRRADLVPNLVATVKGAANFEKETLTAVIEARSKATSIQLSPEMLSDPAA
MAKFRAAQGALTSSLSRLMVAVERYPDLKANQNFLDLQNQLEGTENRISV
ARQRYNGAVEIFNVSIRQFPNSLTNSVLLQLKAKEYFKADEAAKAVPDVK
F
>Cag_1524 DNA-damage-inducible protein D
MQSQEIQQLKEQFDALSHTIPDEDVEFWFARDLMEPLGYTRWENFMTAIK
RALESCETTGYAVDDHFRGVTKMIGIGKGGQRPVEDFMLTRYACYLIAQN
GDPRKEAIAFAQSYFAILTRKQELLEDRMRLQARLDARERLRESEKTLSQ
NIYERGVDDAGFGRIRSKGDAALFGGHTTQAMKERYGITQTRPLADFLPT
LTIAAKNLATEMTNHNVSQDDLHGEHAITREHVQNNQSVRTMLSQRGIKP
EQLPPEEDIKKLERRIKTEEKQLVKHSGKLPVAKNQD
>Cag_1869 HPrNtr
MIVQEVTIKNKAGLHTRPAAAVVKLASRFKSEFYIEMQGVAVNAKSIIGV
MSLAAPKGTKLKLKLDGPDEVDAARQLVEFFEQGFGEV
>Cag_1684 Rare lipoprotein A
MKHEKHFFRLSYCVIATALVAFSSSVFSLPSEAATRRSAATYRSKALSEG
TASFYSTQFHGRKTANGETFNMNQLTAAHPSLPFGTLVKVTNMDNGKNVV
VRINDRGPYVKGRIIDLSKSAAIKIGILKEGVAQVKVEPVKPTINTQVAG
>Cag_1157 peptide ABC transporter, permease protein, putative
MSTAFTQHTKTPHHKELLLLIPIALFLGIVWYNGEMVWLSYRSLFNALRL
FVIDNTQLQAIFSAEIIEFWIASLYVVIAPIVLLAIMFRARTRRQKSKRS
DESQSPGKVSMKAFMQNRIAQVASTIIFTLYSTAFLAPFLAPFSPYDQQD
FLVTAYQPPFTRVTALVLQEPKNLIIPKQAGSGTGVDIANSVIGDVQHLK
NRNQPHAIKYVNSYRIEGDNVLYQQGIRNRTIAKAALLTNSNGNPVIESR
IFLLGTDQYGRDILSRVIYGSKISLSIGFLVVLISVTLGTIIGVSSAYFG
GWIDAIAMRIVDVLIAFPALFLILIIIATFGNSIYLIVLTLSFTGWMGVA
RMVRSQVLSLKEQEFILAAQSLGFSTTRIIFRHLVPNSLTPVIIAATLRI
GSIILTEAGLSFLGLGVQPPTPSWGNIINEGRDSLLNYWWISTFPGLAIL
TTVVCFNLIGDGVRDALDPRMRV
>Cag_1107 ComEC/Rec2-related protein
MTEPSKPHSAVKPKRAIGLSLAPYPAVRLLFFVIIGIVVGVVAPFSLTEW
LWSVALSFALLLLTWLYERIRYHQAAVPHFGMAIMYCFVVVSVFATLSAY
RLHYAPRNGLTQYAGRTVILYGSIESRPERSKGGASWVMEVQELFEHGKT
VTLRDRTKVFMRMSADAHLAVQKGDMVRVKGKLDLLPEAANAGEFNPRHY
GAMQQISVQLYAAGPWQVLYEGEKRLHPFEQYMVQPTYRYIMQALAALLP
DGEERKLAAGVLTGERETMSEEVFEAFKRTGTAHILAVSGMNVGLLALII
QVFLQRLKITPFGRWTAFLLFVFLLILYSNVTGNSASVTRAAFMALVLIA
GETVGQKTYPLNSLAVADLIILLINPLDLLNPGFLMTNGAVLALFLVYPL
LHFPRPKNRTLLLSIVWFLLDSIIITLAASIGVSPVIAYYFGTFSLISFV
ANIPVVFFSTLLMYALVPMLVVYGLSQALASVFAAGAFWLARMTLQSALW
FSNFSFASIPLKLDAVEVWLYYIVLAAVLLLATRKAWSRVAITFLLGVNL
FVWYSLLFRPNPIAPTLLTVNLGRNLATIVSNGSESVLIDVGKKPKDYQR
ISAQFERFGIVEPTAVVQFYSPDSLILATPTRHHFLRSDSLLRLSSMVIT
RPDEKMVKLWSRNQSYFLASGTSRLKAGEPYCGDVACIWIYRFGEKQRIE
LERWLTATKPKEALLVPSSFLSRVQLVALHRFAAAYPHVEVRSKTKQVVV
NGGER
>Cag_0688 MoaA/NifB/PqqE family protein
MQHIFGPVSSKRLGQSLGVDLLPHKSCSWNCVYCQLGRTKTYSTERQEFF
AREEILAEISQALQQHPSLDWITFVGSGETMLYRGIGWLIAEVKKLTSVP
IAVITNGSLFHLPEVRHELLQADAVLPSLNAGSEALHQRICRPAEGFTFQ
QHLAGLQAFRQEYCGRLWLEVMLLGGINDTDEALHDLAEAIRTINPNMVH
LVLPTRPAPEHEVQLPTNERLERAIAILSTVATVMNPLKGSMDLRNTPDL
LEAISAIVSRHPVQQRELHKAIADRYLNDNAQADIVLQELFATGRFQHIE
HNGELYWVMESK
>Cag_1038 hypothetical protein
MKKTRWLIAGLMGVILGSLPLSAQEGFCMSPKKSDTSIVINSQQSFIELP
DQGFSVSVGSPYDIINYDNRYYIYQDGSWYRSSNYRGTWTVIRDSDLPDR
IRRHRPEDIRRLRDNESRRYENDNRQYRRDENNRK
>Cag_1984 Alpha amylase, catalytic subdomain
MTLHSTFEHTLPLFCQQAFDGRQRVIIENISPEIDGGSHPAKAVAGELVE
VEADIFVDGADTISAMLCARPMGSSEWQQSAMQPLVNDRWQGAFRVGEAG
VFEYTITAWVDHFQTWRKGFIKKVDAGQNVSLELQIGTIHLEKAALRATK
SDAELLHVLVQRISTADDAEAIALITSDALAQVMERNPDTSLATTYQKVL
RVTVEQTKAGCSAWYEFFPRSWSEIPGKHGTFNECLRLLPLIAGMGFDVI
YLPPIHPIGYAKRKGRNNSLVALPDDPGSCWAIGNSDGGHKAVHPELGTL
EDFTAFVQAAEAQGISVALDIAFQCSPDHPYVQEHPQWFTWRPDGTVQFA
ENPPKRYEDILPINFENDDWQNLWIELRSIFLFWVERGVKIFRVDNPHTK
AFPFWEWAIRTIRAEHPDTVFLAEAFTRPKLMARLAKIGYSQSYSYFTWR
NTKHELQEYVTELTSEPLKHVMRANFWPNTPDILHDEFHNGEREKFIIRL
VLAATLSANYGMYGPAYELCEHVPIAHGKEEYLDSEKYEIKQWDMDRPGN
IRAEITAINRIRKENPALQQTADISFLHIDASPGNEHNMLMAYVKRSEND
ANIILVVVNLDPITTQRGWLRFPLEQFGLTHLHRFHVEDLLSGQCHTWHG
EWNYVELNPHVMPAHIFKISL
>Cag_0522 conserved hypothetical protein
MADETKTTGQGGVKGDFATILVGVGTILDNTIEPLSKILVQTLDSLTVVA
KQILEGVNSSLGCKK
>Cag_0854 hypothetical protein
MLCACKLCLMHKEWLVESEMSGKAYNKMSYQPFLFIYSFFFSGNNCKFAN
FTCILILNAFFSVICNGSAAAQA
>Cag_1264 trans-sulfuration enzyme family protein
MHFQTLAIHDGQTPDPLTGAVSVPIYQTSTFGRESLEYKGGFVYSRIGNP
TRQALESALALLENGKHGLAFASGAAASMAALHLLKPNDHIVSSSDIYGG
TYRIFEQLLRPWGIHTSYAATEATESYEACITDATRMIWIESPSNPLLQL
SDIQALSALAKKRGIVLVVDNTFASPYFQSPLELGTDVVVHSTTKYLGGH
SDVIGGALVTSNEEFYATMKTYQAAAGAVPAPWDCWLILRGLKTLKIRMK
EHEATALHLATLLEKHHAVERVLYPGLPSHPQHELAKKQMSGFGGMITIA
LKNGLPAVERFLAKLKLFVLADSLGGVESLVASPAKMTLWALSQDERNRR
RCTDNLLRLSVGLEDADDLAEDLLEALNACNDF
>Cag_0499 Phosphate transport system permease protein 2
MNNVKYVIMDNVSIKDERRKQQRLARERRFRFYGISAIVLALSFLVFFFV
DIIQKGYKAFAVTEIRSEVVYSPDALDIPQLAFQEDVRDFVSYSVVRMIP
SQLEKEPTLMGKSNVQWVVANDEVDQYCKGSDTQLSEEEQQQINALRDAG
RVRVTFNALFFTAGDSKLPEVAGILSAIVGTLYVITLTVLFAVPIGVMSA
LYLQEFAPDNAFTRLIEVNISNLAAIPSIIFGLLGLSIFINFFGVPRSSA
LAGGLTLALMTLPVIIISTRASLAAVPDSIRHGAYAVGASRLQMVFHHLL
PLAVPGITTGTIIGVARALGETAPLLIVGMMAYIPDMPTSITEAATVLPA
QIYTWSSASQRAFVEKTAAGIMVLLAVMLVLNAIAFFLRKKYEIKW
>Cag_1198 RND efflux system, outer membrane lipoprotein, NodT
MKRHPNETGENDIVPQITWFPTIMPHTKKKTRLLLAATQLAIATTIVGCS
APKESMPPDVAMPDAYRGAATVAAPSDSTIAQMPYQNFFADTALTALIEQ
TLAHNADLQSALKNIELAEQTLDAAKVVWLPSLNLSAQTIRNESSEHGVR
RTPKEFTAAVSASWEVDVWGKIKNRKQSVLANYLKSQEAVKALKTRLVAD
VASGYYNLLMLDEQLAVARKNLALADKTLAMMQLQYQAGQFTHLAIRQQE
AARQQLAATIPQIEQAVAVQENALSVLSGSMPNAITRNPSLLQVKPTNTF
AVGIPAAMLHNRPDVQAAEFALKAATADMKESGAAFYPSFTITAQKGVSA
LQSSDWFNVPSSLFSVVQGTMLQPIFQRGQLEVTYKQSQVKRDQAALAFR
QSVVKAVAEVSDALVRIEKLQTQEQLAEERVATLQQAVRNADMLFRSGLA
TYVEVMSVQSNAHNAELTLADLRRQRLTATAELYRALGGGWR
>Cag_1626 conserved hypothetical protein
MEQQDKFYKGLFWGAALGAAMGTVMGLLFAPYRGRETQQRISGKVKSMLD
KATDLYESSEHDGMYNNDAKTRAQGVIDTARDEAKKILSEADSLMRDIKG
HPAKTRES
>Cag_0573 Glutamate racemase
MNTENPIGIFDSGIGGLTVVKAIQAALPSERIIYFGDTARVPYGPKSQVT
IRKYAADDSAILMRYQPKLIIVACNTVSALALDVVEKCCGSVPVLGVLKA
GASLAVQASRNGRIGVIGTQATVCSNAYACAINQQAPDYEVTSKACPLFV
PLAEEGFIEHDATALIAEHYLAPLRTHNIDTLVLGCTHYPILRNVIERTI
GSNIRIIDSAEAVALQAGEMLRERNLLNQSPDKKTPHLLVSDLPQKFSTL
YQLFMGSELPDVELVEV
>Cag_0319 conserved hypothetical protein
MVELQEKNGSVCIAVRAQPRSSKSMVSGEWNGALKVHLQSPPVDDAANEE
CCRLLARLFQVPPSRVHLVAGHSSRNKRVMVEGVSAAMATELLQPFLHT
>Cag_0984 hypothetical protein
MLCIPASYVFQMPTVLIVSASPLDQDRLRLNAEFRDIRHALQRSRNREDW
VIESNEAVTVDDLRRALLDFRPTIVHFSGHGGGLDGLCFESTEGRTNSAD
AESLAKLFHHFKDDLKCVVLNACYSKVQGDVIRQEIDYVVGMSSAVEDKS
AANFAVAFYDAVFAGTDFRTAFDLGCTALDLNNLPDADVPIFMTGSHLKP
TDLHDSAYIAEIEKVLYSYINTPFSERWRYTTTGELLRAVMEKHYAGNMH
RLVPKVSVISMKQIADEHWVVAVDVVSSLMYMRIKNRSVSVEWEASVGLW
SVPVKTYLALGSREPLLARVNAELDTYYNFEFANEQHRFQSVSLCAVSGP
MLHGYVERGSKVYEELIDILSDGNEHAITLEIEQATDHTDMPLIKRVLSR
TWICSQPQDTKNA
>Cag_0815 thiamine-phosphate pyrophosphorylase, putative
MTSAPSMSSPLPRLFIISSGTERVHSTTFQIATPPPTLLLQQLALIPPHL
RCMVQLREKALNHAQLLHLAKLAKGANSAGNVRFVVNGSLEIARLAELDG
VHLPEGSALLTTMRQAAPTMLLGCSVHSLEAAQYAAENGADYLYFSPIFD
TPSKRQYGAPQGLQALETVCKRLSLPVYALGGITVERIAQCRNCGAYGIA
ALSLFQALDKLPHLLEQCNNLLES
>Cag_0343 photosystem P840 reaction center, large subunit
MAEQVNPAGVKPKGTVPPPKGNAPSPKGNGAAGAPSVIKEQDAAKMRRFL
FQRTETRSTKWYQIFDTDRVDDEQVVGAHLALLGFLGYLMAIYYISGVQV
FPWGAPGFQDNWFYLTIKPRMVSLGIDTYSTKTEDLQWAASNLLGWALFH
IISGSILIFGGWRHWTHNLTNPFTGRAGNFRDFRFLGKFGDVVFSGTSAK
SYKDALGPHTVYMSLLFLGWGFVMWFVLGFAPIPDFQTINSETFMSFIFA
VIFFAAGIYWWNNPPNAATHLNDDMKAAFSVHLTAIGYINIALGCIAFVA
FQQPSFADYYKALDSLVFYIYGEPFNRVSYDYVEAGGRIVSGSKEFADFP
AYAILPKDGAAFGMSRTVINLIVFNHIICGVLYVFAGVYHGGQYLLKIQL
NGLYSQIKSVFITKGRDQEVQVKILGTVMALCFATMLSVYAVIVWNTICE
LNLFGTNIMMSFYWLKPLPMFQWMFNDPSINDWVMVHVITAGSLFSLIAL
VRIAFFSHTSPLWDDLGLKKNSYSFPCLGPVYGGTCGVSIQDQLWFAMLW
GIKGLSAVCWYIDGAWIASMMYGVPAADAKAWDSVAGLQHHYTSGIFYYF
WTETVTIFSSPHLSTILMIGHLVWFISFAVWFEDRGSRLEGADIQTRTIR
WLGKKFLNRDVTFRFPVLTISDSKMAGTFLYFGGVFMLVFLFIGNGFYQT
DSPLPPQVGDASVSGQMMLTQVVDYLLKLIA
>Cag_0990 ABC transporter efflux protein
MLFREIIQQALLSLSVNKLRSGLTIMGVAVGVFSIIAVMTALDAIDASIE
SGLSSLGASTFQMQKNPPTTLGRAHGRNLYANRRDISWQEAQRFKKEMEG
KSKTVGLVIQSKAHQANYGNLSTNPDVSLVGGDEHFALANGYTIAEGRNV
VESDIRSQRNVVVLGSEVAATLFPAGNSALQKKIRLNGEVLTVIGVFAKK
GSAFGQSQDNMALLPITRYLSHINEKSSIAITVEATSQADYARTMDRAIG
AMRLARGLTIQQSNDFEILSNESLLDSFRDIKQAVTTGAFLISMTALLTA
GVGIMNIMLVSVTERTREIGIRKSVGAPRTSILRQFLLEALFLSLAGGAI
GIVTGLGAGNLVALQFNLPPLFPWLWIMIALVVCSTVGIAFGIFPAWKAA
TLNPVDALRRR
>Cag_1740 aminopeptidase P
MDSLTLQLQHYRQSSYQHIVQKMVNLALDAFIVTELPIIRWLTGFSGSSA
RLLITREKVWLFTDFRYQEQVRHEVTLAETVIVAEGFIAELLLGNYPCGT
TIALQAEHITWQEANRLRDKVFHAQQVMPIEGFFNEFRIIKQAVELDYMQ
RAAALSEAALEAVLPMISPNVTELDIAAELSYQQKKRGASGDSFSPIVAS
GARAAMPHATPTNAHFVQGELILLDFGCMYEGYASDQTRTVALGKPSKQA
STIYNIVRKAQQLGLERAQCGMKARKLDEVVRRFITKHGYGEQFGHALGH
GIGLEVHEEPRISSRSETILQEMMLFTIEPGIYLPNCCGVRIEDTVVMGT
QGAMPLQQFSKELIVL
>Cag_1125 propionate--CoA ligase
MSLSFDALYRQSIEQPEAFWEEAATKLHWFRKWDKVLDASNPPFYRWFAG
GTTNTCYNALDRHVDEGRGNQLAVIYDSPVTGTKQRYTYREFRDIVALFA
GALKSRGVHKGDRVIIYMPMIPEAVVAMLACARIGAIHSVVFGGFASHEL
AIRIDDCKPKVIISASCGIEHNKVIDYKRLLDFAIELAHFKPEICIIKQR
EQLRAELNEERDLTWQQSLLGAEPVPCVPVESSDPLYILYTSGTTGKPKG
IVRDNGGHMVAMQWSMKHVYNVEPGEVFWAASDVGWVVGHSYIVYAPLLQ
GCTTLLFEGKPVGTPDPGTFWRIISEYNVSVLFTAPTAFRAIKKEDPNGN
YIRQYSFPNFRALFLAGERADPDTVRWAEEKLKVPVVDHWWQTETGWAIA
ANCQGMEPGPTKYGSASRAVPGYNVQVVNAEMEQLPAGQMGDIVVKLPLP
PGTMLTLWKADTRFVETYMKTYPGYYQTSDAGYIDEDNYLYIMSRTDDVI
NVAGHRLSTGAIEGALCEHPDVAESAVIGVADDLKGEVPLGFLVLKNNVD
TPHSQIIKHVIEYVRENIGPVASFKHAVIVNRLPKTRSGKILRGTMKKIA
NSEEYNMPATIDDPTILSEITAALKTIGYANTSHPPTIAS
>Cag_1326 release factors family protein
MLQITDMLAIPDEEIELQTMRSQGAGGQNVNKVETAVHLRFSILASSLPN
DLKQKLLSRNDHRLTKEGIIIIKAQRYRSQEKNRADALLRLQALLAKAAE
PEKPRKRTRPTKSSQLKRREEKLKRSEVKALRGKIET
>Cag_0756 conserved hypothetical protein
MHTLNLKPKEHYRLQKGHLWVFSNELAQIPRDIASGETIKLLSHDGKFLG
IGFFNPHSLIAVRLLTRRDEAIDHAFFKRKFAEAIALRTKLYSKEVTNAM
RLVHGEADGLPGLVVDRFNHAIVVQTFSAGMEIHLPLICNVLQELLEPRV
IIVRNESPLRELEGLTLYKEVVRGDAAEAIQQIYDYGVNYRVDLLEGQKT
GFFLDQRENRRMVRAFAAGASVLDVFTNDGGFALNALAAGASSAMLVDAS
KEALVRADYNGQLNKFSNYSLVAADAFDTLETMVEAKESFGLVVLDPPGF
TKSRKNLPGALKAYKRLNKLGLQLVQSGGFLATASCSHHVSEEDFLGVIQ
QAALAAGCNIRLIHKNTQPFDHPVLLAMPETSYLKFACFYVTR
>Cag_0606 conserved hypothetical protein
MPLQTQKNSNPAVQLAVLALLIGVSFFATLGATPLFDVDEGAFSEATREM
LISKNYLTTYLNGAPRFDKPILIYWLQLLSVQILGINEVAFRLPSAIASA
VWALLLFFFVKKESDSQQALIASGLLVLSLQVSVIAKAAIADALLNCLLA
LSMFAVMRYYKNNSKTALLTAFAAIGLGTLTKGPIAIIIPLAVTFLFSVL
EGTLKKWFRMVFYLPGIALFCVIALPWYLLEYQDQGMAFIEGFFFKHNIS
RFNTSLEGHSGSLFYYFPVLIVGMMPFTALLFATMWRLPKLLSTPTNRFL
LIWFGFVFIFFSLSGTKLPHYMIYGYTPLFIVMARIVPQLKHPTRLVILP
ALLLVLLAATPMLAEQALPMFDDLYIQSLLIALAQESGMECTIVSLTTLA
ALIIMQLPRKLSIEAKILTTGLFFCLYMNAYLAPLVGSVLQQPIKEAALL
AKQQGYKVVMWKTYNPSFLVYSESLVEKRQPQAGDVVLTTVKHIAKLNVT
TLLYSKHGIVLAKVPY
>Cag_0160 TPR repeat
MNILLNKLVIIFEFVVGVTTLLLGADFLGVEWKQYPFLQRLDGQQPLLLL
LTFCSLLSILVIKLKNNSPAFEAVQTKDGTTIVKATLSGFEQELDDAADE
IKKKAQEYFEAGERDFANEQYAQAASNYKASFELLPTMAASLNAGIAFSI
TSDYQAAEELYQQGLRIAQHKRDKEYEAAFLGNIGVGYRSSEKFADALDY
QQKAFKLNQTIRNQRGQANQLCNIGLIYNDKGLLDAALGYFKQSLELYET
IGDTSGQAQNLRSIGIIYRRKGKLNEALSCHQQALNIDKANKNASGEAEN
LNNIGIIYKGKKEFDKALNCCYQALDICKRIGYKFGEGLALSAIGVIYYS
KGELDNALDYCNQALKLYKSIDSHLIAQAENLNNIGLIYQDKGQLDQALI
YLGQSYLLFQKTGVTLQLKKTEANIQEVKRLQAAKGKN
>Cag_0935 Ribonuclease BN
MLMKKVPHIIHRANQSGGKRYERMVAFVAFFRTNLLHDRIFISAGSLAFQ
TLLSIVPVLAVVLSVLNLFEVFTPFQHSLETFLVENFMPATGRLLHGYLL
EFVGKTGNIPLLGSLLLFVIALSLLSTVDQTLNDIWGIRAPRKALQGFTL
YWTVLTLGPLLIVSSLAASSYVWYTIFTDEGALFELKTRLLALFPFINSI
VAFFLLYMLVPKRRVRIAHAFAGALVASLLLELSKRWFLFYVTHVATFEH
IYGALSVVPMLFFWVYLAWVVVLVGAEFVYCLGAFAPSTPTAEAQGSLRY
LSLPLAVLATLHNAIEKGAPLSLKSLSRELSTLPFNLLRDMVDLLLDRQV
LHLSTRGELALSRNLHAMSLYELYQIIPQPINATEKSLLFSESKSVHLAP
LSLEVEACLQERMATPIAELLQHSFLKASTVD
>Cag_0517 UDP-glucose 4-epimerase
MKILVIGGAGYIGSHVARAFLDKGYEVTVFDNLSTGMRENLFAEARFVHG
DILHPAQLHAVMAEGFDGCIYLAALKAAGQSMLHPDAYAEANIGGAINIL
NQAAATGLGTIIFSSSAAVYGSPNYLPIDEAHPTAPENFYGYTKLAIEQL
LAWYDKLKNIRYAAIRYFNAAGYDPDGRVKGLELNPENLLPIVMEVAAGI
RPKLNIYGNDYITRDGSCIRDYVHVSDLATAHVSAFEYIQRTKQSLTVNL
GSEQGVSVLEMVERARAITGRPIPADIVERRAGDPANLVASSSKARELLG
WVPQYSDVDTLIASTWQMYQRFVKK
>Cag_1117 conserved hypothetical protein
MTIKELVPLLQTAIGPMILVSGLGLLLLSMTNRLGRIIDRSRTLLGCIEA
SAEPQVHRINREVAILWQRAHYIRLSILLACVACFGASMLILLLFLSALL
MLEVSLVLATIFVLTMLCLSCSLLFFFLEVNMTLSALKIEMEHYDKKHQL
MESMEWGR
>Cag_1398 conserved hypothetical protein
MLVSYFTPQIAAASAVHLPPVWLVAPFFLLLLAIASGPLLFHRFWERRYP
IIASFAGAIVAIYYALFMDHGYQQLWHALEEYFSFIILIGSLFVATGGIL
VTIDRRSTPLLNGGLLFFGAIIANLVGTTGASMLLIRPYLRMNEGRVKAF
HLVFFIIIVSNIGGGLTPIGDPPLFLGFLKGVPFFWVMEHLFLPWVIAIS
FLLALFIFLDSRVEKARGTIELKSGKITIQGRRNVIVLAVMIVAVFLDPV
ILPWVPDIRHIFHLPFGLREVILLALAIVSYMLANKKALKGNEFSFEPVR
EVAFLFLGIFFTMIPALQLAGYYAATHAESLGVSHFYWFTGVLSGVLDNA
PTYLNFLAGAMGKFGLEISSPMDLKHFAEGIPSPIVGDMPSHLYLMAISL
AAVFFGAMTYIGNAPNFMVKNIAAQSGVETPDFVEYVVKYAIPILLPLFI
VLWLLFFHYEI
>Cag_1737 hypothetical protein
MFINKDLLEFFGSMMEIKKQKRDIFNALAADVDDPEIRNTLLRIGADEQR
HVDQIQQSINLVNSGSTAEPMVPEAAPAPQVAPAPAPPAPTIAPATLQPA
IAIAQPAIVRPEPPQPVAPMPVPEPVAAPPTYVQPVQPIAQPITQQVVVS
EPVAPTVSQLQQPAPPQPLTYLTPTIATPAAPAEPAYSEPAEQSISSFAS
PLSSGTQRYPVQPPTSKTFENMTTLHHPLGEVFGFAATDQSPKAQRYRSH
RHCPFNNKSPNCTNSHTENPLGVCSILHNNKAIITCPIRFREDWLITDDA
ASFFFEPGVRWSSLTDVRLADANGTSAGNMDVMLVAYDKEGKIIDFGAIQ
IQTAHIDGNVREPFECYMKDPKTNAMMDWTRQPNYPEPDFLSAMRTSVVP
ELLYKGGILHSWNKKMAIAINKSMFETLPPLTRVKKDEADIAWLLYELEA
VNDGEKEAYQLKKSEVVYTAFQPTLLALTAIAPGNVNDFMKFIPELGA
>Cag_1580 transposase
MKDTVLFQQALCLPAPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HNLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEYDSRIWRIIHHYLDEVLEQQ
DLSSVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPEAHITFDKFHIVQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGGKF
EFALPALTHSK
>Cag_0646 conserved hypothetical protein F56H9.1
MKKKIISIGTIAATCVASPIFAVSNFSDNTASQGAVVAAQAATNALTVLN
FNFTPPVVAPPIVPVVPVITTLPPVQLPGGFTVTTSVQPPVNGVSTSTAV
TTNNINNTTVSTTVTTTAPTPSGGTTTTAITTDVNGATTTTTTVRDATGN
VLSTETN
>Cag_0568 rod shape-determining protein RodA
MLSNKRQKLDLWYVASIAGLILMGFMAVFSATYGSGDSTLFYRQVAWGGI
GLIIMAFMYFNDARVIRDNAYIFYAIGLVLLVAVLIFGKKIAGQTSWMRI
GFFSFQPSEIVKLTTIFGLARFLSDDNTDITNIPHLVMAFAIAFVPVLLI
MLQPDMGTMLTIMPFIAVMLIMAGFDLYILILLAFPIVLMISGFFNVWVV
VALAVVLLIALIMQRQKVQLHQFLVIGGGLAAGLFMHRFASEILKPHQLK
RIQTFIDPMSDPQGAGYNALQAKIAITSGGLFGKGFLEGTQTQLRFIPAQ
WTDFIFCVIAEELGFIGAALLLSFYLIFILKLIATIFAIHNKFIQLTLSG
FVSLIFIHVLINVGMTIGLIPVIGVPLPFVSYGGTSLVGNMIMAGLALNY
ARNKRSLGYYAREEVIRKSAP
>Cag_1020 YjeF-related protein-like
MQPVLTAQEMQAADRAAIETLHISEARLMELAGRECLRLILDMLERKKLD
GCGFLILCGKGNNGGDGFVLARHLLNYGAAVDVVLLYPPSILQGVNREGF
ATLQAYEAEQAPLRIFEGIEEALPFVEENHYTMLIDAMTGTGLRLARRGM
ELAPPLSDGIELLNRMRHESNATTLAIDIPSGLEATTGFAAQPVVEADVT
VTMAFLKRGFLLNDGPECAGDVKVAEISIPTFLTESASCRLIDQEFAAEH
FLLREPSSAKQHNGKVLMIVGSQSAQHSMLGAAILAAKAAIKSGIGYLCC
SLPQELVGAMHLAVPEAVLIGRDVDVLTEKIAWADSVLIGCGLGRNAEAL
ELVEMLLQSETLQSKKLILDADALFALSTLDAITALQKCNHVLLTPHYGE
LSRLCNIPIADIAANPIEIAHECACNFSATMLLKGNPTVIANGKYPILLN
NSGTEALATAGSGDVLAGLIASLAAKGATLPHAAAAATWFHGRAGDLAHD
VASLVTATMVADAIAQAIGEVFEVE
>Cag_0614 Parallel beta-helix repeat
MKPRFYIEQLEPRILLSGDILSELVPLLSSREASQMQSDYLLEHPEARRV
APLSAVEAARACMVVVQSEAPSLLTEDGLMYPFEVGVGEERSSEANAEPT
LAADFSADYTFSKSEWDALEDGWRNLSSMVGDTLLDENLVAVESLLSGGS
RLYGGDELAALLQQPIDEYGSVFAQSSKGVLEALTQEWRNGDLVVVGKVL
GGYNQSTNEVRFDLSLQTVQHGRTFIDGKEVQVEGMSVTLSGDSSFSIRA
ELEVSFGINLLTNSFFADEVEGLVEFTVEASELEATVTYAAPTGEVVSDG
DGNLRLEASFMVASSVEHVVADNAVPALTATPQASGMELTLSFAGDEVHQ
GIEGLTIIDADLFTDNALEVTLQGGHLHPWESGLTAANLYVGTGDVLAGS
GTFAGDLYNAGGIVAPGNSPGRESVSTFTQLAGGTLLIELQGKSTAGVDY
DWLDIAGAANFGGTLQVELLNGYKPTVGDTFDIITFGGSASGIFTNLSGL
YGFDSDHYFDVVQSANKIQLVTKEIIAGDTFSFATDALGSAYNSQLGMLL
NASYLTSAAPTSVSLSGDLNLGDGFQLGGSFTFAKETVPSTITLSDSSTV
SATSLKISGENLHALFGTPAEGAGVSFSDVDFAFARFTPVSTSDSRSWIV
TKGSVGTDGDASFVNLGDLSITAGTISFDISQGLGAGNTTVANLSSSPIT
LGSVTLNSNGSRGEYFDVAASGFGFSVADTVAVTGDFLFSSDGARLAAVG
SHVSAHFGTAEMYVGVTDATVALMSSKTQGTVLQALGGFAASLGSDISLS
ATSSSILWKEAGTTVLTDVNKTLTIGSSSFTFSQELVDAITLQEVRVSGA
ELRVGNFVAASGDLAFHKTTTNVYVSGTTTAVHADVLTLGASGLSIFVGT
NGADEANRMGLSVSGADFALAIIREQAGAARQWTALKGSAESAAALGMPD
VTLSGSTISLELNLQASDASVLDFSTQSLAVATGVGSSVTLDFDEALVRA
SGEFDIAIADFLELSGSMAFEKQSQQLAVQAADGTSSNILMDLLTLGGAG
IEAFVGMNGGAAEQVGLVLAGTRFALLMAQEKAVAKRKWSALKADADSAA
LVGIENLTLAGSELSVAINQSNSDGSLLDFSTTSYAVPVGPSETLALDFN
ADAGELLQASGNLEIDLFGFVQLSGDVAFQKSISTVTLADEASTSVAVEM
VTFGGHDLNAFAGINGGSEEAIGLELGGVEFGLALMTSTSDAARTWTSLQ
ASAESAAFVGVEASGGLTLSGDTLSVVINQSSLDGDSVVDYSAGKTDLTI
ATGTSTSMQFAMEGSQGEMLAASGNLSLDVFGFFQAEGGFAIEKRTDTLL
LSDATDTTPASQIAVDLLTIGGSGINAFAGLNGGSDEALGLALGDVNFGL
MLATEQGGAQRQFTSLKADAGSIEFVGLDGFVASGTNVVVEINRGVAGVG
EAAAVVADHGAMPLLVTTSPDSSIELDMDGAKGEITRASGAIDLNLYNFF
SLHGELAFEQSTSSITLANNPDTTDVNEAASPVTVNLLTIGGKNVSAFVG
MNGARDSEGALSDDAFGIDLDTATFGVAVMTEKGGAARSWSSVQATASGL
SFVGIDGLTVAGSELSVAINQSAADGSVVDYSSGKTEMTIATGSDDTSTL
SLNMDGSKGDTIQASGRLDINLFDFFTVEGYFAFEKSRGAVTLSDGDVIE
QADLLTLGGNDVSAFVGINGGSADELGLELGTADFALALITDSADATRKF
TSLQASAALASFVGVDGLKVEAKDLAVNINKGITLPATPEVITKVNTILS
LELPASLIGKLTLSKGSDTADVALNGKQSSEEIIALLTAAFASLEGIGAD
NVQVSGNSIDGYKVEFVGELAGVDVTGITVNATAAPITTSVNTVSEAQNG
VTEVKQIVVESLVGEQVPVTVDVSQVTQGVAGQSEINSIIFTNPSTSGSY
SVFLSANGTVTNGSSAVAGVNGVQRLTLDALGGEPSSATATITEEVAASS
ISTSEILAINFTVPNNNSGKYTLSTATRSVDINFVGNDVTNNARYLREGL
AKLLKTSEINISVSFDKTFYEDGGTLHTNIGHSYNIYFRGALATTDIPTI
SVNKGTVSGDVILTAKQQGGPARSETQRVSLETSGEGTFILSLLYEGKTY
QTKSLAFGATADKVQVALNAALMNISGTTKVTLDAATGDYLVTFGGNLQN
KNINMLSATLQPASTAPEGSFTLLLGGVTSSAITYSSNATTMASRLQSAL
AAMSNVASGNVTVAVDAAHSVGSATAFSITFKGALAGSNVEVLAINDGAL
SGVDATMQTVTNGVASVGETQRILVGSHPQSVGYTLALEYNGRTYNSGSI
ASGATQSAVQAALTAGFSTLSGADVQISSWTSATDYTIRFGGSLAGKDVA
LVAIKPNVEPTTAAVTGTGNFVVGNTAQNVANLKAAYATMLSTNQANISV
SYDPTYSGGGERYLVSFVDAFANTDLPDKSFLYSSNAIGYKLIQDGSAPI
AEVQRVAVDKGTSTGTFVLQFTHNSTTVTTSALAFDASAATVQSALNTAL
ANISGATASVALDSDGAYLVTFGGTLLGKNVANLKATNIAVDPILPSGNF
TIELNGLNGQMSSAIAYSTNNATLAASLQSALEGLGDIGAGNVSVSYSAT
ESTSKKSVFTITFKGEKVATNIPDITAHFGDLERATVTPYRITEGQEVTA
EVQRVTLDTIAEEGSFILSLTHGGSTYKTASIALQATKDEVQAAVSAAFA
GLASAEVTVESWTQEELTLSFGGSSLAGQNIAPIVVNATVAPVSAALASV
QAGYTEVQEAEPIRTLVVDYSAGKTDLTVTTGPSSSMKLTMDGSKGELLQ
ASGDLTLDVYGFFMVEGNLALEKSESSVTLNDSEVAADGTVTKPASQVNV
NLLTIGGSGLKAFAGINGAYDEDGELVADAVGLSLTDTSFGLVMAGEQAT
GLEPAGTTMRKWTSLQAEVGGASFEGIEGLKVSVDTLGVEINRVAMDGSL
IDYKAQNVAINTGTEANPSSMSLTMDGSEGALLRATGNLNVDIFGFVQVS
GSFGIEKKSGAVTLADIEATEDVDESLAPVSVDMLLLGASGVDAFVGAGD
VGLALSDVNIGLALLTEQLPTGSTALARKWTSVEAEVGSAGLVGIEASGG
LTAQVEELSVSINRAAVDTSVVDYSLKAGSTTVRKTDLTILTGPLSDMAL
TMDGSRGALLEANGRLVLDVFGFVQAEGEFAIEKASALETITLSNATTTE
AEVLRLGAHDLRAFAGINGGTDDAIGLELTGVDFALALVSEKPATGSTTQ
PRSWTSLQATAESAAFVGVDGLTAQADTIAITVNKASTDGLVVDYSLKTE
GGTERKTAMTVRTGISEESAITFDMDGAEGNLLRASANLELDLFGFFQVS
GGFAIEKKTAEVVLNDGVVSEDATKAKAPTELSVDLLTIGGSGVDAFAGM
NGGTADAIGLQLSDVEFGLALMTEQVEEGSTAAARKFTTLKANAGEISFV
GVDGITASATDLSVEINRGIAGTAGNPDVVVDFGYRQLEVLSSPDSTIVL
DSDGSLGQLMRASGTLDFNLYNFVSLNGNFAIESSSKEIHLAGADNATAG
EVVQANMLAIGGSEVNAFVGINGGTDDAIGLQLAKAEFGLALLSDKDDAT
RSWTTLEANAEELSFVGIEGLTASAKDITISINQAGKLNDKVVDYVGTGA
TATDLTIKTGQSSDLKLSQEGSEGETLKAAGNLDIDMFGFFSVKGGFAVE
QRSQEVTLSDGTVIKNADLITIGANEVDAFAGVNGGFDDKTGELNGDAMG
VSLGDVNFALALISDPSDKARSFTSLQATAADVGVEGIEGLTMQVNEMLV
NINHGITVQAEPAKTIKVNTQLKLNVPVDLIGTLTFNRTAGTGYAADSAV
VNVTANMTNDALITALTTGIESFDGIGAGNVQVTGNRYDGYVIEFIGTLS
GINIDDITVSAAGAGVTYGVTTTTAANAGVNEVKELTVQALREAPAPVAI
TIGTESDGRAGVNEANEIIFTTPKSAGTYTVYFVTDGLVQQTTAGVTGVS
EVQRLSLTGDTTAAGGSGSVTVTTVTEGSGSAVVNERYVETFTKEFGRQG
FKLFFVDNPKLSVTWDYTNYAEDTSATIGDLKSAYAELLNGYQSKTVTVS
DIQVSIDNSYKGSGHRYNVEFVGALAGVDVKAIGMRSEAGSFSHVNKQDG
ISGTSEVQKVVLNATGSGYFTLSLTYNSKSFTTDGIAFGASAATVRYALN
AALGRDGSVEVSSPAKGEYLISFGGKLAGQNISALTGSTLSEAPSGNFTL
SFGGQTTRSISYTTDGSALASRVQTELARLSNIGSGNVKVSYNASQSNDA
LLGLDIRFTGTLANQNVNAITVDGSNLANAGGSVRTITQGVANINQVQTI
TLGTDAVAKGYRLSLSYLGETYTTNLIAGNASATAIQSAINSAFGVISGA
SFSVSKSGTQVQLTVGGSLSGQSLNLINLQAEGATASGSQVTKNFVVSNV
TTNVANLKAAFAELLATDAANISVTYDSKYKSGERYVVSFVGALAGTDVA
NKGISISGTSISWKLLSDGTPAVSEDQTITVDREVTTNGVFRLSLQHNNK
LYTTADIALGATTEAVQTALRAAKASDNSVLSSLGTITVSGTTDNYTVSF
GGALAGTNVATMQQAALEVDQELPTGTFKISYLDTEGVRQYTGNIQYSAD
QTTLKTNIQTALNTLFGANNVVVALDATQSEGRKAVFALTFENGLACQNI
ANITSHFSELDCAVVTPMNLTQGEERTGEVQRISATSDATDIGYTLSLTH
SGKTATSATIESGMSQEEVQAILNTIMTSLNTAVGGGFAATATVDFWSGK
ALEVRFGGSLVGVDVADLVVTNVARTYESAVTQEQEGSTTNIEAKPQRTL
VVDYGFKEGSTTERKTALTVATSSTTSIAMSMEGAKGELLQAAGHLTLDV
YGFFMVEGNLALEKSESSVTLNDGTETTPASQVNVNLLTIGGSGLKAFAG
INGAYDEDGELVEDAVGLSLTDTSFGLVMAGEKATGLEPASTTMRKWTSL
QAEVGGASFEGIEGLTVSVDTLGVEINRAASDKTLIDYKAQNVAINTGTE
ANPSSMSLTMDGSEGALLRATGNLNVDIFGFVQVSGGFGIEKKSGAVTLA
DIEATEDVDESLAPVSVDMLLLGASSVNAFVGAGDVGLALSDVNIGLALL
GEQLSAADVKAGKVARKWTSVEAEVGSAGLVGIDDLTAQVEELSVSINRA
ALDTSVVDYSLKDGSTTVRKTDLTILTGPSSDMALTMDGSRGALLEANGR
LVLDVFGFVQAEGEFAIEKASALETITLSNASTTKAEVLRLGAHDLRAFA
GINGGSDDAIGLELTGVDFALALVSEKPATGSTTQPRSWTSLQATAESAA
FIGVDGLTAQADTIAITVNKASTDGLVVDYSLKTESETERKTAITVRTGI
DEASSITFDMDGTEGNLLRASANLELDMFGFFQVSGGFAFEKKSAEVVLN
DGVVSEDATKAKAPTELSVDLLTIGGSGVDAFAGMNGGTADAIGLQLSDV
EFGLALMTEQVEEGSTAAARKFTTLKANAGEISFVGVDGITASATDLSVE
INRGIAGTAGAADVVVDFSYRQLEVLSGTDSTIVLDSDGSLGQLMRASGT
LDFNLYNFVSLNGNFAIESSSKEIHLVGTDDSTETVQANMLAIGGSEVNA
FVGINGGTDDAIGLQLDKAEFGLALLSSKADATRSWTTLEANAEKLSFVG
IEGLTASADSITISINQAGKLNDKVVDYVGTGATDLTIKTGNTTDLKLSQ
EGSEGETLKASGNLDIDMFGFFSVKGGFAVEQRSQEVTLSDGTVIKNADL
ITIGANEVDAFAGVNGGFDDTTGDLNSDAMGVSLGDVNFALALISDPADK
ARSFTSLQATAAEVGVVGIEGLTMQVNEMLVNINHGITVQAEPAKTIKVN
TQLKLNVPVDLIGTLTFNRASDSAVVKVTAGMTNDALITALTTGIESLDG
IGTGNVQVSGNRYDGYVIEFIATLSGVNINDITVSAAGAGVTYGVTTTTA
ANGGVNEVKELTVQALREAPAPVTITIGTENDGRAGVNEANEIIFTTPKS
AGTYTVYFVTDGLVQQTTGGVTGVSEVQRLSLTGDTTAVGGSGSVTVSTM
TEGSGSAVVNEGYLVTFNENYGYQGFKLFFVADPVLPTTWTYRSSAASTS
DTINNLKGAYADLLDGYQGKEVTVDDIKVTVDTKYKESGYRYKVEFVGSL
AGVNIASIGMRAETGKISNVHATHGVSGTSEVQKVVVSSTGSGYFTLSLT
HNSKTYTTTGIAYGSNAATVRYALNAALGRDGSVEVSTPSKGEYLISFGG
KLAGANVAALTGSTLSEAPSGNFTLSFGGQTTRSISYTTDGTTLASRVQS
ALKALSTIGSGNVQVNYNAGQSNDGLIGLDIRFTGMLANQNVNAITLTPS
LSNASATIRTVTSGVANINQVQTISLGTDAVAKGYRLSLSYLGETYTTNL
IAGNASATAIQSAITSAFGVISGASFSVSKSGTQVQLTVGGSLSGQSLNL
VNLQAEGATASSSQVMKSFVVSNVSANIANLKAAFTDLLKTDAANISVTY
DSKYKSGERYVVSFVGALAGTDVANKGISISGTSISWKLLRDGTPAFSEE
QTITVDRASDTDGVFRLSLQHNTKLYTTGDIALGADAAIVQSALRAAKAS
DNSVLSSLGTITVSGTTDNYTVSFGGALAGTNVAALQQAALEVDQELPSG
TFQISYLDAEGVRQYSSDITYSSNQTTLKSNIQTALNTLFGTGNVTVTLD
ATQSEGRKAVFALSFKNGLAYQNIANITSHFSELDSAVVTPINLTQGEER
TGEVQRISATSDATDIGYTLSLTHSSKTATSATIESGMSQEEVQTILNTM
MTSLNTAVGGGFAATATVDFWSGKALAVRFGGSLVGVDVAALAVTNVART
YASTITQEQEGSTTNIEAKPQRTLVVDYSLKEGSTTERKTALSVSTSSTT
SMKLTMDGAKGELLQASGDLTLDVYGFFMVEGKLALEKSESSVTLNDSVV
DAEGKVTKPASQVNVNLLTIGGSGLKAFAGINGAYDKDGKLVDDAVGLSL
TDTSFGLVMAGEKATGLESAGTVMRKWTSLQAEVGGVEFVGVDGLTVSVD
TLGVEINRAASDKTLIDYKAQNVAINTGTEANPSSMSLTMDGSEGALLRA
TGNLNVDIFGFVQVSGGFGIEKKSGAVTLANITSTPANESLTPVNVDMLL
IGASGVNAFVGAGDVGLALSDVNIGLAMLGEQLSEADVKAGKVARKWTSV
EAEVGSAGLVGIDDLTAQVEELSVSINRAAVDTSVVDYSLKDGSETDRVT
DLTILTGPSSDMALTMDGSRGALLEANGRLVLDVFGFVQAEGEFAIEKAS
ALQTIALSNATTVQAEVLRLGAHDLRAFAGINGGTDDAIGLELTGVDFAL
ALVSEKPATGSTTQPRSWTSLQATAESAAFVGVDGLTAQADTIAITVNKA
STDGLVVDYSLKAESETERKTAITVRTGTDEASSITFDMDGAEGNLLRAS
ANLELDMFGFFQVSGGFAIEKKTAEVVLNDGVVSEDAKKAKAPTELSVDL
LTIGGSGVDAFAGMNGGTADAIGLQLSDVEFGLALMTEQVEEGSTAAARK
FTTLKANAGEISFVGVDGITASATDLSVEINRGIAGTAGAADVVVDFGYR
QLEVLSGPESTIVLDSDGSLGQLMRASGTLDFNLYNFVSLNGNFAIESSS
KEIHLVGTDDSTENVQANMLAIGVSEVNAFVGINGGTEDAIGLQLDKAEF
GLALLSSKSDTTRSWTTLEANAEELSFVGIEGLEASAKNITISINQAGKV
DDKVVDYVGTGATDLTIKTGNTTDLKLSQEGSEGETLKASGNLDIDMFGF
FSVKGGFAVEQRSQEVTLSDGTVIKNADLITIGANKVDAFAGVNGGYDDE
SGELSDNAMGVSLGDVNFALALISDPADKTRSFTSLQATAAEVGVEGIEG
LTMQVNEMLVNINHGITVAAEPAKTIKVNTQLKLNVPVDLIGTLTFHRAA
DNAVVNVTAGMTNEALITALTTGIESLDGIGAGNVQVTGNRYDGYVIEFI
ATLSGVNINDITVSAAGAGVTYGVTTTTAANGGVNEVKELTVQALREAPA
PVTITIGTENDGRAGVNEANEIIFTTPKSAGTYTVYFVTDGLVTETTQAV
TGVSEVQRLSLTGDTTAAGGSGSVTVSTITDGSGSAVVNERYVETFTKEF
GRQGFKLFFVDNPKLSVTWDYTNYAEDTSATIGDLKAAYAELLDGYQGKM
VTVSDIQVSVDNSYKGSGHRYIVEFVGALAGVDVKAIGMRSEAGSFSHVN
KQDGISGTSEVQKVVVSSTGAGNFTLSLTHNSKTYTTTGIAYGSNAATVR
YALNAALGRDGSVEVSTPSKGEYLISFGGKLAGANVAALTGSTLSEAPSG
NFTLSFGGETTGAISYTTDGTTLASRVQSALKALSTIGSGNVQVSYNAAQ
SNDALIGLDIRFTGTLANQNVNAITLTPSLSNASATIRNITQGVANINQV
QTISLGTDSVAKGYRLSLTYLGETYTTNLIAGNASATAIQTAVNSAFGVI
SGASFSVSKSGTQVQLTVGGSLSGQSLNLVNLQAEGATASGSQVMKSFVV
SNVSANIANLKAAFTDLLKTDAANISVTYDSKYKSGERYVVSFVGALAGT
DVANKGISISGTSISWKLLSDGTPAVSEEQTITVDRVSDTNGVFRLSLQH
NTKLYTTGDIALGATTEAVQTALRAAKASDNSLLSSLGTITVSGTTDNYM
VSFGGALTGMNVAALQQAALEVDQELPSGTFQISYLDAEGVRQYSSDITY
NADQAILKTNIQTALNGLFGANNVIVSLDATQSEGRKAVFALSFKNDLAY
QNIANITAHFGELDRAVVTPMNLTQGEEQTGEVQRISATSNATDIGYTLS
LTHSGKTATSATIESGMSQEEVQAILNTMIVNLDSNAKATVDFWSGKELE
VRFGGSLVGVDVAALAVTNVARSYESAVTQEQEGSTTEIAAKPQRTLVVD
YSTGKTELTVVTGYDDVNKKNTFITLAMDGSKGELLQASGDLTLDVYGFF
MVEGNLALEKSESSVTLNDGTETTPASQVNVNLLTIGGSGLKAFAGINGA
YDAKGELVDDAVGLSLTDTSFGLVMAGEKATGLEAAGTTMRKWTSLQAEV
GGASFEGIEGLKVSVDTLGVEINRAASDKTLIDYKAQTVAINTGTEANPS
SMSLTMDGSEGALLRATGNLNVDIFGFVQVSGGFGIEKKSGAVTLADVTT
TTTIHEDASPVNVDMLLLGGSGLDAFVGAGDVGLALSDVNIGLALLTEQL
PTGSTVVARKWTSVEAEVGSAGLVGIDDLTAQVEELSVSINRAAVDTSVV
DYSLKAGSTTVRKTDLTILTGPSSDMSLTMDGSRGALMQANGRLALDVFG
FVQAEGEFAIEKASALETITLSNAATTEAEVLRLGAHDLRAFAGINGGTD
DAIGLELTGVDFALALVSEKPATGSATQPRSWTSLQATAESAAFVGVDGL
TAQADTIAITVNKASTDGLVVDYSLKADSETERKTGLTVRTGIDEASSIT
FDMDGTEGNLLRASANLELDMFGFFQVSGGFAFEKKSAEVVLNDGVVSED
ATKAKAPKELSVDLLTIGGSGVDAFAGMNGGTVDAIGLQLSDVEFGLALM
TEQVEEGSTAAARKFTTLKANAGEISFVGVDGITASATDLSVEINRGIAG
TAGAADVVVDFGYRQLEVLSSPDSTIVLDSDGSLGQLMRASGTLDFNLYN
FVSLNGNFAIESSSKEIHLAGADKDTAGEVVQANMLAIGGSEVNAFVGIN
GGTDDAIGLQLAKAEFGLALLSDKDDATRSWTTLEANAEELSFVGIEGLT
ASAKNITISINQAGKLNDKVVDYKGTGATDLTIKTGNTTDLKLSQEGSEG
ETLKAAGNLDIDMFGFFSVKGGFAVEQRSQEVTLSDGTVIKNADLITIGA
NEVDAFAGVNGGFDDKTGELNGNAMGVSLGDVNFALALISDPSDKTRSFT
SLQATAADVGVEGIEGLTMQVNEMLVNINHGITVQAEPAKTIKVNTQLKL
NVPVDLIGTLTFNRTAGTGYAADRAEVSITANMTNAELITALTTGIESLD
GIGAGNVQVTGNRYDGYVIEFIGTLSGVNINDITVSAAGAGVTYGVTTTT
AANGGVNEVKELTVQALREAPAPVTITIGTENDGIAGVNEANEIIFTTPK
SAGTYTVYFVTDGLVQQTISGVTGVNEVQRLSLTGDTTAAGGSGSVTVST
ITEGSGSAVVNERYVETFTKEFGRQGFKLFFVDNPKLSVTWDYTNYAEST
SSTIGDLKSAYAELLNNYQGKTVTTSDIQVNVDTSYKGSGHRYIVEFVGV
LAGVDVKAIGMRSEAGSFSHLNKQDGISGTSEAQKVVVNATGSGYFTLSL
THNSKTYTTDGIAFGSSATTVRYALNAALGRDGSVEVSTPSKGEYLISFG
GKLAGSNVATLSGEILSEAPSGNFTLSFGGQTTRSISYTTDGTTLANRVQ
TELARLTNIGSGNVKVSYNAGQSNDALIGLDIRFTGMLANQNVNAITLTP
SLSNASATIRTVTSGGANINQVQEITLGTDAVAKGYRLSLTYLGETYTTN
FIAGNASAIAIQSAINSAFGVISGASFSVSKSGTQVQLTVGGSLSGQSLN
LVNLQAEGATASSSQVMKSFVVSNVTNNVANLKAAFADLLKTDAANISVT
YDSKYKSGERYVVSFVGELAGTDVANKGISISGTSISWKLLSDGTPAVSE
EQTITVDRASDTNGVFRLSLQHNNKLYTTGDIALGADAATVQSALRAAKA
SDNSVLSSVGTITVSGTTDNYTVSFGGALAGTNVATMQQAALEVDQELPT
GTFQISYLDAAGVRQYSSDIMYSSNQTTLKSNIQTALNTLFGAGNVTVTL
DATQSEGRKAVFALTFQNTLAYQNIANITSHFGKLDRAVVTPMNFMQGEE
RTGKVQRISATSDATDIGYTLSLTHSGKTATSATIESGMSKEEVQTILNT
MMTSLNTAVGSGFAATATVEFWSGKELEVRFGGSLVGVDVAALVVTNVAR
SYASAITQEQEGSTTNIEAKPQRTLVVDYGFTKDAEGKPTTTRATVLNVA
TSGSTSIKMSMEGSKGELLQAAGHLTLDVYGFFVVEGDLALEKSESSVTL
NDSEVAADGTVTKPASQVNVNLLTIGGSGLKAFAGINGAYDEDGELVADA
VGLSLTDTSFGLVMAGEKATGLEAAGTTMRKWTSLQAEVGSVEFVGIKDL
KIAATDLQVEINKAAKTTDGKSTVIDYAANPFEVITGKDKSITLSMDGRE
GDLISARGELEIDLFGFFQVSSGFAFEKKTETVQIRTGDTVTATEVNVLT
IGARGVNAFAGLNGGTEDEIGLKLKTTDEASTDFALVLASEKPAVSATPG
APVTPVRKWTSLQAEVGSVEFVGVKDLTIAATDLKVEINKAYVNPTTKVT
SIIDYAYKDTEGNSGLEVATGPDSSITLSMDGSEGDLISARGELEIDLFG
FFQVSSGFAFEKKTETVQIRTGDTVTATEVNVLTIGARGVNAFAGLNGGT
EDEIGLKLKTTDEASTDFALVLASEKPAVSATPGAPVTPVRKWTSLQAEV
GSVEFVGVKDLTIAATDLKVEINKAYINPTTKVTSIIDYAYKDTEGNSGL
EVATGPDSSITLSMDGREGDLISARGELEIDLFGFFQVSSGFAFEKKTET
VQIRTGDTVTATEVNVLTIGARGVNAFAGLNGGTEDEIGLKLKTTEEAST
DFALVLASEKPAVSATPGAPVTPVRKWTSLQAEVGSVEFVGVKDLTIAAT
DLKVEINKAYINPTTKVTSIIDYAYKDTEGNSGLEVATGPDSSITLSMDG
REGDLISARGELEIDLFGFFQVSSGFAFEKKTETVQIRTGDTVTATEVNV
LTIGARGVNAFAGLNGGTEDEIGLKLKTTEEASTDFALVLASEKPAVSAT
PGAPVTPVRKWTSLQAEVGSVEFVGVKDLTIAATDLKVEINKAYINPTTK
VTSIIDYAYKDTEGNSGLEVATGPDSSITLSMDGREGDLISARGELEIDL
FGFFQVSSGFAFEKKTETVQIRTGTTVTPTDVNVLTIGASNVNAFVGLNG
GTTEELGLKLEKTEFALVLASEKPATPTSTAPLQKWTSLQASVGKVSFVG
VKDLTIAASDLQVQINKAYVNPTTKATSIIDYAASPLDILTGYDAEKGED
TFITLDMDGSKGELLKASGHLTIDVFGFVHVEGDLALTKQTEQVVLAKKL
GQSTGEKVNVNLLAIGGSNIDVFAGVNYGTSDAIGLSLVDIKFGLALMTS
QTTPTRKWTALSASAASVGIAGLDDLLPTIKNLTVEVNQATKVGDQVVDF
ETKPLDIATGTGSFLTLDMSGSMGKLLRAEGDVTFTISEFAYFKGRIGFE
KYTPTKQLTLKNITTPVALPSYATATSMMAITGSNITAFVGYADGGFDTT
KTLEAQKDNLYGFGVEGVNFGIVKTKTASGAYTAIKADMSSAAIYGFDED
DFQLSAENLRFAYASADASGNVIDYEASFDGGLALGSGGDVVIDFTAKKL
GVYTEKATLSISKFLYFSGAIGFEQADYGSTLKVGALGLPVTGAKGFSIG
GSNITAFVGYAEDGIDSTQTLQAQKDSLYGFGVEGVNFGFLSLKDSSGKT
YKALKAHADNMAVYGFDPNDFQLSVSNLNIEYNSASVVGNELNFTMLTDG
KLAIPTGDPENPVELDFSGKRLGVFAGNVTLQISQFLYVQGAIGFQKADF
TNLIAGAMPVATIATPATGFTIGGSNIDVFVGYAADGLDLAKPFSEQVKY
ETEDGKEADNLFGFGAEGIDFGLIQVKTKTGISYTAAKAHADEVALYGFD
PEDFQLSLSGIDFKINTVNNQAMPALNFQASHGEAGFAVPTGNPDEPIML
DMTGKTIGASLQNATLRISEFVYVSGSFAFEKGGTQILSAKTVGGIHAPV
LADAFTLGASHVQAFVGVGGPYRLDSNEDGKITVDDEVANPNAIGLVIDD
FTFGLGLFQDQATKIKYTALKASAAKIGMNGLGEVIEFSLDDVVVAINRS
SNPLFVLDFTQNGDGSASEGLAIDTGDPDSPVVLDFDGELIQASVGHATA
QIAGILSLEGGFTFQKRMIDNIGFHGLGMNLELGAEALIIAGQDVYAFAG
INGPYKTDTNKDGKVDESDPINEDAIGFALSDLDFALALVAPSLGGGAKL
PINFFGLRASAGYAGLVGTDPYLTLNAQDLIFEFNGAIAQANGKFIPLPS
VYADFSVIDNGVDEEGNALPTGLNIPIGSDATAINMNYSSNFLRVGLTGE
LGIFDLFTIKPPRLDFTFELPAIDLGVGFSLKDIALPEFKMPSLPAFDAN
MLLPSISLKQLTQLAEDALNTTSAPDWLKDGMKYLEGIDIRIGLAGVTGT
ITIPDLKIDLGGFVHLEGDFKLTLGKTFVADMATGIDPTLAGIVTTTIDA
IGSTFMPEGITPTGLLTKLFKVSEDFSTFEKVTFRGLAFGASDVNLFVGV
GDPDFSNPTSNPLSNQDLVGFGLQDIDLAIGYFKADLPEWLGAESVFSFT
AHAGQMGVYGLGDILKIVTSDVTVDVNMGGNTKVKAGQTPILSARPIYNS
IVNEDGSKGLKIDTGGTPVLLTFAGSEIMGVDIGLAEIVVADFLHLRGSL
AFRKGELYDVAVDAGGLAPALSKIGSAAGVTLNPIPLQVETLTLAGANLV
GFAGVGGPYRYGADADGDLIPDKINESAIGVEVTNVDFGLAIMTPTLVTA
IPGMAEYAPKFISAKAYVGSASLVGVDPNILEVRAEDIEVNINTFVIPKA
PWPVNAAIQLFGPPSINYKLSTSFKNYTEDTNGDGVLSVSEDKDIDLILD
PGEDANGNGVLDLSEDRNQNGRIDRAGFMVPAGGNNGIFLDFAEEIIQVK
IGYAEINLAGFIRMSASMALTKKGSETVTLSDGDTTIVTSLALGINDAYG
FVGVPTFEDGKPRSYFWDSNNDGRVDERDSVNEGAIGLAIENLDLGLILA
KELVIDPSGVSIGVYIAGRATVDMIGLVGVPGVTMKAEEIAIEINTGLRA
TLEVGNIVKDEKTGAVSYKPGFSASFGFTTIDFSKSQWMDTSLAVNKDSD
DSNDLYYDGYAIDTGNPSEPIVLLYDEQYLRVFGRAEVNLFDMAKMHGAI
DFRFSESDGLTAFADVEVQIGPDGFNMKREGRGLLVINKGGVALRLDVGA
SLELGPVATITADMTLVLNSFGKKIVYTVPESFRQLLTSSKYPNWQYTID
EFPPGKDESWTGMYAAVVGKGELDLFDGALNLKGDFAVILSQVDSHVTLE
LGITAVLDLPIFEPLGVTGTIGLVIDVAPTGNQTGLYGSLEVGGANADSL
LIDGGSIFSLKGHFLLQINTTSVDQKVRGRDPVTGSFYDKDGKAVQVTVP
KQSLRISGSASLEIASVLTMEGSADLVIDKSGIQAALTMTLGLGGLGDVE
VKGAAAFGVDGNNTPFFAMRLELSVSLGVSVIGIDANALLQINTSHNNYT
TLHGDTIAGNTIFDMSLDGEIHLLAFDVDFSGRMSIVNSVFKLEFDGRLN
FFNALDINVGGFVSSDGSFEFRGKAEINIYLGPLHLNAGMSVLFSSQPRF
AAAAWGSLDFEIDLGLFEIDFTLAGFRAEIDITPASAYLAARVTVMGITV
SGSYCWRWGAPPNISHLSSDGTLYLHMGDNSGRYGSGDLYDDTIHESFNI
DQNDSTITVRSLGETDTYSATSVKKIVAVGGKGNDFIYVSKGVKADLDFD
GGDGNDSFMILGGGANSVVRGGAGKDEFSTGDYTKGYFLGGEGDDKFVGG
EGDDIIDLGAGNNSIFSGGGNDLIRVSGASDTVDAGGGDDTILASRGGYL
NLTTGSGEDQLYFGNFTATKPTLTLKEANITSTEAIELKESQITVTAQSS
GTRTVVFDTTLEVVNLNDKAATTTIKAATNANWHNTDLIIDAAGLLDVRT
ANFVAPDALVSISAKGIKGELTTNVAELSVINLGTAGDAADRAIVVREAN
DLLIVAAGRSNGGLYAAKGAVTIDVAEREAELTLQSGVIFAGAGADLTIY
ADDIDFESGNNKVSGTGKLIIRTKADAQNYRIGGAGQSRFGADRSPGVNT
GFMELGMKDLSALANGFSSITIGHNVAGSTMQIGDVEDATVGTYPFSARL
DDVATFIADLITITGDVQSSERLTLNARALEVQRQNNSNPLGAPDSGITA
PEIYLNVKEQMLLTGWVIADNLVDIKVTNSIGEGGFVTYGTEINSFTADK
GSTLQTLKSNSKIILTTSHSVYSATGIYAGVTNGTGASITINAGTGFTVL
EGGTVATRYANGVIDVTAGKYIHILSGGAVVSGATLNTTTKQYELTGTGS
TMKLSTTGEMTLAGSLTAAGAMTLKASEVQDEFADYFNNLAGKGITEVKD
ASTVSTIAAALAAGTIPSELTALFTGGNLTLDTGSDVLVSVANYIPFAEL
PEATKKAIAESKGYTEFAVDAATKVSGYFNAATGKFFTTISDGPVIGYTI
NDINWGSVTKPTTGTAFSVLSAAQQTVIANALGYTRHAGTVYFNPLAEAG
EEVKTTFIQGISADYNNAHIDWVAAGVAVPAANATFEQLTPVQKLVVANS
LGYMYDYSLIQPEAWSNQPFDYSTITKAEWIEASTLNYRVAISNDQWGEV
TKPTSGATYASLTAAQKEVVDKVVQFVPAQTDRLTLSGSFAENNTLALTI
NETVITYTVKTADIGATSEATLANVAKGLAQLINDTTSVSGAVSASDSTS
AAIAMVRMVSSATPAAEVTVVSTDSARLSVSDIAGGKAVTLFGKFNDGNV
LSLTVNGTTVSYTVKAADIGVTDDITRSRIATALASAMSVAGVTVTSQLI
APVTQVLISATATNTKFTLSATDGHVAIEHNPFDLLNQDQQQLVASKLRP
SLQTHYKDLTAGQKQAVATALSTAAKIEFFNYAAQPGKKLVTTFTQGIIT
DYRNDQIAWGTVAEPSSNSATFESLTQAQKDVVAHSLGYDRYDGVHFLKA
DAAPEKRWVSGFTEGGGSFDLSTMDWGGVATPDAGTSFEQLTIAQRDVVL
KKLGYDVYEREVYVSADGNTIKGSFVEGATNADYDATTITLEEWGKVLPV
APGTAWLDLTFEQQEFLLDRLNYSRWNGLVFHNAASTTAPYRLTFKEGTA
TGADYKNDDIEWSKTAFPNSSAGVTLSTVKDAFKVGDVVTITVSLTENGG
SPTNTNVSFTVTAADVTAKAVAKGLADAINAHDTLKTKVVAKSSEGALYV
SSLSRNAEVTFSATDSRISSSQANPLGALTVEQKGLVLQQAGLTEYATTV
YYKADVPVGQQVVTSFTVDYSSDDAPSTPTKRWLLSDNAGHRYLIYAYDA
TNDGVIDEIHIQEPHKLVGQRGAGFLLTGTITTLQENADFIIDVKDDAIV
SGGRFYLMGAGSDLTIASDRSVFWQGEAEINGDITLIGRGVQGAGMPLDG
ISVYIHASSTLSSVAAGSDITIIGADAGDVELHGAVLAGAERSNTGTHYL
GADSTVSVTTGQQILVNNALAAAKSVILTTTQTPGSDDAYQSVILDTVAG
LTSAGWTSDWSGGLVKIDAVGSVTLSGMVLSGGTVSQTFNSDGRLTAETF
TWSNEPSQVIINAKGQLNLGVETLALSGNMVNVGARIRANQYVELSGIVG
GADDRFAVKLPESAVVAVSNPDGVIKITSGQDAWLMGQLVAGGEVVDSYD
TAGYYLGSKVNTFKGDSQLIIEADEQIRIGRDLMAGKLLDVRGGTVSGRP
AKTTTITTTSSDGLTTTTTTTYLPNAPIGTSTTTNSDGSTTTTQTTEIPW
ANDGIIIGGNVQLRTWQEYSTITLSAGGNMSVLTPAWTQELLADGFAEFA
DGHLSSAVKFHLIVETGTVDEERDITVAATRTNGNGLGFLKEDIQAAIDT
AFGWTTDTRKVTVRLDDGRLVFTSNYQVRIAAVTNGYAELLGFTQIATTP
AKLTTGATTSSRPYAIDASGRGSVVNLGKANNPSGAISIAGAIRGHSAVN
MYAGNNPSGGAAVSFLATSLIETLSGSMIMSPAGAVTLEGDFIARGAGSD
IIINATGTLNLKGNLTAQSDIIVTAASTVLAGEKSITTYGTSTFTTLDAD
SRIVLVGTNDVEINSVIGKGNPNLAQVQIGSTDGNLRVVQGSGWIETGAS
MSLSGKNVDIAGVIRSNKDTAVTYDREVTIQATEDVYLHGALGVMGSLFI
KAGDDITIENMAIAAQASGHSMELEAGDAITLGGTTVGTAVILEANKLLS
LKTTGLVTLYENAQLYTSGDNSALTVEGQMIEARGTLRVGANHPYTFNPA
SEVMAFNAADSVTYTGKGGSLTLKATKDVVLGNLTSGFGGVLFATGAIAV
QSGSGSSGVGFDMSTASQIKVDATGYGAWSESTVNASPWKIVDGVTYTIA
IGGTTVATATATSESTINTVLNALVQQIEDHASYAAVRAGDVITITNNDG
TLLSQTVTATTSNVPSVGTAATVVQGNSGSSTQATVDTSSWTIVAGGRYT
ITIGSGTSAKEYTLTAAKDYTLAMVTEELVRMIEEGALYTASNIGNVITV
QNTLGGAILDVTLAASATSASGSAVVTAGTAVLADGQLSIISDGDILLRG
AVSTVDVGSDMYIRSRSLINVGGLITAQQSITLRGGMDSTKVGIWVQELK
LDGSGNYLSGGTLDTASGGTIDMEAVDSVVITGVLGQRTITDGDLGGAKV
GDIRIESLSGDVQLLRNTNVRDTLTVAGSTIGVLSGSYVYATGMESSLYL
NARSSLTLSGRAVQAGLDPAIAKASRLVHMVAPTMTINGTIDVTSLTGRV
LLSAGSSVTVGGTVVSKGSIAIHAGVDLVRWDRARMEATVTRADLTGGSI
EVKGQGLLSAVGTITLLAGGNVTLDADASVASLENVLVPVYSTSEKEVQV
VVDTVKVSDGVVLVPEITWVPTEITEQVGTDLVVVGSRYETMDVQLSQIG
YFNPNAPDERKFVEVLIEGVHYLNDNTRAKTAPSYAVQVTWSNAGNEQVP
TRSATGVTGDYTSSNYRGFQQLSDAQKWAVFNSTGYMPLYEFGYTNWKLN
QTINGTASQLSEGYVGSDGKALYPSWKPNGVLNTKKVFYVDVANWRDKYV
YMPVGAQEAILSVASYGEATYLTDETATTDATGVTMTGDLDGSNSGASWK
TLDEVANPTGELVGRYYESADVKYVQKDSAFTSSTIMGGTDLDGKAASWE
VSYANNGKRVYEVSNGLTSLFSNPLDDFTLAQAPTWKLEAAASKYGDDNN
NTAGASGANGYINWVDWTSYTQNSGTNSYGYGSAGGVDVRFAGEGLGFKL
GDKSWSSGTYTDSNTSDSIGVNNRPSGSSDEYISFWGGNLRGAVYFSKPV
VNPVMAIMSLGQPGDPASLWFGDEEFDVVTGYTNGGWGSGSIHEGNGGYV
YNSGEGNGVILFKGVYDHIDWSISDTEHYGNFTIGFDGIAPSTTGLRATD
NIYAPGSYITELASNTQSITYNGYSYTRYNQKVGYAERLADFYAISGTVW
DDPGYGTGDEGWGYIQSYGYLPLNDTVDAINLWDNAHHITVWYHYDWNSA
TYTFIPGNTGVEADGDVPNVNTATGSDEWGNNMSMYMVWGNRFEDFKDYG
YNWVSKWNPMYDQRIQLSYHLSTQAEDIYDYRPVYKTSTQMVKVEKMKSV
TVWRDEPVYAMQMQLVTDVTYEAVVGRSNNGAADSLSATNIVIDAGGDVN
ISGKMSAKGNFTIDAQKQFTLQGKSVGGAPLTSTLKATTLDVYAKGVMLL
ADSAKLVAAQVGATVLLDSDQGLTIGGIIGSQSGTTFATVALSSDRNINL
SGSIDAGAITVTAGAGSSGDGAVVADSETVLRAATGNITVSTGQYGGNIA
LDRVTLIASGATATISLTATSGALEQTKIDGTANGVAVKVIGGMVRAANL
VAKAETSIVLQTEVSTITEVSLNGSGNVEIVNSGNLVVTKAEATDGTIDI
RTFGTMNVVAATTLGNSNSNDIVLTTIQAPVTSGSKSAATLTLGNITNAG
RGDVVVTAEGAVLQSSGTQLVADELTITASMVGLLPIPEKLIDLATKVTT
LIITTAGEGDVRISQDAAAASVAVARTLTVDNTTISDGDLAITAKGSVVL
LDVTLAANKDTNELTVDADGNITVDYATVGIYLASTAAAKGSDYNDDGDY
TDSVKESDITGVGAPAEGATDLRVKRDLNNDGDTLDTLVEVAPPTVVSSA
GNITLTADGTIGQNVTDTAIDLIANTLTLRAGSGISLMQVAINQLNAETA
AGDIVISDADGELETSVGMEVMRAVTAKTATANTLTTTVDITVYSQLRVS
ADGLVQGDKVCLTSTTASVAVAKPTSWSAASPATNNSIIHNGGVAFVAEE
DVQLYQFFNAEKWMEYRAGNSFTFGVANYDAAGNFVDKGTVTEKLPTTLS
ADTLILETGGTLSLTGTLTAGKHLELVAGEDVLISGTIAAGYSKSVIDEL
KITAKGERTLLRGNDLNNNGVISGSVAEMLVGLDANKDGDYADTFMEKGF
SFDLNGNGNKTDLVSEVTLNRDINGDGDKLDSIHESAVDLNQDGDYTDTT
VSESDIYGTGDPAAGQPDRRVSRDLNGDGDALDTLSEVGFGLDLNCDGDA
LDTYSEALRGTDLNGDGDILDTINEVQGEQETGYITIQAKALPATNFELR
AKRDIYLALNDSSATVSNLTLTGFIGGLSAFDPAANVEMHIEGALTVLGG
IVRADAEHGGDIKVTAATVNVDGASVFIGDTLQVTATNGIVLNTLINTLV
ASSTDSGNIVVNEADGLTVESVIANNGAIKIFTAGNTYVGEIRNNVDASG
NNIEVRAAGALYVDLIEAASNAGAMKQYGSILLESTDNVQEWHTDQEAAD
LYGYAVKVIVPSTAVGSFTMPKKLTFSGDKGSGTEVEVRVISGAADAFTA
TTEERTLSDDDVRGKGGASVTITTATPITAAQPVKQLSHIRFNTMVTVGA
TYSVTVEGQTVTEVALATDDITALLQRLGDAIVTATQVSSVATLVATVDG
GIGTLYLEAVTANTPFTVGTVTVKEPYTENWTRNAVLNDGAVPEVAAGTS
VVQVSVVEFADSAAMVAGKRADVTLNSHLYSATYGANADVWITGADSKVV
TATVKEWDIASNLGSISPSSVTGTLGTASDRDVTYVTVDTANWNLANNNS
YTITLDTSNTFTFTAGGSSARADLIASLVQQINSNSSYKAAALDKLILIQ
NTADTLDTGNVGTSQTGSTPGTASIGTKTMVRVDASGWTLHNGARYTITL
NTSDNFSFDAVSGSTLIGLVSSVTTAIEYDGSYTATSHGNYILLSNNGYT
LDANNVTTSQTPPARTISLSGVTAEVGARYTVTFTPTSGQAKTFTVVHSD
TSKTVTQRLKEEIDGDGDLEAIFSGSVITVTGGSSLTDSITTKIERLTAT
NDAGKITVTKTTQGGQRVTFTGVDPTADRVFSILIDGTEVGTYTTASGNT
LDTVVTDLAADIDPYATYGATKTYANVTNWTSATAYLTATIQKGEPVAVT
VERPITFAANSSTTDTVAYVLGIGSGNSTTYNAPKNTSGQIIASEFKKQI
AIAPFNLTNTKVDGTTLWIASGAETSDITLTKGGTSQTVKTPEQSAQRLI
LTATSPNAAFVVNQVGITTLINLATVPVKTTTVAASGVKQESVVTFAGAT
PATDSRYSVTVNGNTYSTQTGENLLIKLSGIKRTDAISIKVLDEAGVVKS
DPLSVPAWVNNVRTTPWKDDTTTATPTDDGICLLDLSDMPIGPGYVFVLT
LGNFSYNYTVLGEDADNNSSTPNTLKAGESAAKVVTELVTAVNANSSTTK
IIATDVSFTVASDWAGVLGKLETLIEAGESSLLDSTNTTFDASAKTLTLA
AKTDNTSFTLSADVTATSAIVGNPAQSRTYAGANAYQKTIISFANDAVTT
DTYELTSGVEFSLTMTSQKNYSVTVGQANVTVTNADGTTSTVPTSAVEAT
WSSILSTLATAIQTTENSVVGRTTDIIITYNATADIVAGAPARSMTMVAD
TVNKSFSVSNVEVTYHGVRTVDNAVVESTTDKAVTTSRAQQSTISFASAT
LADTVTYTVSVAGSPYSVKVNDTVNSIPVTASWASILAALQYKIEAGNVV
TVSVSGTTLTLTAKQDNTPFTFNAYGMDAGINTLTQKGDFVLVVPDRPTG
PYSLKAIASSDGANDGSLTVVNLPTHSGESITLEAAKSLVVVSPLNVGSG
GTVTLTSGNGLTLGGTLVAGALTVTAGDDLSLKTQVGSMAINLSSAASNL
TIEQADDLRMVDFSGVTVPPAADYTLTVTLDGTAFTKSATSGTTLAATVS
ALRTLINNHANYDATIDDSNATQINVTKGAGTALITTTLGDGVTGTVSLT
AAVADLTITSLAMNGGDIDIVTDGSLIINGISGSVGKVRIVAGGSVTIKS
MASSANVTIQAGGDVTLGGTAEGWGTIEADTLSVTTSGAITVYEADALAI
SQLESTGKKAVFVQAGGALELSGAVETNSSDAITLKSGGALTVSNSLTCN
FSGAITLEAAAGMTLSDIADVNSKSGAVSLNAGTGALTMAASTMVDAGSG
QMTLSAGGAITLGALRSTYVGTFAITAQSLSAVQDGNVDIYAPNATLVVA
TADGFGSVENAVETKVASLTLSNTRTGTATGAINIEEADDLTITSIVQAA
IGSVTVRTLGGNITVVESGGSGIAATTGAVTLYASGGSVTLGGDGYGAAI
KTTGGAVTVTAESGNILLQSGIRVVGGQDEAGKDLVTGDIILRANSGAIK
VDSSATGWLKDGATFDADVNWAMNSGKFEVEKATGRVYVNNITAFEITQH
PPLVEDGTLRTAGEAYLQTTGGNLKLYAHGVIGEKFTGFTHSPLALFADA
ETIVARSNNRDNVSIITTFTVAVGSEDDGNSAGSRAGATQIITLAGIQSI
TSDMDAGGENMSLSGEDIEIDASIRSADAELKLTHKNPTGQTMVVGDADG
YADVWNLGTDDIGNIGAGFKTIIIGAEGGSNDIIFDSKSEGGLTFNAPLE
INSDGEGGSITINTDIKATSVVIRGSGETTTIQSSDITAEGGPVNIEDSL
RVSGEISITATGGSVGDINFTNGGNGFFIAGNSSSTGDYLILSASDDITF
DNEFAGADAESGADYLSGLTITSANDVTFNKAITIAGNLVIHASGDVIFE
NKVTLLNGGSLTITNANSITFTTTSNITLDNDSTIGNTLAAGGGNLLLEA
DEIDLYVGAAKVQGIGGTVTLRPKDHDMNIAVGSHDYATDDMLNLETTEL
LTLKEGFSKIVLGWQSGVTSHAESGATNVVLVGANTSTPGAALFKDDVEI
YGGFINVADYSSSSSVLRIAVDERLKLDAYSNIDIANDLEADSIELYSAM
GSIAQSDVSDDGLSDEQIRSLELVVTAATGINLDSIETAILDVQNTTSSN
VSLYVNDVRTSGSRLTQTHITGDVSVARLAQAGTGYLSLTTESGTITVLT
GNVSNASYLLLNGDATQSVVHAGSGSVSLTANGIGEDVLLQKDVSSVGNF
TITAADAITTESGVDITATGNASNITLTATAGNVVLGGNITAGTTDSGLI
SATAGGVLTMADGTTINATGTNGRVELQAQGRVSVSEINVGAAASIKSVT
ADIVDVLTAAVNPTANTGYNVDGDAAALTMEAATGVGSAGSAIQTRVAKL
EAHNSGEKGVFVVEATALSITDNPEATTPPVAMNIGGVTANSGVVSVVVT
TGNLTVTDEVKSTSTHANSGNVLLQTVAGSITVLDDVTSKSGNITVFASS
DISIGQAATAAIIKTEATGKTVELEATAGGVTMQAESGLTTTNGAMRVMA
ATDVVVGKLNAGATSGVVSINAGGSITDADADGGSQQVNVTASAVRFYGD
DGVGTSADDLELDVTTVAAKSVTSAGIYLSELSGVTVGTVNAVTINRVSA
AGITSSSVNVDGFVYGEEAALSGITTATGGAVTIDFATGEAGTLLVSQEI
TASGNGHVLLQSTGASSDVSIQNTVNAGTGSISVLTVGNQSYNASGQLLT
NGGTIDVHASGSGSTITMDAASLFDTNGGNIRVMAGNVNGSDATTTAGGA
IVLGVLDAQNSGDQATWGNVSVRSTGGLISDAASSDTTTNIYANNLRLEG
RTAGGVGTGVQHLEVEAIILSASAGTAGIYVADPSAVTVGNIAAQSVNRV
LSTGLLGTAQTDVALADFVSTGNVVLIATGALAINEGDADDMGVSASGNI
LLKSAGNTTDLDINADVESSAGNITLEATRDVLIDADINTTNKTIQITAA
HDVIQAQSVSPYTIGSAGGDVALTATSGSITFEKIDAKGVAATAGNVRLK
AFTTISDGDTTANETEVDIIANGLILEAGTGVGLGSNHIETTVATLSGKT
TTSGGFFVSDTDGITIGSVTVAVNSIGITGVTPTSTTNETVSGISTASGG
HIVVEAAGAFTVSNAVSAGGAGNILLKTTSGTLALNAAISSGTGNISILN
TTSAIEQGAFTVSTDGGTIDMEANGSITMVAGSSLDTAGTTHTAGGNIRL
KSGAGMTVTGIDAGTALVSLLSGGLIEDAGNTLTDVVAGTLRLQATTGVG
KKDDHLETSVGTLSANVGSEGFFVTEATAVTVGQTAAVTVKRVNADGSTL
TDTTDAAQSNLVSTGHLVLVTTAGSVTTLSTGALTATGNMLVQAVDAEAD
ITLGAAVSSSGAASANINVAHGSDTTYDRTIDLSNLSVQSNTTYSVIIDG
KALSYKSGTDDALNEIGEGLKDAIDADSYVATYDGNTHVITITSGAGTST
ISVSIAASISLNAGHDIFQNSTITTNGYAGTVDLRAGHDITMANGTSTST
NDGNILVYADGTVKLGLLDVRTTGDRSGNTLTKQSDASNPWGSVSVTSTA
GSILDNNGTAVNVYANELKLTATPAGTGAVGIGAEHLETEVATLSANVGS
GGLFITEATDITVGQTATVLTNRVSGTASVTDSTESATTQSNLTSAGHVV
VQTVAGSITTVVTNGNISAAGNILLNASETATATVAAMTLGGTVTTTAVS
NGSISLTAKDFIHQLETGDITAGDSGTVELSVSTTTSSGAITMDDGAVTA
STSGNIRYSAVSSMTIGSITTSGDVSLQATTITDSGTSDTDVSADELRIV
TTSNIAGAGVGTRTKHLQLGVTKLAASVAGTVDTTTTSPTYGSGGLFLTE
ADAITIDALSAITVNSVLATGSLSDPQLSDISLSDVVSAGHLVLVTTAGT
ITINEGDSDDTGISVAGNMLLQSGGNASDITLNADLRNSAGNISINAGRS
LLQNADIATSAANMTIDVVAAQAITMSQSGSNTVSITTNNSNILLQATAG
DITLETINADSGSVSITATAGSVFDGDANGDSDVDIIADKLLLKAGQTLG
WGENHLETTVPTLTANAESKNPAINAADSANVNSLRTVDFSGMVALTGGV
YSVTIGNTPAFTHTATAGQTLADIVGVLASKIHGATYTAVVDGSKINITA
GAGTVVISGAAVGGVYVTESNGVTFDTVTVNVNRQTNSLNSNYRRTIDFT
NVTVPSGAYTLAVTIDGTTFTKDAVSGATKEQTVAALATLINAHATYDAV
LATGNANQINVTKGAGIVAITTTVSDAVSGTVTVNNADNSNSNYRRTIDF
SGVTVPSGAYTLAVTIDGTTFTKDAVSGATKAQTVASLATLINDHATYDA
VIDTNNSSIINVTKGAGIVAITTTLNAAVSGSVAINNSADTGLEVTNNSA
QSNVSATNNGHVVVVSTTGDIVLTNNVSAAGSGNVLLQATAGALTLNNTV
DGGSGNVTLKASGLISQAAEGDISTTGGTIVVESTAGAITMADGAVAQTN
GGNIRYKASGSITVGVLDARDSNNRSAGTLSNQTSWGSVSIISGSSILDN
SETTTDIYANQLWMSAAVAIGAGDNHLETEVYRVSAKTTSSSIFVTEATA
VEIGATNAITVQDVGVDGTVTAHTDAAESKITSGGALVFQTKAGKLETIA
TSGEISAASNLLLRAGGVYTASVDSSNSKQINLTIGAGTEVISTEFGTGV
TGSSIISNAHATNLDTTRTINFSGVSAASGSASYTVKIGASTFTHTIAGQ
TIEQIVAALVAKINAVSASLCDVTLGALVSSAAGNITVIAGRTLRQSANI
SSSNADKKSIDLLAGNAITMDDGTSIMSGGGNIRLAAKSTITLGVVDART
STGRTGGNVTDQTNWGSISLVSGASILDASATENLATHFYANELRLNAAT
AIGFADNVVNHLETEVAKLSAEARSGGMFISEATAIEITQTAALTVKRVG
IDGTTLTDTTDVAQSDLTTTGSLSALVLQTLNGSITVNGGGDAAGISATG
NILLSAGEGESGDTEGSLVTANIILNASVTSSAGNISLLAKDSITQNATV
GDITASGSSKTIDLQADNAITMYDGAVTTTTNGNVRYEATAGNITVGEII
AGAAESSSAGKVALIATAGSILDISNDIAVDITASDLILTAGKAIGESGN
HLETTIVQLSTLSSNGGTWITESNGVTVTNLSFTVERVIASGALDTTKPS
ATQEDLTTVATTSTDSHLVLVATNGSITIYAGADAATTAVGVSATGNILL
SAGETAEATVANITLSASVTSTTGNISLLAKDSITQNANGDVTTSASSTT
IDVQADDSITMYDGALTTSMNGAIRYQAIMGSITVGEITTGAAASTTGKV
ALLAGGNILDLSSDTSTVDITASELILTAGGAIGKNGSLNALETAVAYLS
TLSSNGGTWITESDGVEVKRITLDVNRVLATGGLDTTKPSSTQEDLRTLG
TDSHLVLVTTNGSITIGGGNDKVGVSATGNILLSAGETLTAEEQAAQTTE
AATVANIILSASVTSTAGNISLLAKDSITQSATDGDITTSYAGKTIDLQA
DDAITMYDGALTTSTNGNVRYQTTTGNITVGEITTGAASTTTGKVALVAG
GNILDIANDTSTVDITASELILTAGGAIGKNGTTIDHLETAVAYLSTSSS
AGGTWITESDSVTVTNLSFNVERVLATGALQTSSKPSASQEDLRTLGENS
HLVLIATEGSITIGGGSDKEGVVAAGNLLLSAGEGATGDTTDTIVNANIL
LNASVKSTGGNISLLAEDSITQSLTDGDITTSADNKTIDVQADDAITMYD
GAVTSSKNGNIRYQAISGSIQVGEIKTGTASTTTGKVALIAGGSILDLGT
DTSSVDITASELILSAGAAIADKDNHLETSVAYITTSSSAGGTWITESDS
VEVKSLSLTVDRVIESGGLHTTQPSASQSNLSSSSHLVLVASGSITTNAT
GGALSAGSNILVKAGGTTSDITLGATVISSGGSISLDAGQNIQQNSTITV
SGGSGTVDLLAGGSIEMKQGSASISTSASNGNILLTATSGSITIETINAG
MGNVALYAANATNGIIYDGDAVDTSNTEIDIMASGILLNAGNAIGLGTNH
LEITVTTLTANAGDGGLFISAEEKVSEGDINRGITVDALTINVHRVDGKG
ETAFTNNNTQADITSTGEGNMVLRSKKGSIILNDGDTNGATVGGTNGFAV
KNTGSGNVLLETALATDDISAYADVVTSTGSVSVLAGHSVEFKTDADILT
QGTGSAGTIDVVAANGGDIVMSSNTVFASTNGAIRLLAATDIEVGVISTA
ASSTAGSGMVSLTATAGSIVDAQHLGNDNNTDATVNVTASGLRLSAGGGV
GQSVNHLETTVGTVSARATSGGIYLLESDGVTVGDVAVTVNRVTVDGSFN
NSTKTDNKQSDLHTTSGGGNIVLVSKTGNIVLNDGTAIADNTAISADGSG
NILLQTLSATGDILVNADLGASITNSISTGTGSVSLLAGHDVIFAATADI
RSQGVGSKGSIDVVAGNVANSSSVKAGSVKMAADSLFASTNGAIRVLAAD
SIEVGILTTAATTTANDGSGMVSLTANNGSIVDAQNLGNANNTDAIVNVT
ASGLRLSAKVGVGQTINHLETEVATVTARATSGGIYLKESDGVTVGDVAV
TVNRVKDDGSIPVALKQTDAKQSDLRTTSLGGSIVLTSTTGDIVLNDGTA
AADNTAISTDGTGNILLQTLSATGDILVNADLGTTTANSSSTGIGSISVL
AGNDVIFASGADIRSQGATSKGSIEVIAGNVVPTEGSANGSVKMASNTTF
ASTNGAIRVLAATNVELGIITTAASATLNDGSGMVSVTATNGSIVDAQNL
GNENNSDATVNVVASGLRLWAGVGVGQTINHLETTVDTVSARATSGGIYL
LESDGVTVSDVAVTVNRITVDGSFNNSTKTDDKQSDFRTTSGGGNIVLVS
KTGNIVLNDGTAIADNTAISADGSGNILIQTLSSTGDILVNADLGASTTN
SISTGTGSVSLLAGHDITFATNADIRSQGTDSKGSIDVVAGNIANGGVKA
GSVLMASNTLFASTNGAIRLLAADAIEVGIITTAATGTAGDGSGMVSLVT
TSATAGTITDAQALVNSANDTTVNVVASGLRLWAGVGIGETVDHLETTVD
TLSARATSGGIYLKETNALDVSDVAVTVNRVKDDGTVASSTQSDDKQSDV
AITSGNGSIVLTTGGNLTLYDGTGTTTGALPLNYVDKAINAIGTGNVRLD
VTGTLTLESAVDAGSGNVTILSTGNQSYEAAGDIFTTGGTIDVQATGVGS
TIGMDADTVFQTNGGNIRVMTGTVNSSGVTLTAGGSITVGVLDARTSADR
GLTTIDDDKRDDQIKTTGGWGSVSIVSTGGSILDNSETTVDVYANELRLT
AQAAIGALGDGTSNALETEVATVTASAGTGGINLLESSAITIGSVTAVAV
NRVATTGVAGSGDQTDVVVQAGVVTTANSSGSIVVVAGGAMTVSNVIIAN
GSGNVRLETTASTMAINAALSSGSGHITVVAKTNLTQLAAGDITTIGAGT
IDVEAGGSIQMTTGAVGADTAVSGAKDIRYQAKGGNLTVGSFSTGTDATT
GGTVVLIASGSIVDGDADVDVTANKLYMQSGSAHAIAGGSDHLEISVNTL
SLSAGSGGAFVTESNGVTVDTVALSTLKRVENTSILTTQSGSWEDLNAAT
TGNLVLDVTSGALVLNAGSNANYAVQAVSGNTRISTQSGALTLNARLDGG
SGNISIISSGNQSYGAAGDVVTTSGTIEVQATGVGSTIGMDVDTLFQTNG
RNIRVMAGTVNELGVTVNAGGAITIGVLDVRTAADRGAATRTDQTKTTGG
WGSISVTSTGGSIYDNDGDALVNVYANELKLSASASGKSVGKSNQHLETE
VATLSGNVGSGGFFVTEATDITVNQTAELTAKHVLLNGTITNANDSASVV
TDSAQNDLVSGGALVLQTKEGSITTAVTNGDIQAAGHILLNASETAVETE
AGITLGGTVTTTSASNGSISLTAKDFLYQLATGDITAGGTGTIDVEVSTG
TSSGAITMDDGAATASTSGNIRYVATTTLSLGTISTLGNVSLTATSITDS
TDDDSENVTDDVDVTAATLRVQTSANGFGEASKHIETTIGTLAATLGANG
NLFVTETDAITIDTVAAITVYRITSEGTAVSTSIQTDNALSDIATGSGHV
VIDATDITVKGGDTASGDNDVTTGIRTTGAGNILLNARSGNIIAQAIING
GTGNISLNAVGSNTTGNITLWNTTSDNNGTAFVGVLQTDNATIDVKAGNI
IDMKDGSTILSKGGDIRFEAVNNINVSYVDATTTTLAGDVALLSTSGSIL
DIDNGTALDVYGAGLLMEAATGVGVSTNHLDTTVTTLTALAGSGGMFINE
TNAVDVDTVTVVVNRVNDQAGTAVESKTLSDLVTISNGNMVLVAGGTITL
KEGDEDNTGVSAAGNMLLKATANDIVINSYVTSTGGHISLDAARDILQNA
NVEAQATTKSIDLVVGRDITMDNGTSTTSANGNILLYAGTGNIIIETITA
GNSTNGYGNVSITAAATSGNTVGKIIDRDVTVAEDSEFDITANGLILKAG
NAIGDGNNHIEVTVTTLTANAGVGGLFVTAKELVTTGTISRGVTVDKLTV
AVNRVGTDAAVPATASGTATVTQEDLSATGAGNIVLDVTSGALVINAGTS
NTNAVTAESGNIRLTAATGALTLNAKLDAGSGNVTLLASGLIEQKAAGDI
FTTAGTIDVESTADAITMVDGAVAQTNGGNIRYQASGNVTVGLLDARLAV
DRPAALTNQATWGSVSIVSTSGSILDNSENTVDVYAKELRLTATGAIGAL
GDGTSNALETEVVRVTAKAGVGGINLLESSALTIGTVTAVPVNRVATTGV
AGNGNQTDVSAQVGIVTTTGSNGSIVVVAGGAMTVSNGVTSDGSGNVRLD
VTGTLTLESAVDAGSGNVTILSTGNQSYEAAGDIFTTSGTIDVQATGVGS
TIGMDADTVFQTNGGNIRVMTGTVNSSGVTVNAGGSITVGVLDARTSADR
GEAISTDDKLLDQIKTTGGWGSVSIVSTGGSILDNSEATIDVYANELKLT
ATPTGSGAVGLYNQHLETEVAKVSANVGSAGLFITESTDMEVGRTVELVV
KRVKNDGTVGSVQTPSTASDPLQNNFVSKGTLVLVTTAGSIETLATGGAI
TATGNMLLQAGGSASDITLGAAVTNTATNGGNISIKAGQDILQNASIIGQ
ATDKSIDLVAGRHITMTDGTNTTTSTTSANGNILLYAGTGNITIETITAG
NSTNGYGNVSVTAAATSGSSVGKILDQDDAGDNGTNPDITANSLILKAGY
AIGLSDNHLETTVTKLTANAGAGGLFVTAKERVSGGMVTVESMTVSVKRV
DAEANVPADPSGTATVTQEDLSVTSGGHLVLDVTSGALVLNAGSNANYAV
QAVNGNTRISTQSGALTLNAKLDGGSGNVSVVSSGSQSYAAAGDVVTTDG
TIDVQATGVGSTIGMDVDTVLQTSGKNIRVMAGTVNELGVTVNAGGAITL
GVLDARSSTGRTNGGVSDQVNWGSVSVTSTGGSIYDNDDDVLVNVYAKEL
KLSASASGKAVGKSNQHLETEVATLSGNVGSGGFFVTEATDITVNQTAEL
IAKHVLLNGTIVNSDTTASVVTDSAQNDLVSGGALVLQTKEGCITTAVTN
GDIQAAGHILLNASETAVETEAAITLGGTVTTTSASNGSISLTAKDFVHQ
LSTGDITAGGSGTIDVEVSTSTSSGAITMDDGAATASTSGTIRYVATTTL
SLGTIATSGNVSLTATSITDSADDDAVQLPSLPDIDVTASSLRVQTSANG
FGEARKHIETTIGTLAATLGTIGNLFVTETNDITIDTVDTIEVNRVTDAG
SITNSIKTDNALSDIATGTGHVVIDATDITVKGGGDTTGITTTGLGNILL
NARSGNITAQAIINGGTGNISLNAVGTNLNGNVVLWNTTNATDGTAFVGV
LQTDNATIDVRSGNAIDMKDGSTILSKGGDIRFEAVNNINVSYVDAVNAS
PERAGDVAIISTSGSILDVDNNTTLDVYAAGLLMQAATGIGTSTNHLDTT
VTTLTASAGSGGMFISETDGVDVDTVTVVVNRVNDQAGTAVESKTLSDLL
TISNGNMVLVAGGTITLKEGDADNTGVSAAGNMLLKAKVDDIDIKSKVTS
TDGNISLDAARDILQNANVEAQEITKSIDFVAGRDITMDNGTSTTSANGN
ILLYAGTGNITIETITAGNSTNGYGNVSITAAAIPSGGNSDVGKILDRDG
TAAEDSEYDITANNLILKAGYAVGDGNNHIEETVTTLTANAGIGGLYVTA
KELVSGGNVTVDKLTVDVNRVGTDASVPTTATGTATVTQEDLIATGAGHI
VLDVTSGDVVLNAGTSGTNAVTAVSGNIRLIAAAGALTLNAKLDAGSGNV
TLLASGLIEQKAAGDIFTTAGTIDVESTAGAITMNADAVTQTNGGNIRYK
ANGTITVGLLDARVSDDRTPTAQLNAQSTWGSVSIISGASILDNSEATVD
VYAKELKLTATPAGTGAVGESTNHLETEVAKVSGEVGSAGIFITESTAIT
VGQTASLSVNRVLPTGLITKSDNTASVETDAAQDNFVSKGALVLVTTAGS
IESKATGGAITAAGNIFLQAKATQNATYDITIGAAVTSSNGSISLDASND
IKQNSTITVSGGSGTVDLLAGHDIVMQQTTSSISTSASNGNILLTATSGS
ITIENINAGSGNVALYAANATNGFIYDGDDAGDSEVDITANGLILKAGNA
IGSGTNHLETTVTTLTANAGVGGLYITAQEKVADSGITVDILTVNVNRVD
DKDATASTNNSAQVDLTSTNAGNIVLRSKDGSIILKDGDSNGFAVKNTGS
GNVLLQTTNSGSITANADVVSTSGNISVLAAQSVTFTANADIRTSSTSTI
TGTIDVVAGSGSITMSDSSLFTTSGTNGDIRLLASQNVIVGDIETTTADV
SITATAVSITDADALVGVANDNDLDITASGLRLNAGIGIGEVVDHLETTV
GTVSARATNGGIYLLESNGVTVGDVSVTTNRVGVTGATTTANSSDLAQSD
LRTTANNGNIVLVAGGDLVLNDGTATADNTAISANGSGNILLKTTSGLLD
INAAVKSGTGNITIWNTTGAIEQDAVTISTNGGTIDIEATNATNGSITMV
AGSTIVSDGTTTAGGNIRLKSGADMSITGINAGSANVSLLAGSFIKDIGE
TTTDVVANHLRIEAGSWVGEASGTNLGLLDISVTRLSVRAGNSMYINELS
DITVDTTDAITVQRVLADGSVLNSVKTDGKQSDLVTTANDGNIVLVAAGN
LTFNDGTDVLAGENVTEDNTNGEVVSANGNGNILLKTTSGTLAINSAVKS
GEGNISIINTTGAITQGAVTISTDGGTIDIEATAGAITLVSGSRIVSDGV
STTDGNIRIKSGADMSITGINAGAANVSLFAGSFIKDIGEAIVDVLANHL
RLEAGSVIGEALGTDNGLLDLSVAMVSVKAATAIFLKETNGITVGTTSEI
KVKRVGANGGTTDDNTFGAAQSDLQTTNNGNIVLVATAGDITLQAGAAVD
PQSNNFAVSANGIGNILVQAEVGSVIAEANADVVSGSGSISVIGKTNVSF
NSDGADIRTSGGGTIDVLAETGKIEQSATSLFTTGTGNIRLLAGTSVVVG
DITTGGSVSVIATTGSITDADSADETTADNDIQAVGLRLWAKSGIGTNSN
HLDTSVDNLSAYVDAGSMYLLESNGVTVQSVGVSVNRVVAAGTASVVDET
TDSAQSDLRTNSNSGNIVLRASAGNIELTDGIANSGTAGIAGTTVRANGS
GNILLDAISGSLAVKSDLSSTTGHITLHANDSISLTSDVDVTTATSGTIS
LQAKHGEISMVSDATVMASNSSVRLAAHQDILLGDVAAQNVSLISAMGSI
HSAASNIQNIAATNLRIEAQQAIGKSDLHIKTAVDTLTAKANGTVTSGTA
ETGIYLTEANSITVDTVSVSVTEFSATATTSIVKDSSQSDLVTGNNGNIV
LVADGKITLNDGTDIASPFEDNTDGKAVSADGSGSLLIDANSSNLLIYSD
IESGTGHITVKAAIGVEIGSSSATEVDISTATMGTISVDAEGGELKMAGD
AEIKATSSSVRLNAASDVTLGNIVATNVSVVADSGSIINAAGSSKNVTAT
NLRLEAKQAIGAPTNHLTTDVTTLTLFAAGTVASGTPLSGSYISEVSDVK
IDTVTVTVTEFTHVALTNDVIDAAQSDMVAGNNGNLVLTAGGTITVNDGS
DNDSLGVEAGGNIRLEATESNESVESNIKLSSGVVSNGGNITLLAKDDIA
MDVTGDITTQSDGKTIELQADGTIRMVDGTIIESNNGNVRLTALTDDITV
GEIKATTANVAISAKVGNIFAVDSSNKNILAKDLILNAGEAIGKNDNYLD
VSVTNMATASGSGATYVESNGVNVNLGGLSVLVQRVMATGSTEDSSTSTQ
NDFKAGDDIYLVATSGNIVITANNENALTQAKNIVLIAEQGNIIVNTGAA
NQGFSASESIKLIAEAGKITINSTDANSAGLVARKNILIDARETVEDTDA
TLVVNAKITSKEGYISLLADDSITMTVFGDVTTETSGNTIDIEANDSIAM
SDGSLVSTSNGTVRYQAFVGNITLGEINAGSGNVALLAGGSILDISNDTS
SVDITANELLLQAGAGIGTDGITVNHLETSVDRLSVKSTTGSAYVTENNS
VEVGVVTVTVSRVQENDTVQALSADTLSGGESDGNLVLVTNAGTIETLAG
GGTLTATGNILLDAKANLMLGAAVSSTGGNVSLVSGGNFEQGAVGDVSAA
AAGTVDVRVSGTMTMTDGAEITSGSGNIRLAVTSSLQLGALSTSGDVSIS
ASMITDAGSSTSDTVNISADEVYFSSTSNANGVGVGTGSNHIELNASKLA
ASVSGQGGMYITESDGLQVGALTAMNVKKVASDGSSTASTADTAQSNISS
DGNLVIVTNISNIETLATGGAINAAGNMLLDAKANLRFGAAVSSTGGNIT
MVSGGNFEQGAVGDVSAAAAGTVDVRVSGTMTMTDGAEITSGSGNIRLAV
TSSLQLGALSTGGDVSISASTITDAGSGASDTVNISADKVYLSSTSSANG
AGVGIGSNHIELNANKLAADVNGTGTGGLFITESDGLQVGALTAINVKKV
ANDGSSTVSTSDTAQSNISSDGNLVIVTNAGTIEMLVTGGTLKAAGNILL
DAKANLMLGAAISSTGGNISLVSGGNFEQSAAGDVSAAGAGTIDVRVSGS
MTMTDGAEITSGSGNIRLAVTSSLQFGALSTSDDVSISASTITDAGSGAS
DTVNISADKVYLSSTSSANGAGVGIGSNHLELNVNKLAADVNGTGTGGLF
ITENDGLQVGALTAINVKKVANDGSSTVSTTDSAQSNISSDGNLVIVTSA
GTIETLAVGGTLTAVGNILLDANGNLTLGAAVSSTSGNVSMVSGGNFEQG
AVMVSAAGAGTVDVRVSDAMTMTDGAEIKSGSGNIRLAVTSSLQLGVLST
SGDVSISASTITDAGSGASDTVNLSADEVYLSSTSSANGAGIGSGSSHLE
LNANKLAADVNGTGIGGLFITESDGLQVGALTAINVKKVASDGLSTVSTN
DVAQSNISSDSNLVIVTTAGSIETLMGGGTLTAAGNILLDAKANLTLGAT
VSSTAGNVSMVVSGNFEQSAAGDVSAAGAGTVDVRVSGTMTMADGAEIKS
DNSNIRLAVTSSLQLGALSTSGDVSISASTITDAGSGTSDTVNISADEVY
FSSTSSVNGAGIGTGSNHIELNASKLAADVNGTGVGGLFIIESNALQVGT
LNAINVNLVATDGTVSLVTQTTDAAQSNIVSDGNLVIVTTAGNIETLASG
GTITAAGNILLDAKTNLILGAAVSSTGGNVSMVSGGNFEQSAIGDISAAG
TGTVDVRVSGAMTMTDGAEITSGSGNIRLAVANALQLGALSTSGDISISA
STITDAGSGASDTVNLSADEFYLSSTSNANGAGVGTGSNPIELNVSKLAA
SVSGQGGMYITESDGLQVGALTAINVKNVASDGLSTVSTADAAKSSISSD
GNLVIVTTVGTIETLAIGGAINAAGNMLLDAKANLMLGAAVSSTGGNVSM
VSGGNFAQSAIGDLSAAGAGTVDVRVSGTMTMTDGAEITSGSGNIRLAVT
GSLQLGALSTGGDVSISASTITDAGAGTSDTVNISADEVYLSSTSSANGA
GVGTGSNHLELNVNKLAADVNGTGTGGLFITENDGLSTLSTSDIAQSNIS
SNGNLVIVTNAGNIETLATGGAITAAGNILLDAKANLMLGAAVSSTGGNV
SLVAGGNFEQSAAGDVSAAGAGTVDVRVSGTMTMTDGVEITSGSGNIRLA
VTSSLQLGALSTSGDVSISASTITDAGADASDTVNISADEVYLATTSTAV
GAGVGSGSNHLELNANKLAASVSGQGGLYITESDGLQVGALTAINVKKVA
NDGSSTASTADTAQSNISSAGNLVIVTSAGNIETLAIGGAINAAGNMLLD
AKANLVLGAAVSSTGGNISMVVSGNMSQSAVGDISAAGAGTIDVRVSGTM
TMTDGAEITSGSGNIRLAVTSSLQLGALGTSGDVSISVSTITDAGTGASD
TVNISADEVYLATTSTAVGAGVGSGSNHLELNANKLAASVSGQGGLYITE
SDGLHVGTLNAINVKNVANDGLSTVSTSDSAQSSISSAGNLVIVTNVGTI
ETLATGGAITAAGNILLDANGNLMLGAAVSSTGGNISMVSGGNFEQSAVM
VSAANAGTIDVRVSGTMTMNDGAEITSGSGNIRLAVTSSLQLGALSTSGD
VSISASTITDAGTGASDTVNISADEVYLATTSTAIGVGVGSGSNHLELNA
TKLAASVSGQGGLYITESDGLHVGTLNAINVKNVANDGLSTVSTSDSAQS
SISSAGNLVIVTNVGTIETLATGGAITAAGNILLDANGNLMLGAAVSSTG
GNISMVSGGNFEQSAVMVSAANAGTIDVRVSGTMTMTDGAEITSGSGNIR
LVVTSSLQLGALSTSDDVSISASTITDAGLSTSDTVNISADEVYLSSTSN
ANGAGVGTGSNHLELNATKLAASVSGQGSMYITESDGLQVGALTAINVKK
VASDGSSTVSTSDSAQSNISSDSTVVIVTNAGNIETLAAGGTLTAAGNIL
LDANGNLMLGAAVSSTGGNISMVSGGNFEQSAVMVSAANAGTIDVRVSGT
MTMTDGAEIKSGSGNIRLAVTSSLQLGAISTSGDVSISASTITDAGTGAS
DTVNISADEVYLATTSTAVGAGIGSGSNHIELNANKLAADVNGTGTGGMY
ITESDGLQVGALTAINVKKVASDGSSTASTSDTAQSNISSDGNFVIVTNA
GNIETLAAGGTLTAAGNILLDANGNLTLGTAVSSTGGNISLVSGGNFEQS
AVMVSATGSGTIDVRVSGSMTMVDGAEITSVSGNIRLTVTNGLQLGALST
SGDVSISASTITDAGTGVSDTVNISADEVYLSSTSSVNGAGIGTGSNHIE
LNASKLAASVSGQGGMYITESDGLQIGTLDAINVKKVSSDGLSTVSTADT
AQSNISSAGNLVIVITVGNIETLAVGGTLTAAGNMLLDAKANLMLGAAVS
STAGNVSMVVSGSMSQSAVGDVSAAGAGTVDVRVSGTMTMIDGAEIKSGS
GNIRLAVTSSLQLGVLSTSGDVSISASTITDAGAGASDTVNISADKVYLS
STSSANGAGIGTGSNHLELNANKLAADVNGTGTGGLFITESDGLQVGALT
AINVKKVANDGSSTVSTADAAQSNISSDSNVVIVTNVGTIEMLAVGGTLT
AAGNILLDANGSSSDVVIGADIKTPTGHITIKADDSIELASDVDITTATA
GTISVDAEGGTLRMAGNSTISAVGSSMRLAATGTVTVGNTTAEFVSIVSR
RGAIINAAGSTRNVTASDLRLQSYGSIGSANRHFTTQVVNLSIDPEEEGA
GIYLEELDDVVVTTVRVDVTEMTSVADTLGISDQSMADLVTSSNGTIVLV
TIDGSITLTDGDHNGVSISADGTGNVHLEANGADNNVIIEAAIQTDTGSI
TIVAAGDVEQQANIVTNGNLVSVQAEQGSITMDQNVQTITNNGTIEYRSY
EDVLLSLLHAESGSVAVYAETGSIENNTTSNTAPNVTSETALFKAGADVG
LREIQPVVISVERVAAEAVTGEMSLVNLGTVVIDVLEDADGNMVSGLSAG
DGISLESLQGSIVVAAPVDTKGTADALLTFSNGQLIGKSAYFDDAGTFLK
MQYKQFQFLWNGEGATIRQELLNMVVGRQVDSDIARYRESASERQTVSPA
RSTMPMRSYDPMESLRHVDVDVLEEQPGYVEVHNGYAFFRWAEVPGAQSY
LLVLERDKLEYASRWLEETAWAPFEELPEGIFEWSLYSWTTDGLQLVFGP
MQFTV
>Cag_1816 Adenylosuccinate synthetase
MESKNFKLPSPLATVIVGTQFGDEGKGKLVDYLSANYDIVVRYQGGANAG
HTICFDGKSVVLHLVPSGIFHEGCTCVIGNGVVIDPVALLEEIKTVENLG
YDVRGRLFISHNAHLIMPYHKLLDSLHESAQGDQKIGTTGRGIGPSYEDK
VARKGIRVVDLLNPELLKEKLRENLSAKNKLLRNIYDREEIDVEAMVQEY
EEFDKIIDPYVTNTQAYLTRELQAGKTVLLEGAQGCLLDVDHGTYPYVTS
SNPTSGGACTGSGIAPNYIGKVIGITKAYMTRVGNGAFPTELLDETGELL
GKIGHEFGATTGRKRRCGWLDLVALRYSLAINGVTEIALTKLDVLDSFEE
IKVCTAYLLDGKELHDFPTDHQTLAQVTPVYTTLKGWMASNAAARTFEEM
QLEAQRYVDFLEEQVQLPVTFISVGPGRDETVFR
>Cag_1975 putative 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase
MCAASSFTLNHYKPLSGFQFFGNDPSQADPSSRIDQALKGIQALLFPVDG
ILNGSKITFDHSGNELCTISVRDAIAIKEAVKLGLRIGVLSSRNAEGYRP
MLEALGVQDLYLNGEHVFYSYDAFRHRHSLSNEECAYIGDDIGDIDVLAK
VGLPATSIDGADYLRNRVAYISGFEGGKGCIRELVEEILTRQGKWPYIER
PDEEDETAEE
>Cag_0962 conserved hypothetical protein
MKKAGLIILVLCGIVLLFDMLLMPLYTTQGRSERVPNVVGMEFEDAERKL
EMAGFEAVRSYNAGYEVDVPANVVLSQTPEATMEVKPGRAVYVVVNRGAK
PAVQMPNFLGLSEGEARQEAARLELFPVDVVGTPVANSSDDGRVLNQSLP
AMTLVQSGMPLTLFVGRYDAEAVNAERIELPNLLGMSLGQAQQTLAEAGL
IIGHVVTERSRLLLPNTVISQRPAVGTLLAPGQAVDLTIVGE
>Cag_1018 SecA protein
MLKIFEKLFGSKHEKDVKKIQPTIQRINELQRALASLSDEQLRQKGRELK
QKVRGVLEPMELEQQKLFHQLDSPNISLDEAESVNNKLDDLAVAYETATA
SVLEEILPDTFALVKETCARLKGHTYNVMGRQFVWNMVPYDVQLIGGIVL
HSGKIAEMQTGEGKTLVSTLPTFLNALTGRGVHVVTVNDYLAQRDKEWME
PLFAFHNLSVGVILTSMHPALRRAQYLCDITYGTNNELGFDYLRDNMANT
PEEMVQRKFYYAIVDEVDSVLIDEARTPLIISGPVPNADNSKFQEIKPWI
EQLVRAQQQQIAAWLGDAETRMKTNATDPEAGLALLRVKRGQPKNSRFIK
MLSQQGVAKLVQITENEYLKDNSSRMHEVDDALFYAVDEKANTIDLTDKG
RDFLSKLSHQDSDIFLLPDVGTEIATIESNAALSTNDKIQHKDALYRLFS
DRSERLHNISQLLKAYSLFERDDEYVVQNGQVMIVDEFTGRILPGRRYSD
GLHQAIEAKENVKIEGETQTMATITIQNFFRLYKKLAGMTGTAETEASEF
YEIYKLDVVVIPTNASVVRKDMDDLVYKTRREKYNAIAQKVEELQKRGQP
VLVGTTSVEVSETLSRMLRTRRIAHNVLNAKQNDREAEIVAEAGQKGTVT
IATNMAGRGTDIKLGDGVRELGGLYILGSERHESRRIDRQLRGRAGRQGD
PGESVFYVSLEDELMRLFGSDRVIAVMDRLGHEEGDVIEHSMITKSIERA
QKKVEEQNFAIRKRLLEYDDVLNQQREVIYSRRKNGLLKERLTSDILDLL
KDYSDTIVKKYHKDFDTAGLEEQLMRDLSIEFQLDRATFEREGIDAVVDK
VYETALTFYRRKEESLPADIMCQIEKYAVLTVIDQRWREHLREIDSLREG
INLRAYGQKDPLIEYKQEAFRLFITLLKEIEAETLSLAFKLFPIDPEEQQ
QIEERQRQSAIRQEKLVAQHDVAESFVGLNDDDEPLPAQPITTEQKPGRN
DLCPCGSGKKYKACCGQ
>Cag_1683 thiol:disulfide interchange protein DsbD
MKQTMIGRVVALFFMLIVALVQTLPLSAAELLGPDEAFRLQAELQDKRAL
RLQWTIANHYKLYREYVRVSVTEGKAELQPLTLPKGIMTTDPVSGEKIEI
YHDQLTVSLPMLNADAPFTLNVSYQGCAEDGLCFPPITKRFRVNPEQVGK
LTPLADASLDGGANPFAAMQQEATSESATSAKASTPQPKNQPENDFSLAT
SALASGSLWQIVPLFFLFGLLLSFTPCILPMVPIISSIIVGEGNSSRSRS
FLLAVAYCLGMALVYTSLGVAAGLAGEGFAGFLQKPWVLILFSLLLFVFA
LSMFDLYQLQIPTALQNRLCKASGNLKRGRFVGVFFMGALSALLVGPCVA
GPLAGTLLYISQSRDVLLGGFALFAMATGMSVPLLLVGVSAGSLLPKAGT
WMVGVKYLFGVLLIGVAIWMVTPVLPMALQMVLWAALMLLSALFLGLLDA
APEKATVGMRFKKTAALLLLCGALVEVVGAASGGSNPLQPLAHLRPSAGS
SDAPANNQLHFTTVRSLAELETILQSTNKPVMLDFYADWCVSCKEMDAFV
FEKPEVQQALSSMQLLRVDVTANNADDRALLKRFNLFGPPGIIFFNAEGK
EIAGSHIVGALDAEAFLQHLQTLP
>Cag_1218 putative transcriptional activator, Baf
MDISASTDRLLLVVEIGNSSTSFVVFQGDQSLALQKVATNLLTTVDGVAA
SVEPIFAAHPMLVDAVVCSVVPQAEEAVVTYLHGSITGKVMQVNSALKLP
FTLAYEDVTTFGADRLALCAWCCLSHTAYAFIALDIGTACTIDVLNSKLH
YLGGMIMPGLELMARSLHEHTARLPLVDVSTVSLSLLGNSTTECMQLGIV
WNFTLGLEKMIESIKMYLEYEEHDREILLVATGGAAPFVTSLFTMQCQVE
ELAVAHGARLLFSYNQ
>Cag_1505 hypothetical protein
MATIPHYTPNDFMHEIEQVPPQYLPQLFQIVHIYKESITKKACLDSFEQS
WQQAIAGNTMPISELWEDIDAE
>Cag_0719 conserved hypothetical protein
MIILIGSQKGGCGKSTLAVNVACALALDKGADALLVDCDTQSSVARWVQD
RQTHAALKNIPCVQISGDVRITLHDLAKRYDHLVVDVAGRDSVELRSALS
VADMLLSPIRPSQYDLDTVPHLAEVYSRAKDFNEKLRASLVLNLCPTNPV
IKEAQEAETYLQDFAEFAVAKTRIYDRKAYRDSVAEGQSALEWKDSKAAD
AIRQLMMEVMPND
>Cag_0378 helicase domain protein
MSNAKIFYHDIGDYYSREEKLALIKKYHSLAHPNMQWQQLQPNEHGDWIS
QRNDLFETFIPLGDKENKKADTFFVPFYSRGLASARDSWCYNSSKTTLEN
NIRTLIEFYNQQRIAYFNTIDKDSKITVENFIDYDSSKITWNRGLKNDLE
KNKAIDFNRDYIITGLYRPFNKQKIYFARELNDMVYQIPKIFPASNSKNY
VICVSGVGASKDFSVLITNCIPDIQLQFNGQCFPLYYYEKQEKSNPTLFD
AAKEPDYIRRDGVSNFILEQAQKRYGNRVTKEDIFYYVYGILHSPDYRTR
FASDLKKMLPRLPLVENVKDFWHFSKAGRELAELHINYEAVPPAKGVILL
YNNIPTEEIEKGLQSSKMQEINYMVTKMRFPKKDQKDTIHYNNQITITNI
PLKAYDYIVNGKSAIEWIMERYQITTHKESGITNNPNDWATEVGNPRYIL
DLLLSIINVSLQTVEIVNNLPKLEF
>Cag_0134 hypothetical protein
MVERIQNWIKTMLGLTAMNTPAHPNQDLPTWLRWLSSLLGVGLSIGSLVM
LYCPPEKASRELDGKGTVIKVLLESTDVTTPFLSIFLAGVALVVFGINGI
RFAKITAAGVSAEAPDATAAATNYYKAPSEDRPQTEVQVAEKESPDPTDV
PAGYLEAEDGGKYAVYKLNEVPSSVITDALASWPTEDSKPEDLSGFEFAT
RKTGKGNHPWTLKFKGKKAVIVSYGGFAKPGATVSHPE
>Cag_1900 3-isopropylmalate dehydratase
MAQTITQKIFAKSAKRPFVDPGESVWLNVDVLLTHDVCGPPTIDIFKEKF
GSNAKVWDPEKVIILPDHYIFTANEHAHRNIDLLRQFAKEQGLPHYYDVG
TDRYKGVCHVALAEEGFNLPGTVLFGTDSHTCTSGAFGMFGTGIGNTDAA
FILGTGKLWEKVPESMKFTFEGEMPAYLTAKDLILQILGDITTDGATYRA
MEFDGEAIFSLPMEERMTLTNMAIEAGGMNGIIAADNIAEEYVKARTKKP
YEIFQSDPDAKYHSTYRYNVRDLEPVVAQPHSPDNRATVRSVAGTKITKS
YIGSCTGGKLSDFMMAAKILKGQKVTVTTTIVPATTLVARSLETEQYDGK
SLKQIFEEAGCNVALPSCAACLGGPADTVGRSVDNDLVVSTTNRNFPGRM
GSKHAGVYLASPLTAAASAITGKLTDPRDFL
>Cag_1263 hypothetical protein
MSKQDLKLMPRFLKGLPNIWTILGWILIISSFLKQPVVAGILIALAGHLL
TQAKHRDDLKEKQSLFYLDAWVKAYEEAQSLLKDGNNDRVEWIAAARALL
HAEQLEKKITEDAHLELLDLYKLKYRHFFYSIIDNKPESFFHDENDRDKA
LSEPSVYAIWKAAQWSEEYNDHSKDPLKRKFADDEVEKTKFASIGLYRFL
KKKRTR
>Cag_1489 conserved hypothetical protein
MFASSSSRKKSLPVVLVSKKTKSRIMQVVRRMVVVLAVVVGIAAMSFVSI
GLLLSLPASKPQKADVIVVLGGDKGLRVQKGAELYNDGFSKKVLLTGIDR
RYYRPSRPNWRERRIRDLGVKKSAIVVDTKSQTSWEEAMNTAATMERRKW
ESAIVVSDPPHMLRLWLTWHHAFAGTSKRFTLVATKPDWWHPIFWWKHKT
NYKFVVSELKKNIAYVVFHYIKWSDDPNQDVLTER
>Cag_2028 23S rRNA methyltransferase/RumA
MMADMQYRKGDVISLQVIDLAEKDSAVGRLESGITVMLSGMVAIGDHVSA
RITKVRQRYLEAATLDILEPSPERTTPVCSYFGVCGGCKLMHVCNEAQLR
YKQKKVSDALKHLGDFVEPPVASVLAGTSPLHYRNKMEFSFAAKRYLMPE
ELSLTELSRSKEFALGFHTPGNFEKVIDIDYCYLARNEMNQVLELTRRFA
LQHALPPYSVKTHTGFLRNLVVRFSEHTQELMVNLVTSWHDAEVMARYSA
MLHHAMPEQKMTVVNNITSRKNGVATGEEEVVISGEGFITERLCGLDFRI
SANSFFQTNSAQAEVLYQTLLAVADLQPTDTVYDLYCGTGTITLCMAAHC
HQVIGLEVVESAIRDAQNNALRNGITNAHFLLADLKDFHTLLPLLEEHGK
PRVIVTDPPRAGMHPKALATMVQLQPERIIYVSCNPASLARDGKELAAQG
YSLRSVQPVEMFPQTNHIESVGCFVKV
>Cag_1434 conserved hypothetical protein
MNHNGDVKERAKDILEETLDREAVIVLARISEEMALLFQAHPEPSKEKVV
EIVTGFFLENGKSEEFIADWIRTSEEYCHARGVALNDQPAAILSDFGVFR
FMSFLKDKGLTDEQITIVLTGAVQQAASGQE
>Cag_1459 Riboflavin kinase / FAD synthetase
MRLITYDNHHVYAYGSSEPLTFLPQPSVVTVGSYDGVHCGHRVILSRLVE
VAHHNNLRSVVVTFEPHPRTVLKGALTGPLGLLTTLEEKSDLLAAAAVDL
LFVVRFTHDFAARTSDDFIRNVLVGLLGAERIIVGYDHAFGRDRSGSHNT
LERLGNELHFGVEVIDEVLIGNEHLSSTRIRKLLQDGRIEEVNEFLGSPY
LITGWVVQGAQLGRTIGFPTVNLQFHPAKLLPRYGVYFARTMVQGVPYMA
LMNIGKRPTVSSNGEATIEAHLLGFEGSLYGEELRFSILRFIRDEKRFAS
LEALQEQLEKDKKAVEMYLE
>Cag_1236 hypothetical protein
MKKIYTTLRQVVKATAFLGMLATTSPVQAQEVTYNTEGWYGTAALSKIIN
TESSGMQANLGSGVIRPGEIDYNGNFVGMLAVGHENSFCRKNNTPIYLRT
EGDYLMGSADRKSATVDQYHTVLDDSVDFRALFANALLGIKDTQHTRWWL
GGGIGYGWVDRPAITGCSSTCSFAAATTDGFAWQLKAVVERTISKDAALF
AEARYVALPGESNSTSQCYDDINVATLGIGFRSYF
>Cag_1608 DnaB helicase
MISKKAAPIIDFSKDIDFSQESRIPPYSTEVEQEVLACVLLEDEPIEQVI
QIFGESSEEVFYERRHQTIFRAMMQLYHKRQAIDIITVSEELLRMGELEV
VGGRHYLAELSGKVISAANIEYYARLAKEKFLYRRLISIATKISGVAYNS
SMDIFDLVEHASQQFFTISQAGVKKKASPIKELVKTGIRMLENLRASQSS
VTGVASGFSELDQFTAGFQPSDMIIIAARPSAGKTAFSLALARNAAVDFN
TPVLFFSLEMAEVQLAIRLMCAEAYVESQLVRTGRITPEMMGRIINSMDK
LNEAKLFIDDTPGISIMELAAKTRRMKQEQNIGMVVVDYLQLVTPVRDGR
TNREQEIAQISRSLKALAKELNIPIIALAQLNRSVEQRSGDRRPQLSDLR
ESGSIEQDADVVMFLSRPEMYGKNTFEDGTSTKDIVEIVIGKQRNGPIGD
IRLLFLKNYGRFQSTANVYITANAEAESAPQAEPERYLQPSQEFPPPASG
GAFIAQDDAPF
>Cag_0627 outer membrane efflux protein, putative
MLHHKKRGVVTFTGKQLLVVVLLFLLPFAGVQGAENSVVKSGNAVTLEEA
LQIGLQRNRTLEVARLDRDIAHQKIRETWADVLPKLTLSGTYTRSLKPSV
LLLPPNPLFPSGELQTSSDNAAFVGLDLRQPLFNASAMAGIRAANIVRSL
SDASYRKTEMAVLTDIKLAYYDVLIAREQVKLIEQSIARWEQSRRDTRAL
FRQGIAADIDTLKAFLSVENLRPDFIQAESRVASAMTTLKNLMGVPADSA
IVLSGKLELPSGTKASYPATTELAAREAFEQRPDLRQIALQADAEAENVN
SLKAERYPLLSLFGKLEAQTSFNDGINPSESRWPVSSSAGVQLSLPLFTG
YRTSARIEQATLSRRQTLTRLEEQKASVRAELETALLHLHEAQQRIEVQS
KTIAVAERSYTISRLRFREGIGSRLELSDAELQLVKARTNYLQAVYDYLV
ATTRLDKSLGRRSALLPLTR
>Cag_0338 Succinyl-CoA synthetase, beta subunit
MNIHEYQGKDILRKFGVTVPKGIVAYSADEAKQAAEQLFAETGSSVVVVK
AQIHAGGRGKAGGVKLAKSPEEAFEIARQMIGMTLITHQTGPEGKVVSRL
LIEEGMGIEKEFYVGITLDRATSRNVLMVSTEGGMEIETVAEETPEKLLK
IQIDPLFGMQGFQAREAAFFLGLEGEQFRNAVNFITALYKAYTSIDASVA
EINPLVVTSEGKVIALDAKINFDDNALYRHKEYMELRDTSEEDPFEVEAS
KSNLNYVRLDGNVGCMVNGAGLAMGTMDIIQLAGGKPANFLDVGGTASPQ
TVEEGFKIILSDKNVRAILVNIFGGIVRCDRVAGGIMEAAKKMDLHLPVI
VRLEGTNASIAQQMLDESGLSLIAAKGLRDAAQKVQEALA
>Cag_0200 competence protein
MLTLMSPLVTPRATQLVQRKVLPPLHALRQLLPSGSIPPLRNVRHLLFPN
VCLVCEQLLQPHEEHVCGACYASFDAFASPELAEYYVRRTITDHFCFPTF
FERAWSRYKFHKESDLQELLHSLKYQGIFTLGVTLGKQLGEWLHSADLPD
DIECIVPIPLHPLKKIERSYNQAEKIAEGISQLLNRPVRSSLLTRQRYMV
SQTGLSATERQQNAEGAFCAKAPLRIGHVLLVDDVLTTGATMVAAAQALH
DAGVAKVSIVTVAVAAKEM
>Cag_1532 conserved hypothetical protein
MLFAGLALAAFGLLITIASKAGATNWLSWFGNLPLDLRIEKENFNLYFPL
GSMVLISLALNLLIYVFNKLFR
>Cag_0736 hypothetical protein
MIRDLFFNIAPAISTLFKLGFEPKEECFYELTIDQYEQLRKDGEDITETL
YMILPEESKYMDNDIIVVNEQEKNSLLKAKKVIENYCEKGGKVFNSYQDK
LTYVSNLLPSVFTEDSNFRKCHLKLVEPNQSK
>Cag_0731 Adenylylsulfate kinase
MNQNIFPVFDEIAGRSEKEQMLQQHGCALWFIGLSGSGKTTIARHIEQLF
LQQGILTQLLDADNIRTGLNNNLGFSEADRVENIRRVAEVTKLFVECGIV
TLNCFITPTNEIQAMVKDIVGKNNVIEIFVDTPLSVCESRDVKGLYRKAR
EGQVAHFTGISSPFEPPQNPSIHLQTNEMTLDLCVQKVVEYVLPKIH
>Cag_0416 Endonuclease III/Nth
MNPQEKIIALHDLLSKQFPNPKSELEYLSPFQLLIATILAAQATDKQVNV
ITRELFKRAPDAITMSRMELEEITGYVRTINYFNNKAKNILEVSRRLVEH
FGGEVPQEREALESLPGVGRKTANVVLANAFGMPVMAVDTHVHRVSNRIG
LVSTKKVEATEEALMAIIPEAWVADFHHYLLLHGRYTCKAKKPACPTCTV
AHICDFAE
>Cag_1818 Cysteinyl-tRNA synthetase, class Ia
MALAIYNSLSRTKEIFEPLHSGVVSIYVCGPTVYGHAHLGHAKSYISFDV
VVRWFRQSGYKVKYIQNITDVGHLTDDADEGEDKIMKQARLEKTDPMEIA
QFYTRSFYEDMDRLGVERPNIAPTATAHIPEQIALVETLLRKGYAYEVNG
NVYFSVNSFAGYGKLSGRTDQEALQSGSRVGIRSEKRDASDFALWKKAEE
GHIMKWQSPWSVGYPGWHLECSAMAMKYLGETIDIHGGGMENKFPHHECE
IAQSEAATGKPYVRYWMHNNMVTVNGTKMGKSLKNFVNLKELLQTRNPLA
LRFFILQSHYRSPLDYSEAALDAAAQGLEKLHETLRRFRHQAAGSGSLDV
TPYAERFSEAMNDDFNTPIAIAVLFDLSKAINSALDSKGIVEADRAAIEA
FLTIAATNTLGIASNNLADEQSNGNSMQRLDKVMQIMLELRHNARKQKDF
ATSDKIRDMLLAAGIEIKDTKEGAVWSVKG
>Cag_0109 Peptidase M22, glycoprotease
MIILGIETSCDETSASVLHNGVVLSNIVSSQHCHTSFGGVVPELASREHE
RLITAITETAINEANIQKDALDVIAATAGPGLIGAIMVGLCFAQGMACAL
NIPFVPINHIEAHIFSPFINSGANSPLPKEGYISLTVSGGHTLLALVKPD
LSYTIVGKTLDDAAGEAFDKTGKMIGLPYPAGPVIDKLAENGNPNFYHFP
RALTSRSKSRKSWEGNLDFSFSGMKTSVLTWLQQQSPESVASNLPDIAAS
IQAAIVDVLVEKSIAAAKHYNVSTIAIAGGVSANRGLRSSMQAACQQHGI
TLCLPETIYSTDNAAMIASIAALKLSHGMEPLYRYNVAPYASFLHKDNFS
>Cag_0850 ATPase
MIQHLLRVSHVSKQFGGNVAVSDVSFSVEEGQIFGLIGPNGAGKTTLFNC
ITGYYPASSGDMIFDGKNINSLRPDQVCKLGMARTWQRVKPLGSLSVLDN
VMVGAFAITSNIKAAESLAKEQLNKVKLNDYADKQADSLPIGLKKKLELA
RVLATRPKMLLLDEICGGLNHTETDGILEIIRGIRKGGATVVFIEHDMKA
VTSICDRIVVLNSGEKLAEGTPAEITSNPAVIAAYLGGGHHA
>Cag_0095 Single-strand binding protein
MAELKMPEINSVIIAGNLTKDPVFRQTNSGGTPVVNFSIACNRRFRDSNH
QWQEDVCYVGIVAWNKLAESCRDNLRKSSAVLVDGELQSRTWKAQDGSSR
TVVEIKARRIQFLNKKHKNGEDDVEGFIEDECPDQHHETLQDEDADYLYD
CK
>Cag_0016 conserved hypothetical protein
MYDTTLLLEKLEQIDLAIEKIKRRFTTIKRPDDFLDSEQGIDMLDAIAMM
LIAIGENFKIIDKATNGSLFVPYPHINWAGVKGLRDILSHQYFNIDAEEI
FEICQKHLDDLHEVVKHMMEALPS
>Cag_1689 RNA binding S1
MFVQKSIDLGHGRILSIETGKMAKQADGAAIVRLGDTMVLATVVSSKKTP
PLGQDFFPLQVEYREKYSAAGRFPGGFFKREGRPSEKEILSARLIDRALR
PLFPDGYYYETQIIISVISSDQVNDGDVLGGLAASAAIMVSDIPFQNPMS
EVRVGRVNGLYIINPDVNELKESDIDMCIGGTNDTICMLEGEMNEISETE
MLEAIQFGHTAIRKLCVLQQEVAAEVAKPKRLFVPTVIPTELDSFVRANC
QSRLRELAYTPLGKEERAERTAAIYNETLAATLEHFRATISSEEAQADTA
KALCLNEHIIEDQIHAVEKEVMRHMILDDAKRLDGRALDQVRPITIELGL
IPRAHGSALFTRGETQALVTLTLGTQKDAQSVDNLTNSDDKRFYLHYNFP
PFSVGECGRLGSIGRREIGHGNLAERSIKMVAPTEAEFPYTIRIVSDILE
SNGSSSMASVCGGTLALMDGGVPIRKPVSGIAMGLIKENDKYAVLSDILG
NEDHLGDMDFKVSGTRDGITACQMDIKIDGLDYHIVENALEQARKGRLHI
LNEMEKAIPASRTDLATYAPRLTTIKIPSDCIGMVIGKGGETIRGITEET
GAEINIADDGTVTIACTTKEGTDAALATIKSLTAKPEVGNIYVGKVRDVR
DELGAFVEFLPKTDGLVHISEISSTRVAKVSDHLKVGDKVTVKLVDVRKD
PRTGKTKFALSIKALEQPKQEGGEATQN
>Cag_1656 Glycine cleavage system T protein
MKKTALYPCHEQSGAKIIDFGGYLMPVQYAGIIAEHKAVRSAAGLFDVSH
MGNFFVKGSRALEFLQFVTTNDLAKVVDGQAQYNLMLYPSGGIVDDLIIY
RMSADTFFLIVNASNADKDFAWLQQHIDQFEGVTLEDHTERLSLIALQGP
LALSILNRLFPSIDGEALGSFHFCSASFNGFDVIIARTGYTGEKGVEMCV
PNEAAIALWEALMAAGAADGIQPIGLGARDTLRLEMGYSLYGHEINQDTN
PLEARLKWVVKMDKGHFIGKEACEQAMQHPQRTVIGFSLEGRALPRQGFT
LYNSDRQAIGVVCSGTLSPTLQEPVGTCSVLREYGKPGTPILVEVRGAFH
AGIIRSLPFVTNTSLA
>Cag_1327 ATPase
MVSAFAQEHCAIKIPSTLFMKQLLSLKPYLLKYKKNLWSGFFFIVLHNLF
AVAAPTFIGKAVDGMSGALSLSAILTDVGFYFLLTLLGGYFLYLVRQRII
VTSRHIEFDLKNNYYQHLQRLPRSFYNATTTGELISRGTNDMNAIREFVG
PGIMYSFNTFFRLVFAIVAMVAISPSLTFFALLPAPLLSYLVYKIGASMQ
KRSKSIQESYAAITNLVQENIAGIRVVKSYNREAFEIERFRLLNSDYYGK
NLALGKLQALFFAFLTSMTALSLLPVIWVGGMYVVDGTMTVGGIAQFIVY
VTMLSWPIISIGWVTTIIQKAATAQVRLHEIFSMKPEVHKDEKADALPQK
ATLRFDNVSFHYAGQAEHPVLKRLSFEIPAGSKVAIVGATGSGKSSLVNL
IPRLYDPTEGAITLDRYDLRTIPLQHLREMVGFVPQVNFLFSDTIAHNIT
WGSRGEAHDEAAVEAVIEASKIAMLHADVEDFPDHYHTMLGEKGINLSGG
QKQRACIARAVAWQPKLLVLDDALSAVDTDTEARLFEALLQKLPDTTIVL
ISHRISTVKNCDRIMVLDHGEIVESGTHEELLERQHLYAELYNQQLLEEE
ILSM
>Cag_0746 hypothetical protein
MNFDKDVSQQQLLTNIFIISWAGQHENAIFIANQISFVTNKITIVYSDPN
SDFLLDVPCVLIKRPNDLFWGDKFEASLHACKDDFMLVIHADCKCDDWKG
LVIRCNEIFSKNKDIGVWAPKIEGTPYYLERTKIASIEYNALSLSLVAQT
DGIVFALSLPVVNRMKKINYMNNKYGWGIDWIFCCTAYALNLMVVVDEKH
TVIHPLHRGYDTRQAVMEMNTFLKQLTTVEFIQYRLLSSYLKLSDIKTIA
KV
>Cag_1360 conserved hypothetical protein
MFEWDENKRLKNLEKHKLDFVAAVPLFDGRSTVTATSNSANETRYVTTGF
IDGKFYTVIWTWKGSIRRIISFRRARHAEERAYCTLYGI
>Cag_1012 Cobyric acid synthase CobQ
MVTPTKTYRSLAILGTASDVGKSIVATALCRIFRNAGIDVAPYKAQNMSN
NSGVTPDGLEIGRAQIAQAEAACVVPTADMNPVLLKPNTDIGAQVVLQGR
VCSNESAQGYFRDTSRWAEAARESLLRLKQKHELLVIEGAGSCAEMNLYP
RDFVNFRTAREADAAVILVADIDRGGVFAQVVGTLAVIPPEDRALVKGVI
INRFRGDSELFREGITMLEEMSGVPVLGVIPYFRHFAIDAEDAVPLSAKV
DPAQEPETGKVGVAAIYFPHISNFTDLSPLEHDPSVTLHYLHHPKPLSAY
KVLILPGSKNVRGDYAWLQQMGWEKEIRAFREAGGLVIGICGGYQMLGCS
IADPYGVEGESGTTQTLGLLETETLLEQEKYLANSEGVLVGTSIKAFGYE
IHNGRTTVGANCHPLMNIVARNNHPESDVDGVVSADGRVIGTYFHGIFNE
PAVKQWFLQQADSTYTLQPHERGRQESYELLADHFRQHLDVNKLFELIDY
EQSLLP
>Cag_0084 transcriptional regulator, XRE family
MPKNDLTYQPVIHNHEEYLREELKDEEFRKGYEALGPTYTLIRELLLARQ
QAGMTQEAVAEKIGTTKSAVSRLESAGKHVPSLSTLRKYAEAVGCELEIK
LTPKKQGFA
>Cag_1436 DedA family protein
MLDAVVLWLQSADPSLLLFILFITAFLENVFPPIPGDVPIAFAGYLLYLQ
GGEGFTQSLLWSSLGSSAGFMVVFWLSKTVGQKWYGSEAQPHSSRLAKQV
LRLFPPSDMELLRQKFAAHGYVAVLANRFLFGSRAVIAVMAGMMHLHAAG
VFAASLVSAVLWNLLLLSGGFLLGSQWQDIGNYLVLYSAPVSLLFLGVIA
FSVWSFMKQRKKHHHSDH
>Cag_1717 conserved hypothetical protein
MAITFLQMAEKVLEEEKQPLTASEIWQIATEKGYDKLVESKGKTPWATLG
ARIYVEVRDNPSTDFIPLATRPKRFSLKTQMSILGGKIPETTKAPQLHTP
KIEFLEKDLHALMVYYGFYYLKAYLKTISHNKSDKKGFGEWVHPDIVGCY
FPYKDWKAEVVEVSSLMGNTAVKLYSFELKRELSIANIRESFFQAVSNSS
WANEGYLVAAHIDNDEDFRSELKRLSTAFGIGIIQLDIDDPDSSGIILPA
NSKDVIDWDTVNKLAGINPDFNDFLKRVRNDMKNQEIRKELYDVVVEKEE
LKKLFTKKKSSS
>Cag_0766 conserved hypothetical protein
MTIESISQRQSARNELISSLLARCPMNVEATGSHRSFIMDKRGEGVPIII
TESEKLSGRRPEYQLLDDAELKLTIFATPSPHGDE
>Cag_0691 hypothetical protein
MSKREELLQSYENGNFLETVYACSSNDHNDRSSVVFDLVALNNEGLIDVV
GAFQSLKNESSNSPDFFLTRHVFEKALPELEASVPAVMHCVLQLYLDAGQ
DLAASTVINSFFDFCTKKASRPHEALEVIKASPGKLAHLLPATLIAGSQI
DSSFYLCETMRLCKDENIELKRWALFSIGKLNLPEDIKKFGDALSALEYA
AVQETDDQILSSVIKSAFPLLQRDKSQEPRAIAIIISALRKGDDYVLHAA
SEIFGFYTGELPTTLREAFFVDLLRVKPTHKGTLDNVDYGISHLLKNGNS
EQAIQFLEALLLRHSGELTIEVFDSAISEIVSNKAFISKVLTRWFLRGDR
VLCEAVHEIVWAHHGSGLLLEIDVTELNPSDSGHILFIARKAIGYLFMQP
LSAASVLISLMRNATDDKVLKKLGELLLDPLLLNYTGKARDYIIKQSGSE
SGKVKETIDNALKDINTYLEELRSVGSLSALHPSEAQREAYNRHYSQLMA
ESWKAAEAKSVVLNLFPKSVLLYGKKLINYVYGSDGQSHRQEIPLQCLGS
EMEYARMHIFDPFGMDYMLRVFRNEQLKT
>Cag_0408 sodium/bile acid symporter family protein
MWKMLGWLQKHLIWSIPLAMVAGLLFGSVTVPTLLRMAILPLTFLMVYPM
MVTMNVRDLLKAGDNTLQGITLAINFLLMPAIGYGMGLLFFPHEPMIRLA
LLLTSLLPTSGMTISWTGFAKGNIASAIKMTVLGLIAGSLLAPLYLKILL
GAVVEIPLLQVFLQIGLIVFLPLVLGTLTRNWIIKRYGSEKYNQQIKPKF
PPFSTIGVLGIVFVSMALKAKSIIAHPVVLPNLLMPLVVIYAINFMLSTL
VGKLFFQRNDAIALVYGTVMRNLSIALAIAMGVFGEKGTDAALLIALAYI
VQVQAAAWYVKFTDRLFGAAQHAHA
>Cag_0287 conserved hypothetical protein
MDNALHWINCLQLQPHPEGGYYRETYRSSGNYSFSDSAPQTSESTFFQGE
RSYATAIYYLLQSGERSRLHRIHSDELWFYHAGAPLTVHIFPETGEPSCF
TLGLDVAQGEVPQAWVPAGAWFGASGASLKNASVDDYALVSCVVAPGFDF
RDFTFADRHELLRKFPQYSTTIERLT
>Cag_1577 conserved hypothetical protein
MIWEVQQTRRFARQYKKLHNNLAADVAMAIDVVKQNPSIGERKKGDLAAL
YVYKFRSAGQLYLLGYSLDEHIRLIYLEAVGPHENFYRDLKR
>Cag_0897 conserved hypothetical protein
MAKKQSFVDKTKKAGASDFKTAKVIFSVRSEKTNAWRFIEKNVRIPNGEN
DQEVISKAIAGFSK
>Cag_1938 Sel1-like repeat
MNVMKKFITSIVIASLTLLAINGFCETPSQKQISQWQQAAAQGNSEAQLN
LGYAYDHGEGVKQDYAEAIKWYRLSAAQGDVKAQFNLGVMYYNGEGVKQD
YAEAIKWFRLLATQGDAIAQFNLGVMYYNGEGVKQDYTDALKWFQLSAAQ
GNAMAQNNLGVMYAKGEGVQQDYAEALKWHRLSAAQGNAMAQNNLGAMYY
KGEGVEQDYVEALKWYRLSAAQGDAVAQWILGLMYYEGQGVRQDYGEAIK
WYRLSAAQEDAKAQYNLGLMYYNGEGVKQDYAEALKWHRLSAAQGNAMAQ
NNLGAMYAKGEGVQQDYAEALKWHRLSAAQGDATAQGILGLMYCEGYGVR
QNYGEALKWYRLSAAQGNAGAQYNLGLMYYNGTGVRQSKAIAKEWFGKAC
DNGFQDGCDAYRELNEAGAKTNRSR
>Cag_0282 Aspartate-semialdehyde dehydrogenase, USG-1 related
MSSSESRCRVAVFGATGLVGRTMMQVLEERNFPVTDLVPIASPRSAGQKI
PFKGKEYTVCAASAEAFRDVDIALFSAGATASKEWAPVAVAAGAVVIDNS
SAYRMEPDVPLVVPEVNPGDIFLADGTPAPIIANPNCSTIQMVVVLKPLY
DRYGIKRVVVSTYQSVTGKGKAGRDALESELAGEEQEQFTHFHQIAFNAV
PQIDVFTDNGYTKEEMKMVNETKKIMGDDSISVSPTTVRIPVYGGHGESL
NIELAQEFDIDEVRTLLANSSGIIMQDDPSARLYPMPLTSYERDEVFVGR
LRRDYWHPQTLNMWIVADNLRKGAATNAVQIAEVIANR
>Cag_1615 conserved hypothetical protein
MQSFKHISLKALAGLLLSVLLVALMALVLLNSGSVDRAARALAMQLFQKE
LHGRLEIDELHLTFPNHVTLLHPRVYAPNEREPVVEAARMTARFHFLALL
QPDIKKLAFQSLEAQRLKVRLVQNEQGSLNIERAFASRYPDTTKTGIEEY
FCKQLSLKQASFSYSFIKDGKEIPLARANNINAQLRSFTAGKALVKGEIQ
EFQSNIASHALVVQKMQGRFFFSDKRSEVLDLQTRIGNSHAILSATLEGV
SLFKPSLLQQVAQSNAFVAIEEIDLHSNDVKRLFPSLPLLEGIYQVKATA
KQQNGTLELREAQLVYRKSKLALQGTIQHPFESNKLQYNLQCDSSKVSSE
LLTALITNKEDQQLVTSLKSIGDITLAGKLSGNLSALQADVQSLTNVGMV
GFKGAIERQPNNSFAAHGNVALNALKPHLILGMADVKSQLNATGTMELLV
EPNALPEVALALQLQNSFWQHLNVTKGTLSFRHKKQLYEGALSLSNGSEN
LAVQGTVNLNSAQPTYDLTGTTYKLNVGQLLQSKNFSTDLNSRFTLQGEG
FDLRQLNLQSSVVCAPSVINDVVLPNGTAATLSIAQQGTASRVKVTSDFF
DVTAEGHYTFEDLTALGWMALSGISHEVARLNIWGETAQLNPPANMLTPQ
PFTVTYQLALHNIAPLLSIIAPLQQIAMQGTAQGKAEHNGGAYTIAGTID
LNNLIVEEEFAAKRIHLQGSLRGNSNGILEAQGRGAIAALRVGKQKVRNT
NVTAAYVPTTLTSSIDVDVADVVQRVSTSFAMKQQGSGYLLDVQQLNVQD
REGSWQANNNLPIVLDKEVLRFNNFTLMRGAQKAVLQGELSNNRASAFTC
TLSSLQMNELQHFMLNAGLEKLQGIVSATLHISGVPGAKQSSISVRADNV
AYDDLMIGIVQGSARHSNNLLHFELQSEAPAMVNGIQSAQRSNALLSTIE
GSGTIPLELKYYPFKLRVNEQQNVHATFRSDNLSARFLEYLLPFFSAAEG
TIPTLCTIEGSAAKPLISLHSRLQDTRITVKPTHVSYRLDGDIYATPQAL
ELRNITLSDNNNGKGSIRGFVHLEKLQPSRLELAASCNNLLFYNKKDQQD
DTSFGSIVGSTRNFTLTGSLRSPIVEGEVQIDRADYSLYSAGANESAQYV
GIDNTISFVARNPKPKAPKAKELKSGGSKEFYYSLIDILTIHNLRISSPM
PLKYTTIFDRIRGEELETTLSNLSLVVNKNSQRYRMFGSVQVTSGTYRFS
NASFELQPGGSITWNNVDMRSGVLENLYGRKYINALNPQSGERDDVHLLL
AMTGTLNEPQVAMGYYLNDQTQPYASSTTIGTETSKVDPNAELNVISMLL
SRQWYSKPGDGGTQENVALTSASFSAGTGILSSRISRVIQTIGGLESFNV
NVGMDKKGELSGLDLYFAVNVPGTDGKVRVSGSGSANDPRTANASTAYGS
NQKVEYRVTPKVYLEASHSSGQNSGISSSSSTLQKPTDTWGVSLSYKERF
HHWDQFWKKLLPFSSDKASDKTPNKVPDKASNKKENKPNE
>Cag_0839 conserved hypothetical protein
MGMMNHTESEKSSLTAYNQELALVVIPLLKGVIYQEENPSLWEVLRNELA
GVRDYVAVLGLELILDEAEGYAFLRSRSEGETEGANNAPRLMARRQLSYP
VSLLLALLRKKLAEFDAGAGDTRLILSRDEVVELIRIFLPPASNEVKLID
QVDATLNKIADLGFIRRLRGERQMIEVRRIIKAFIDAQWLAEFDERLNEY
LQRPVNAMERENE
>Cag_1043 conserved hypothetical protein
MKKRSALTLLLLPIAASALLAGCSPTVKIEASDKPITINMNIKIDHEIRV
KVDKELDSLLNQKSALF
>Cag_1664 Beta-ketoacyl-acyl carrier protein synthase III (FabH)
MKAAITATANYLPPDILSNHDLELMLDTNDEWIRSRTGIGERRILKDPTK
ATSYMCGEVARQLLEKRQLAATDIELIIVATITPDMPFPATACLVQDLIG
AKNAWAFDLNAACSGFLYALNTAIRFVESGAHKKVMVIGADKMSSIIDYT
DRATAILFGDGAGGVLVEPAADDSCGVLDVRLYSDGTNGRERLLMTGGGS
RYPASHETVDKKMHYIYQDGRQVFKAAVTAMADVATEIMQRNNLTADDVA
YLVPHQANQRIIDAIAERMGVDRAKVVSNVGYYGNTTAGTIPICLAELDA
NNKLKHGDNLVLVSFGAGYTWGGIYVKWQ
>Cag_1097 conserved hypothetical protein
MIDSREQKIIEYLKQAGACSSKELHEHIGIAVSYATLKRILSKLIAENFL
VSVGQGKGTKYTISPIFELLEPIDIERYYEKEVDERDIKEGFNFSLINGV
LVKYNVFTKAELEKLNKLQAKFEHNISQLTANEYKKEFERLAIDLSWKSS
QIEGNTYSLLETERLLKEKQTASGKTKEEATMLLNHKDALDFIIEHPDYL
YPLSVSKIEDIHTILTKELSVERNLKKRRVGISGTNYCPLDNEFQIAEAL
RSSCDVINAKSSVFEKALLALVLISYIQPFMDGNKRTARIVSNAILINER
HCPLSFRTVDSLDYKKAMVLFYEQNNISHVKEIFINQFEFAVGTYF
>Cag_0483 Ham1-like protein
MTDTITIILATGNRDKVKELRPLLEHISPIITVVTLPELGVSVDVEETEE
TLEGNALLKARAIFSILENRFPFLIALADDTGLEVAALDGAPGVYSARYA
PTADGTAPTYSDNVNHLLKNMAGKEERSACFRSLIALKGRIPTGDSKAFA
FEQTAAGEVHGSITREPFGDGGFGYDPVFYVEATAKTYAQMSIAEKNSMS
HRARAVQQAIADLRTLFAEHHLQLTPSTEQT
>Cag_1528 restriction modification system, type I
MEKIESIKTLYQQSHNNLETLYNALSQKAFKGELDLSRVAVPEETADGKR
RD
>Cag_0211 ATP phosphoribosyltransferase
MSNENKVLKLGLPKGSLQDSTIDLFAQAGFHFSVQSRSYFPSIDDDELEA
ILIRAQEMAHYVELGAFDVGLTGKDWIIETDADVVEVADLVYSKASMRPV
RWVLCVPESSSIQSVKDLEGKHIATEVVNITKKYLAQHGVNASVEFSWGA
TEVKPPDLADAIVEVTETGTSLRANKLRIIDTLLESNTKLIANRQSWEDP
WKREKIENMAMLLLGAINAHGKVGLKMNAPKASLEKLMSIIPALRQPTIS
ALADAEWVALEVIVTEKIVRKLIPELKRAGAEGIFEYNINKLID
>Cag_1765 peptide ABC transporter, periplasmic peptide-binding protein
MVMNSILKKTATFRRLVGVVATTACTLLTACGGSNSKSTAQVGTPAMDST
LVIAMLGDADYLNPVLAGTVTSSNIVGLMSPSLLQSEFDTTTGLLNYMAL
EKKLRAGGSSSKMPQGAVAKSWTMSADHTTLTYTLRSDAFWSDGKPVVAE
DFKFTYQLYGNPLIASARQQYLAELVGADKGQIDFNRAIEAPNDTTLIFR
FYKPVPEHLALFHTSLAPLPAHQWKGVKAEEFRQSPLNLQPLSAGPYRLT
RWTQQQEIVLGANRTSTLPKPGNIPTLSFRVVPDYTVRLAQLQTGAVDVV
ENIKPEDFSTLKQAKQPVDIKTLGLRAYDYIGWSNIDFTEYRKNHRIKPH
PLFGSPTVRLALTQAIDREAIIDGYLREYGVVCNTDISPSLKWAYNNKIT
PHPYDPAKAKALLAADGWKYGADGILQKQGKRFSFVLHTNAGNARRNYAS
VIIQQNLRAIGIECKLEVQESNVFFENLQQRKLDAWLAGWAIGLEIDPLD
TWGSNLEKSRFNFTGYQNPRIEQLSALAKQKMEPTGARPYWLEYQEILHR
DQPITFLYWMKETHGFSRRIQGAELNIAGAFYNLDDWKLQPSASIQ
>Cag_0995 hypothetical protein
MSTMDTTTNTQELNRIISKLHTERVRYDCIETLGKVNEKEALLEASRYFI
KATKKQITEAMQEIEAIKGKKYNSSFNPRAREGRDPKFFKCS
>Cag_0615 Outer membrane protein-like
MSQQIFLTINQLVIMKQHQNISTMGGKIIAIALLAPLFGFSQPSTSKAAE
GDSAPSQATLAPAISAADMQASPSVQTAPTIVAAPQVPTASGLRLQQFLA
SVVDNNDEIKVQKLEWLSNERLLKASRGMYEPVLKVSATRESNHMQNTAQ
EYLQTYSQHYEFSEANNIWSSSIEGLTPFGSTYRLGYDYKKLQNSLQSAM
AVPTDEEYVTFLGLTLTQPLLKGSGQEATNANIRISRANADIAYEGYRQA
SVEAVARAVQLYWQCYGAQEKLAMRQRSATIAEELLQANKSRYEAGKVDY
TAVLDAESGLRLRQALVAAAEQTELTSRKNLLSLAGESAMAQVPATIRME
DVPDCSPLSPDYKQVYEKALTSYPQYLSALATVERENFRATYAHNQEKPQ
LDVKGSYGYNGLGTTVDNSLDRLGSTDFPSWSVGLELTFPLIGDMKSRNE
ATAARLKKEQAIRRLEMQKIELSNQMDIVAGLVSRVYSQVQNYEKVVAIN
AELVRIEDTRFKLGKSDTRMLLEREEEYLKVSESLLDSRLAYQYALVNLY
ALEGSLLTRYGLTLSDKTSATTLTQGM
>Cag_2003 fic family protein
MKFEEFTAGYWQQRYQYKSFEPSLINHEWTWDEPTINTLLEQANCALGEL
NAFSLIVPNIDLFIQMHVVKEAQTSSKIEGTQTGIDEALLSEEQISPEKR
DDWREVRNYIDAVNSAITTLHDLPLSNRLLKQTHKILLSGVRGEHKLPGE
FRVSQNWIGGSNLTDASFIPPHPESVAELMSDLEKFWHNQDIAVPHLIRI
ALSHYQFETIHPFLDGNGRIGRLLIPLYLVSHGVLAKPSLYLSDFFERHR
SSYYDALMHVRTSNNLIHWLKFFLNGVAQTATKGRDIFQQILTLREEVEQ
AVLSLGKRATLAREALHLLYRQPIVEATDFSTMLKVSAPTANALIQALID
KAILVEITGQQRGRIYSFERYVKLFME
>Cag_1632 Ribosomal protein S21
MVSVQINDNETIDKMLKRFKKKYERAGVLKEYRANAYFVKPSIDNRLKRS
RSRRRAQRANEERNS
>Cag_0186 glutamate synthase, large subunit
MNAMQKAGLYDPQFEHDACGVGFIAHMKGIQSHDIVEQGLRINENLKHRG
ACGCEKNTGDGAGILLQMPHKFLRKVCRDVNIDLPTDNRYGVGMVFLPPD
ISQRRAIEDICRQMIQVEGLKFLGFRKVPTNSESLGQTARSQEPVVKQLF
VGWGKKTTSELDFERSLYIIRRRITKRVKYTAGLLGSSYFYFASLSSRTI
VYKGMLLPEQVGAFYPELHDSDMESAIAMVHSRFSTNTFPSWDRAHPYRF
LSHNGEINTLKGNVNWMKAREKNIESEIFGTELETIKPIILEDGSDSGIL
DNVFEFLVLSGRSMAHAAMMIVPEPWSGNKQMSPEKRAFYEYHSCLMEPW
DGPASVTFTDGVQIGAVLDRNGLRPSRYYITSDDLVIMASEVGVLDIAPE
KIIRKDRLQPGRMFLVDTAQGRIISDEEIKRSIAAEQPYSEWIDRNIIDL
ESLPDHPRMKNPDADKYSITARQKVFGYTQEDVNLQIKVMSEKGVELVGS
MGNDTPLAVLSNKPKLLYDYFKQLFAQVTNPPIDSLREELITSTMVMLGT
EGNLLEPTELNCRRIRLPHPILTDEALEKIRGIDKPGFRALTLPIFYRVN
DGGKGIELAMHDIYRLAEKAAHDGVNIIILSDKGELDSEHAPIPSLLAVS
GLHNFLINAGLRTKVGLVLESAEPRTVHHFAMLISYGAGAINPYMAFETI
HSLTETGKVAFDEKTAIKNYVKAAVKGIVKTMAKMGISTIQSYRGAQIFE
AVGLNSQLVDAYFTKTPTRIEGIGLDVVAEEVHKRHECVFPRSGNKVDRG
LEAGGERKWRSNGEFHLFSPEALHFLQHSCRTENYELFKKYEGLIDDQSQ
HLCTIRGLMDIKFGDHPIPIEEVEPIETILKRFKTGAMSFGSISQEAHET
LAIAMNRLGGKSNTGEGGEDPARFKPDANGDLRKSAIKQVASGRFGVTSE
YLANAEEIQIKMAQGAKPGEGGQLPGSKVYPWVAKVRHSTPGVGLISPPP
HHDIYSIEDLAQLIHDLKNANPAARINVKLVSTVGVGTIAAGVAKAHADV
VLISGHDGGTGASPISSIMHAGMPWELGLAEAHQTLMLNNLRSRIVVEAD
GQLKTARDIVIATMLGAEEFGFATTTLVVMGCIMMRCCQDDSCPVGVATQ
NPELRKNFKGKPEHVETFMRFLAQGVREYMARLGVRTLTQLVGRSDLLNM
RKSVNHWKAQGVDLSKILHQPEVADSETRYCTIEQDHGIRESLDYTTLLS
ICKPALKKKTRVVSTLPIRNTNRVVGTIVGYEVTKAFGAEGLPDDTIHLK
FIGSAGQSFGAFIPRGMTLELEGDANDYIGKGLSGGRIMAYPCSSSTFVA
EENIIIGNVAFYGATSGEGYMRGRAGERFCVRNSGMTAVVEGIGEHGCEY
MTGGRVVILGSTGRNFAAGMSGGIAYVYDFDGNLEAHCNKDMVALTPLES
KEEFTEVRAIIERHTAYTGSTIGQLLLNNDALIHKHMVKVMPHDYRRALE
AMKEVEAAGLSGDEAVMAAFEKNIHDPARVSGN
>Cag_1991 Peptidase S41A, C-terminal protease
MFPQRESKPRHKQSQRNGWRIIQRMATALLALSLPTTTLAYPQAESQSFA
VVSSIELLSEVYRELAAGYVEPLDTALLMKTGIRGMLRSLDPYTTLLERD
DADELADITRGRYVGIGISLATLEKKLYVTAVNEESPAAAAGIRTGDAIL
AINEAKVANIAVDSLRTLLHGTNGSPITFQLERRGSAPRTTTVQRQSVPL
KSVPYYELHNNIGYIALDGFTTRSPHEVRSAWQSLQQQATANKQPLRGLI
VDLRDNSGGLLDAALEITSLFVPNGSEVVSIKGRSTHSHSTLKTTTEPLD
ATLPVALLINGDTASAAEIVAGALQDVDRAIILGERSYGKGLVQSVKKLS
YGNTLKFTTAKYYTPSGRLIQKELKKESSPHSTNADSKQALASAVPDTTQ
RFYTRNHRIVYGGGGIMPDVEIKEPASPYVTALRKRGMIFLFANEWYATH
SDDAPASSALLPSQTELLAHFEKFLQQKEFRYTSNAAKRLEELKSAMKES
GRENPEALRTMEREVELADTEERNREAKQVAVALESAILRHASEHLARQA
ELRHDALVLQAEELLIYPARYRAMLKASSTRK
>Cag_0156 ABC transporter, permease protein
MESELLLLLLQIARLSVPYVLTSVGATFSERGGVVNLALEGLMLAGAFGA
AYGEYLSGSPLAGVAMALLFGTAVALLFAFVTVTLKANQIVAGIAINLLV
MGATRFGLTLLFGSAMNSPRLEGFAAPFLLLDPLFLTALLSVGVGQWVLF
QTPYGLRLRSTGESAATADSAGVSVSSMRYSGVVVSGALAALAGAFLLFQ
QHTFTDGMTAGRGYMALAAMIIGKWTPIGAALASILFAAAESMEMWFQSG
VIPSQIIQTLPYVVTLVVLAGFVGKAQAPREVGVPFENGRGE
>Cag_0393 hydroxyneurosporene synthase CrtC
MNITTRPEEELWHNVSSAGAYEWWYFDAVDVESGISFVVIWFCGFPFSPS
YATHYEQWKRGAINHPPHPSDYSAFSFQCYEQGQELINFIKESDRSAFSS
NPSSVGVTFEQCSFTYQPDNDSYQLSIAFDFPARNRSVKASFCFAVQQRV
ALEQQDGNNGGKVPRHQWLLGAPYARVSGDMRLMNSGGQHLRTITVQGAH
GYHDHNLGELPMQEYMKRWYWGRAFSSRYYLVYYLIFYRNTAYAPRFFVL
LHDVELGSTTHYQPATLREEQLHHGLFAPLHSRQLFLSGNGASLTIEHHH
PLDAGPFYLRFPATISLKQEGQAVITLQGISEFLNPQRLNSHFFRFFTRS
RILRSNQPSLMYNVYNRFKQIVG
>Cag_0216 response regulator receiver domain protein (CheY-like)
METIAQNSFECHKEHADAKYIKHMKIVNQNIVHVWVETEYGTEIDLEHMD
IDLLQSLLPKIQSPDKKLHILYNLTNIKSITYNYKYAITKLLFHLSPSIG
VIAFYNISPSVNAIIQIFEAITPKGIIILQEKTYDDALTKIMQYNDGKPP
TSSHAMLNNNDAQYIRCFLEGCARLSWLNIYDQPITIPPEGHPLQLLFQA
FDSLREDLIVKEKTQGIEIARLRQEADRAIQDKIIVLKAKQELYAKKIQE
KEEEIATLAKQIAHQENVKSYSIDHQKKHRKEIASLLSLINSLDINDSAK
EKLQAACNKILKDEKELYNKISEVDKFYIQRVKEKFPILSDKELHTCLLI
TKNFNSKEIALEMGIAERGVETFRYRIHKKLTLPKNQSLKNYLLSLTL
>Cag_0569 GTP-binding protein Era
MTPSHFASGFAMIIGQPNAGKSTLLNALLDFKLSIVTPKPQTTRKKITGI
YNSERCQIIFLDTPGILKPRQKLHESMLGVVRSTVTDADVLIALLPYTGT
AELFDRSFAAELFTEWILPSGKPVVAVLNKSDLASQEEQKAAEAFVWEQW
KPTNVLSVSALKRKNLTPLISALYPFLPLTEPLYPDEALSTAPERFFVSE
LIREKIFMLYGAEIPYSTEVVVDEFREQHDDDPSRKEFIRCSIVVERDTQ
KQIIIGKGGAALKKLGQLARKAIEEMLGRPVYLELFVKVRPDWRKKSGML
KSFGY
>Cag_1899 3-isopropylmalate dehydratase, small subunit
METIIQGKAYVLGKNIDTDQIIPAEHLVYSLSDPKEVVLYGKYALSGVPP
EQGGLPQGNIPFVREGEVTSDYTIIVAGANFGCGSSREHAPFALQVAGAK
AIVAESYARIFYRNCVDGGFVIPCESTESLAQAVSTGDELKIDVMANTLE
NLTTKKSYKLNPLGSVFSIVEAGGIFAYAKKENLMAQS
>Cag_0903 Peptidase M20D, amidohydrolase
MKQEESHHPIAEAIQHKAAELFPEVVALRRDIHAHPELSLQEHRTTALIT
SYLMQLGITPEKPLLDTGVIALIRGTSPHHHGKVIALRADIDALPLQEEN
STDYCSIEAGKMHACGHDMHTAMLLGAAKILSGMKEQLAGDVLLIFQPSE
EKAPGGARPLLDAGLFATYKPILILGQHCFPTIECGSVAFCRGAFMAAAD
ELYITVNGKGGHASAPHKAADPVLAAAHMVTAVQQLVSRVVPPHEAAVVT
ISAINGGHATNVIPRQVTMMGTMRSMNEEVRAILQERLQQAITHTAQAFG
VEAELTIVKGYPVLYNNQTITDQASCICAEYLGHHQVQHCQPLMTAEDFA
YYLQECPGTFWQIGTGVREGETANTLHSPTFNPNEEALQVGTGLLAYNAY
RFLASLHGE
>Cag_0824 Dhh family protein
MIIPSYGRTLHAEEWQPLLEPLLAAQHLVLTTHENSDGDGLGCEVALALA
LTALGKEVSIVNPTEVPPNYQFLRQLYPIVQFNPKSEEAIQELSLCDAVV
LLDANLSDRMGTLWPHVRFARELGSLKLLCVDHHLEPNDFTDVMISESYA
SSTGELVYGLILAMEQSVGRALFTPNIAQALYVAVMTDTGSFRFSKTTPY
VYQLAGDLVARGANPEKAYDLIFNSLTPQALKLLGLSLSAISLVEGGKLS
WLLISQEMLKATESKLFDTDIIVRYLLSVPSVAIAVLLVEMQDGRTKASF
RSRGKLPVNKLAKEFGGGGHMNAAGALFPYTPEKVQQVLPQAVRRFIKEH
EALL
>Cag_0799 oxidoreductase, FAD-binding
MKPANTAPREIPFNYTSASDRQAISFLLGHDVVRMLDELRERRVTGRSAR
LLMGIIGEILIHRRNPYLFQELVESVPRRRRLFERAFREVDTIAEAANGD
ARVTTIVAAVREQLQRFHVAVEKTPELRRRMKRELGAVVGVKNVLFDPFS
LSAHTTDATDWRLYVPVAVVTPDEEAQVAPLITAIAALGLFAIPRGAGTG
LTGGAVPLRPGCVIINMEKLNHIRGISERDFQLEDGEISPASVIEVEAGV
VTEKAMEAAEEHGLVFATDPTSEWSCTIGGNIAENAGGKMAVRWGTCIDN
LLEWRIAMPSGENWTVRRVDHRLRKILHEDSVTFEVINDAGKRLERIELR
GTDIRKKGLWKDITNKALGGVPGLQKEGTDGVITSATFVLYPKYPHKRTF
CLEFFGPDMDEASKVIVELSKIFPHQVEYCGVLLALEHFDDEYIRAIDYK
VKAPRAQTPKAVLLIDLAGNKADEVQAGLAKVEALLAEHPNTLMFVANDA
AEAVRFWADRKKLGAIARRTNAFKLNEDIVIPLLALAEFARFVDKVNIDE
ERYAQRRFVARARDVLATAEVKGDGGRFASKIPAGIALCDAFDARIVAEP
EELLRTLAIIQELASELGELVQGYPDMQSALDNAYQQVRDRRIVIATHMH
AGDGNVHVNIPVLSNDRPMLERADKVVDHVMEKVVALGGVVSGEHGIGVT
KLKYLDAAIVDELSRYRRKVDPKGIMNPGKLEDYEALGHIFTPSFNLLEL
EAHILQRAQIEELSKKVDYCIRCGKCKTDCCVYYPARGMFYHPRNKNLAI
GSLIEALLFDAQRERSTDFALLQWLEEVADHCTICHKCLKPCPVDIDTGE
VSVLERKILSEWGFKHTSPVTDMTLHYLDSRSPITNALFRSIVLRMGGAA
QRVGHKLTQPLQPKLDPPALYPLRLLRSAVPPVPQETLRDVLPDCEADQV
LLFEPHSGEATANLFYFPGCGSERLHSTISMAALHVLLQQGCRVVLPPPF
LCCGFPAHVNAKSDQYSSIVLRNTVLFSQIRETFSYLDFDACIVTCGTCM
EGLDAIETEKLFGGKILDVASFVKSSGLTLKDEEHGEGEYLYHAPCHDSL
NGKARETLKAIGGFGSVKAIPHCCSEAGTLALSRPDITDSMLHRKRDAIM
EELPNGEKRIMLTNCPSCVQGLGRNRDLGIEPKHIAVALAEKISGATWLD
IFKKEAAQAKAIRF
>Cag_1359 Polyferredoxin-like
MMQAKNKKTHIGVWKRRRRIAEIAQAILLTAIPFITINGHSILRFDIPTL
KLYFLGTVFWIRELYLLVGMVLIFLLFIGFITAIFGRIWCGWLCPQTVLL
DLSQTIARFIGKKQEQLVQRILLIPFAALIAFTLICYFVPPAETLQSLFS
APIITAFFVALWIALYLELAFLGRGFCTSICPYAMLQNMLFDKETLVIEY
DISRDSTCMKCDDCVTVCPVGIDIKKGLNSACIACAECIDACKTINDKRH
VPPFPNYKGTILRAKSLWLGGVTALAAIGLAALIWSRPPFDVVVTRNAQP
LPPGINRYSLTMYNNSSKPIEVELSSPDNVMLLGNHTFQLAPFGSINSSV
MVKASSKNMPDQIELRFNGNDSSAIREVGFF
>Cag_0534 Hydrogenase accessory protein HypB
MCDSCGCSGDGGMVIRKPSQAEEHHHEHAHHSHHHADHHHSHEHGEEHSH
SHSHEHSHARKVVVEQDVLLKNNLLAERNRGYFDARSIYAINLLSSPGSG
KTSLLERLIPALQQHFPVAVIEGDQQTTNDADRLHALGISAIQINTGSGC
HLDADMVNRAIKQLELKDHSLLCIENVGNLVCPAMFDLGEALKIVVISVT
EGEDKPLKYPAMFHAADVCIVNKIDLLPYVDFNVERCKEYAMQVNHHLQW
FEVSAKSGDGLAALQEWLEHQVVLH
>Cag_1400 chloride channel, putative
MNILSHKKGRLARRIIAFFYILFRRSRYFKGSSQNFIKLTLEYILVQLNL
NQDIPFLFVAVIVGLVTGYVAVLFHEAIKAISNFSFNDLRLLGDISFIEQ
YWVFFLPFIPAIGGLFVGLYNTFIIKKSSRHALASVIKSVAHNDGIIDRK
LWFHKTITSVVCIGTGGGGGREAPIVQVGSAIGSTIAQWLRFSPEKTRTL
LGCGAAAGLAAVFNAPIGAVMFAIEVLLGDFSVKTFSPIVIAAVIGTVLS
RSFLGNRPTFDVPDYTLVSNIELLFYCVLGVLAGLSAVMFIKTYFAIEEW
FDKLQIRRNLPVWIMPAIGGFLSGIICIWLPGLYGFSYNVISNAVYGNET
WYNLIGIYLLKPVVAGLSIGSGGAGGMFAPAMKMGAMLGGMFGIVVHQFF
PLITATSGAYALVGMGALTAGVMRAPLTVILILFEITGQYEIVLPIMFAA
VTSAVVARLAYRHSMETYVLEKQGIKVGFGIALSVAEQVVVSDILDKKRT
QFVSTTPMKKILEVFYSTPETNFLIVDKQGVFIGNISLDDIRILLKNGCN
DDLIADDIVNKNVPVLYTNSRLDEALKLFELSDYDILPVLDTKNNILQGV
LRQEKAFASYRKQLNLYGSDYSDKSVHQGVK
>Cag_1952 DsrE protein
MKIGILLKEGPYNHQASDTAFKFAEAAIKKGHTIDAVFLYNDGVTNVTKL
MDPPQDDRHIAGRWSELSKQHGVEILACIAASKRRGINDDVLVEGSSITG
LGTLTDIAIRNDRLVTFGD
>Cag_1470 phosphatidate cytidylyltransferase
MTKLLPNNLIQRIAVAVVGIPLLLWLVVEGGFWFFGFVLLLSLLAVSEFH
HLAARKAHPPALWVMLLVTALWQTNGYFHLIEVWELLLAALLFLMVIALF
LRNGSPLLNIGSMLPGLLYVNLSFGALLRLRLAAPEERGVTILLLLFICV
WAADIFAYFGGSALGGKFIHRKLFERHSPHKTWEGFVLGLFGSVGAAYLA
ALFLPNLPLLFTLLTGVLIGVASPIGDLVESMFKRDAGVKDSSTLIPGHG
GVLDRFDTIMFVAPILYFLTFYYL
>Cag_1699 hypothetical protein
MDSFCQQLFSCFSLTTRRTMKPTFHFTLICCAVALLASNQQTFAGNDVLH
QNSASLQSSRLQNEPQRAPTPSVIAFQPNQKSVTGTIGNDRITLYGNSGL
ASTATPVRDPAISRSQTFSSTVAIDLNGTILPGTAAETPTVHFYINGKDY
GLATLSSVQSDYSKKIGGIAHSGKQRFIFPVDDIDIRTIKIEIESPAVLR
SEVYIYGVTITPEGAADQKVEPSSLRGATVTFATPSRYYDGGKSYKIPYG
SIPSDVRSITIDTSLYRKTLQQAPGTPANPLTVHGGGGIDTLYLLGNQDQ
YVLAGGKNSSLIIAESAGLSQNALATNIAKVEFADSSFFLPPQATGINEA
VLAENEASIATLRATLPPHVGMAVHPIKHDPLRGKIGDVLSKNCPTEFYQ
NAVRLTAPQGDAPMALRSLGFVIAPARRDAPTIKLQGAAKEQEMVVVDTA
TLPETSLIEAQRVNVVLLSGKTPITFRGNGDGMVILADEGNQEMWGGTGD
DLLWGGVGNDNLYGGIDDDLLCGGSGDDVLDGGSGIDAAYFSGKSEEYRI
THHPTTSMTTVVDLVAERDGTDNLFNIEQLRFADRTMFLGDKK
>Cag_1494 conserved hypothetical protein
MSQLPQLSGRELVKVLCKMGFIVKRQQGSHIVLRREEPWAQTVVPDHKEL
DRGTLRAILRQTDITVQELLALLKDN
>Cag_1936 conserved hypothetical protein
MQILDLSHTIEPTMPLYLGTPSPSFQPIASIAHDGFAEQLLTFSSHTGTH
VDAPSHLFKQGATVEAMDVSRFVGRAVVLDVRSLLGEEIGLELLLPHEAL
VRECQFVLLYTGWSCFWGKEAYFGHYPCLSLEAAQWLTSMELHGIGVDAL
SVDSADSHELPIHRILLERGMVIVENLRGLEPLLHQRFLFSALPLKLAGG
EASPVRAIAKVDGVF
>Cag_1160 Pyruvoyl-dependent arginine decarboxylase
MSFVPTKVFFTKGVGRHKEYLSSFELALRDAKIEKCNLVTVSSIFPPKCK
RISVEEGIKELSPGQITFAVMARNSTNEFNRLIAASVGVAIPADDTQYGY
LSEHHPFGESAEQSGEYAEDLAATMLATTLGIEFDPNKDWDEREGIYKMS
GKIINSYNITQSAEGENGMWTTVISCAVLLP
>Cag_0632 TPR repeat
MVQSSNEGGGVSATAPYYTSYKQALAYVDEQRYEEALQLFDHCIAIERRH
AALLYGRAVTLLALGTYRQACCDLFKSLALDKAQPEAWKHLAYLLFMLGK
DEPAEKTLKKALERFPDYAPLYCVLADIYLDLGEFDKAHEAIEQALRLDP
QNPEPHSKLAMYYVARGNMEGLQQECKTLEQLDAALAEQIRTLFFENQ
>Cag_1692 5-methyltetrahydrofolate--homocysteine methyltransferase
MGTMIQRHKLKEEDYRGSRFADHSHPLLGNNDMLVLTQPDIIYTLHCAFL
EAGSDIIETNSFNANPISQADYSAVELVHELNVEACRLARRAADEFSAKT
PNKPRFVAGSIGPTNKTLSLSPDVNRPGYRAVTFREVVDNYLVQLEGLRE
GGVDLLLVETVFDTLNCKAALFAISEIFERTGWRVPVMVSGTVVDASGRT
LSGQTTEAFWSSVCHLPELLSIGLNCALGSKQMRPFIEAMANVAESYVSV
YPNAGLPNEFGQYDDSPEYMATQIADFATSGFVNIVGGCCGTTPDHIKAI
AEAVQTISPRKRPTKKHELLLSGLEPLVVNHTTGFINIGERTNVTGSKKF
ARLIKEGNYDEALSIARQQVENGAQVIDVNVDEGMLDSEKVMKEFLNLIA
SEPEISRVPIMIDSSKWSVIESGLQCVQGKGIVNSISLKEGDELFRERAR
KILRYGAATVVMAFDEQGQADNLQRRIEICQRAYNILVNEVGFPPEDIIF
DPNVLTVATGIEEHNNYAVDFIESVRWIKENLPHAKVSGGISNVSFSFRG
NEPVREAMHAAFLYHAIRAGLDMGIVNAGQLAIYEDIEPELLVRVEDVLL
NRRADATERLVSFAETIATGGEKVEAKAAEWRNLPVAERLRHALIKGIVE
HIEEDTEEARQLYPSPLQVIEGPLMDGMNAIGDLFAVGKMFLPQVVKSAR
VMKRSVAHLIPFIEAEKAKNKDTSAQAKVLLATVKGDVHDIGKNIVAVVL
ACNNYEVIDIGVMMPCEKILEAVEREKADLLGLSGLITPSLDEMVHVAAE
MERRGMKIPLLIGGATTSRMHTAVKIAPVYSGAVVQVLDASRSVPVVNSL
LNPALSPKYIADLKNEQAGLRESHAAAQAARNYVSLADARNNAAKVEPVA
VQPKKVGLTLFEDVAVAELRAYIDWTPLFMTWELHGRYPQILSDAKYGSE
AQKLLNDANALLDRIELEKLLGVKGVVGIFPANSTGDDIAIYTDESRTKV
LTTLHTLRQQQEKANGEANVALADFIASADSGVADYLGAFAVTTGLGIAK
TLERFVAEHDDYHRILMQTVADRLAEAFAEMLHECVRREVWGYEEVCSKT
SSCCACHAESATTSQANHLEELLAGNYQGIRPAPGYPACPDHSEKAEIFT
LLNAEVNTGITLTENFAMNPAASVSGLYLAHPEARYFMLGKIGKDQVEDY
AQRKGINVAEAEKLLATSLNYQAG
>Cag_1830 Peptidase M24A, methionine aminopeptidase, subfamily 1
MITIKSDREIELMREAGALVSQVLDLIEHTVQAGMTTKQLDVLAEEYIRS
HNAVPSFLNYVPKSDPHVTPYPATLCVSINEEVVHGVPSKKRVIKEGDIV
SVDCGVYKGGYHGDSARTYIIGSVDCKVQELVDVTRESLVKGIAMAVDGN
RLHDISSAIETYARSFGFSVIENMVGHGIGSELHEDPPVPNFGRPHTGAK
LRSGMTLAIEPMIAMGRSKKAVSRRGGWVAVTEDGKPSAHFEHTIVVRPK
KAEILTISKN
>Cag_0918 cytochrome c family protein
MMRKKVISATVALLATVASFAHAEEDARLIKARALADQLPPKLGAVLKAE
IAKSGPEGAISVCRDEAPKIAKELSKESGTRIRRVSLQQRSCKAKPDKWE
KAVLEEFDRRAAAGEPLTTLEKGEQVGSKYRYMKALPVQDRCLNCHGSLE
TMKPAVKAALEQHYPKDKATGYREGQIRGAISVRL
>Cag_1083 PhoH family protein
MQQEVSIEFEGIEPVIIFGPYDSLLKKLRHEFSDVQITARGNHISLRGGA
DEVALLERIFQEMIALANKHGEVLDSDLNTLINLGLSRPNLPEITRSGDG
DIIVSTPDSTVRARTDGQRRMVAEARNNDILFAIGPAGTGKTYTAVAIAV
AAWKAKRVKRIMLARPAVEAGESLGFLPGDLAQKIDPYLRPLYDALQEML
SAEKLKLLTEQRVIEIVPLAYMRGRTLSNAFIILDEAQNATNKQLKMCLT
RLGTNSKAIITGDITQIDLPRYQDSGLTNAPAILNNIKGIGFVYLDKSDV
VRHRLVRDIIDAYEHYEQT
>Cag_1086 Histidinol-phosphate aminotransferase
MNTPYSIEHLLNPALRNIATYKVEGGQQAEIKLNQNESPFDVPQWLKEEI
IGEFIREPWNRYPDILPYRAMEAYANFVGVPAECVIMSNGSNEMLYTIFL
ACLGPGRKVLIPNPSFSLYEKLALLLQSDIVEVPMKSDLSFDVEAIMKAA
HNEAVDVIVLSNPNNPTSTSMSYDAVRKIAESTQALVLVDEAYIEFSRER
SMVDTIEELPNVVVLRTMSKALALAGIRIGFALANAPLMAEISKPKIPFA
SSRLAEITLMKVLANYRLVDEAVSAILSERDALYEQLRMMEGVSPFASDT
NFLIVRVADANATFKRLYDKGILVRNVSGYHLMEGCLRCNVGLPEENRRL
AEAFAELSVEVKG
>Cag_1361 conserved hypothetical protein
MQERGEDKSDWKQAALLTSGEIEAAVASDEDEAGMVVDWSNVSVELPRPK
SVLNMRIDYEVLEFFRSQGKGYQKKINAVLRSYMEKQKHSAMS
>Cag_0331 conserved hypothetical protein
MNPANAHIDNQLRQVLERHPLIRLAILFGSVAKGTAGFESDLDLAVGAAR
PLTVQQMMALIEDLVEMSGRPIDLIDLSTVGEPLLGQIIAHGRRIIGSDT
DYVNVVLKHIYNEADFVPLQKRILKERREAWIGK
>Cag_1777 PIN (PilT N terminus) domain
MNKRFLLDTNVLSELMKNNPEKKVIQWFDAHQHERFFTSSITKAEILLGI
ALLPKGRRQKQLYEATFKLFSTTFLEYCFPFCERAAVVYSDIVAHQKRIG
RGICTEDAQIAAIALTENLTLATRNVKDFHFIQGLEISNPWHG
>Cag_1140 conserved hypothetical protein
MNEHSHHRPLTKIGILLMLIGRYLFAGFFIYGFWHKLTRGWLSSDITQRH
FIKRLGELPIDSWQAAYLEYFAIPLAMPIAWIVTVGELLIGVSLVVGLMT
RINAGFALFLLLNFAAGGYYNLSLPPFIIFAVVMMLLPTSHWLGMDKQLH
NRFPHSIWFR
>Cag_0989 ABC transporter efflux protein
MALYQIKANKLRSFLTALGVIIGIVAISMMGTAINGIERGFDKSLAMLGY
DVVYVQKVSWSGMGEWWRFRNRPDIKTDYASQINRIIAETPNSELLLAVP
QRSTYQASATYRERTLEQVFALGSTSDYLATASGTLTAGRFFTASEAASG
AMVCSVGNDIAESLFPNEPALGKTIKLKNRKFRIIGVFRKQGKFLGLFSF
DNQVIMPLTAFERVYGKDQFTSIRIKIKSVEKLELAKEELLGIMRRLRHL
PPNKEEDFSINEQQAFKSQLDPIKNGIALAGIFITGMSLFVGAIGIMNIT
FVSVKERTREIGLRKALGARRRTILLQFLIESVMICLAGGMSGLVVTLFI
TLVAGMVAPDVPLSFSPSLLMLSLALSVATGIISGIAPAITASRLEAADA
LRYE
>Cag_0610 conserved hypothetical protein
MKKTAKLLSLAVALFAGVSGTAQAEGFKLGADVVSSYVWRGTQVTTSPAI
QPALSYTFKNSDIVVGAWGSYAISEHTGAVANQETDVYVTVPVGPVSVTL
TDYYNQTATSRTFDFSDDSNNIVELSVAYAKDNVSLMGAMNVAGTDTDNA
MYLEAGYKFYEKDGYTAKACLGAGNEAYTSDQDFTLVNTGISVSKDRYTA
SCIYNPDTEASSFVFMASF
>Cag_0927 Beta-phosphoglucomutase hydrolase
MSSSFKGAIFDLDGVITGTAKVHSLAWESMFNSFLQNYAEANNEPFVPFD
PIHDYHKYVDGKPRMEGVKSFLFSRDIELPFGELDDNPENETICGLGNRK
NSLFTEILEKEGPEVFSSSIELIEQLIERGIKIGIASSSRNCQLILRLAN
LEYLFETRVDGEVSIHLGLKGKPNPDIFVVAAKNLGLEPHECVVVEDAIS
GVQAGARGNFGMVLGIAREIEGARLIEQGADIVVNDLGEITPEDIEEWFT
KGLEFEGWNLTYTEFSPKDEKLRETLTATGNGYLGVRGAYEGSKSSHNHY
PGTYIAGIFNRVPSLVHGQTIYNNDFVNTPNWLPIEFRIGGGAFIDPFRQ
KILSYRQNLDLRSGLMERDLVVQDNLGRITSITSSRFASMANPHQCAVKF
TLKPVNYSADIEFRCSIDGTVQNRNVARYSELTSDHLEHVEAQHNGATML
LHVRTSVSKYEIVTAAKTRIIMHGKEVTAERQPLHSNRFIGEQFTLSLGP
SKGCTIEKMVSIYTSLDQNSTSPLTAAKTSLQDCSSFDELLTPHVAAWEA
LWEKANLQIEGDRFSQKVLRLHTYHMLCTASPHNASIDAGMPARGLNGES
YRGHIFWDEIFILPFFNRHFPDISKSLLLYRYHRLDAAREYARENGYKGA
MFPWQTADDGKEDTQSVHFNPKSGSWGPDHSCLQRHVSIAVFYNTWRYIY
DSDDTTFLNEYGAEMMFEIARFWASIATLNPATGRYHIEGVMGPDEFHES
VPGSKKEGLKDNAYTNVMCVWLFEKATEIAAKLSPEALERLQKTIGYTPE
EAEEWHKIGHHLNVLIDHDGIMEQFDGYKSLKELDWKHYRTKYGNIHRMD
RILKAEDDTPDNYKVAKQPDVLMMFYTLSPGEVAELLTKIGYMVPDALTL
VRNNYAYYEPRTSHGSTLSKVVHSIISSYLHDGRDMAWSWFLDALKSDIN
DTQGGTTHEGIHCGVMAGTLDTVARYFAGIAFYNEKLNVHPNLPDQWKKL
SLTVCFRANRYALTIEKAAITVTLLESDSNEAPACIAGHHLTLQKGVPYN
SALHA
>Cag_0477 D-alanyl-D-alanine carboxypeptidease, putative
MRSFLTTWRYRLVLVLLLCNTLLPWQTFAEQPRFAPLDALMQQAVRDSVF
PGASIAVLHRDKVVFHKGFGRHTYQPSSTVVDTTTIYDLASLTKVVATTN
MVMQLVERDSLKLHEPVATYLPTFAQRGKDRVTIEQLLRHTSGLRAHEHY
GETCKTANGVFNTIYDDTLLSAPGSVTRYSDLGFMLLGKIIEQQTGASLA
ANFNQRFAKPLGMANTMFTPPTFLYDRIAPVEADNNWHLTTTRPLVHDQN
CALLGGVAGHAGLFGTTNDLIGMVRMWMNEGKVDGKRYVKAATHRAFTKQ
ENTARALGWDKRSAQGYSSAGTRFSMESYGHLGFTGTSIWIDPKQELAVI
LLSNRVYPSSENIKIRAFRPRLHNTVVECLAKDGK
>Cag_0874 conserved hypothetical protein
MNYQQLLALFKETHCELQHKAARSVDTALVIRNWLFGWYIVEFEQGGAER
TVLYGKSLINRLSQELKSLGLKGISPTSLKQCRTFYLAYEKIGQALPDQC
RGKLRISEVLQALPAKSENELPEIQQALPVISFEVLHNIPQIVHELSTTL
AGSFKLGWTHYVALLTISNADERSFYEIEAYHNSWGARELERQIAASLFE
RLALSRDKEEICQLAQKGQVVEKATDIIKNPFVLEFLGLEEKSSYSEHAL
ETAIINHLEHFLLELGKGFLFEARQKRFTFDNDHFYVDLVFYNRLLRCYV
LIDLKRDKLTHQDLGQMQMYVNYFDRYVKTDDELPTIGIVLCHRKNDALV
ELTLPKDSNIFASKYQLYLPSKEELKRELEEAAGIKH
>Cag_1538 Bacteriochlorophyll/chlorophyll synthetase
MSTTTVRIGFVEKIRAHIQLLDPVTWISVFPCLTCGVMASGAMQPTVHDY
LLLAALFIIYGPLGTGFSQSVNDYYDLELDRINEPTRPIPSGRISKKEAL
WNWSIILTIAIVLGGWLGIHIGGERGMLFFGCLLVGLLFGYIYSAPPLKL
KKNILLSAPAVGISYGVITWISANLFFSEIRPEVLWFAGLNFFMAIALIM
MNDFKSQEGDAKSGMKSLTVMIGAKNTFLVAFTIVDLVFAVFALLAWSWN
FQLLAYFILATLLVDLVIQFKLFTDPRQGLSFLNGAIDNSFGNAIGKSES
KEHNAFLRFSMLNNFLFLINQLIAAALIGLKYINSTHP
>Cag_1618 conserved hypothetical protein
MRVFKNKWFNHWASREGISDDVLFGAAKEIIIGNVEANLGGYLFKKRLPR
QGKGKSGGYRVIVGFKKQNNDRIIYLYGFSKSQAATISKKEEAALKMVSS
EFVAYSDEQISRLIQQQYIMEVLSNE
>Cag_1202 conserved hypothetical protein
MTPDEIDDQKKIEFYAASVSAWYESSLEHDKSLLTLSAGGIGLLITLLTT
VGLGTAEALVLYVGAIISFVISLVSVLFVFRGNKKHIEDILSGKNQGTDP
VLSKLDGTAIWSFGIGVVFTAVIGISAAIHSFTSKENTMANETTKTTQAV
PLHESFNGAANLQSGTDLGQSFNGAGNLQPQQTTQPATPSTTPANSGNSQ
NQSDKGK
>Cag_0637 NADH dehydrogenase I, 49 kDa subunit
MQELEKAVPSSVRVTRQSENLVILEKDLATEQMVLAMGPQHPSTHGVLKL
ECLTDGEVVTEAEPVLGYLHRCFEKTAENVDYPAVVPFTDRLDYLAAMNS
EFAYALTVEKLLDIEIPRRVEFIRILVAELNRIASHLVAIGTYGIDLGAF
TPFLFCFRDREIILGLLEWASGARMLYNYVWVGGLAYDVPADFLKRIREF
CAYFRPKAKELADLLTSNEIFVKRTHGIGIMPADVAINYGWSGPMLRGSG
VEWDLRRNDPYSLYSELDFNVCVPDGKHSVIGDCLSRHLVRAYEMEESLN
IIEQCIDKMPSSDGFNSRAAIPKKIRPKAGEVYGRAENPRGELGFYIQSD
GKSTKPLRCKARSSCFVNLSAMKDLSKGQLIPDLVAIIGSIDIVLGEVDR
>Cag_1181 hypothetical protein
MQPFEHIPKIIAVKALDAQHLMVTFEGNIIKRYDCTTLLAMQEFKLLKTY
AFFKAAQVDAGGYGIAWNDGMDVSGYELWKNGVLQ
>Cag_0429 transcriptional modulator of MazE/toxin, MazF
MKFGETMAYIPDSGDIVWIMFNPQAGHEQAGRRPALVLSPKAYNGKVGLA
LLCPITSQIKGYPFEILIPEGLEVKGAILSDQVKSLDWKARKAVFIGKLP
AEKYNKVVKNLIALIQ
>Cag_1909 conserved hypothetical protein
MVAVQHKQNAQKQSKPFAFWLSVATLSHFALLLALLLYQQFSNRQQEPPP
VVNVMLVSLPGRVGSAAVPAPTLEAPQQVPAEQAKEASVVKGSPSAVPTK
VPVATPPASTTKKVPEAQPVVDRQQQMNQALERLKQKVGKSASPSVTASP
SAAPSPLAPSSSNSLTNALAKLQAKVKASGQATTSAPSTTSPTTSPTAKS
GGNIAAASRTTGSGSGSGSPASYKAEVASIIQNNWAFSNPMLRGEGMEAY
VRIHVLPNGTISQIVFDRRAASEYLNNSIKRALEKSSPLPVIPQEAGGRD
MWIGFLFSPEGIER
>Cag_1446 isoprenyl synthetase
MSTISQEVVEKKYALYHTRINNALTACFSKQTPQTLYAPARYILEGKGKR
IRPFLTLLACEAICGSPDDALDVALAVEILHNFTLIHDDIMDQAELRHGR
PTVHKTWNVSAAILTGDMMIAYAYELALQSKGSRQTELVHILNDANITIC
EGQALDMELEEMPDASMSDYLDMIAKKTGRLISAALEAGGVAGNATDEQL
RSLVLFGEKIGQAFQIQDDYLDIMAEAGKSGKMAGGDIINGKKTCLLLRS
IELSSGTDRDLLLSIIANKGISAERVGEVKAIYQRCGVLDEIAALINSAT
EEALSAVDNLPYAEGRDHLKGFANILMKRDF
>Cag_0941 CAAX prenyl protease 1, putative
MNIFGVAIFITLFTTFLVKVISELLNLRAAASPLPSEYAALADNATRQKS
RDYLAATTRLSLFSAGFDLIALIIFWFSGSFNLLDQTLRSLGFNSIITGM
LYIGTLMLVQSIIELPFSLVRTFIVEEKFGFNKTTIGVFLGDLAKTALLS
IIIGLPVLAALLWFFESAGNLAWLWAWSGIVLFSLLLQYIAPTWIMPMFN
TFKPLLDNELSRAIMQYSAKVQFPLSGIFEIDGSKRSSKANAFFTGFGKR
KRIALYDTLIKAHPVPELVAVLAHEIGHFKKKHILINLLMSSANLALLFF
LLSLMMHNRQLFDAFFMEETSVYGSLLFFTLLYTPAELMLSVFMHAISRK
HEYEADAFAVTTYEQGSALADALLKLSHHNLSNLTPHPLYVFLNYSHPPV
VERLQRIKALLASQTQRSTPTL
>Cag_0870 ABC transporter, permease protein
MISLAGRDILHGWVRFIMTGFGLGLLIGVTLSMAGIYRGMVDDANVLIEN
SGADLYVVQQGTIGPYAEPSSIYQDSYRSILGMEGVARAANVTYLTMQVK
GNGSDVRAMVVGITPGSVGEPGQPTFLVAGRHITASHYEAVADIKSGFAL
GDTITIRRHHYKVVGLTRRMVSSGGDPMIFIPLKDAQEAQFLKDNDMLLQ
QRRRTAANPQLNPPANPSVLDAVIASQTTNLSVNAVLVQLKEGYLPETVA
RSLERWQRFEVYTRAEMQQILVEKMIATSAKQIGMFLVILAIVSAAIIAF
IIYTMTMAKIREIAVLKLIGTRNRVIASMILQQALGLGALGFVVGKIVAT
LWSPFFPRYVLLLPEDSIRGFMIVMVISALASILAIAAALKVDPAEAIGG
>Cag_1980 peroxiredoxin, putative
MHDDELYYDISMPLLGDDFPEMEVQTTHGVMNIPGDLKGSWFVLFSHPAD
FTPVCTTEFVAFQERIAEFDKLGCKLIGMSVDQVFSHIKWIEWIKDTLNI
EITFPIVAANDRIANKLGMLHPGKGSNTVRAVFVCDPNGKVRLVLYYPQE
IGRNMAEILRAVTALQISDNNKVAIPADWPNNSLIKDEVIIPPAKDVAEA
QKRKNMGGCYDWWFCHKPLDK
>Cag_1726 conserved hypothetical protein
MIEAIQHIIDNPVASLLVIGNLILIESILSVDNAAVLATMVMDLPANQRA
HALRYGIIGAYLFRGLCLLFASILIQIWWLKPLGGLYLLYLVLKWLHGKK
TPQPDDDLLDKKSHPLYQMTVGKLGIFWSTVLLIEMMDLAFSIDNVFAAV
AFTENLLLIYLGVFIGILAMRFVAQGFVGLMERFPFLETSAFLVIGVLGV
KLLLSSLSHFYPHSPFVALLESHETDALVSLTTVLIFVIPLITSALFNLP
ARQRS
>Cag_1893 Heat shock protein Hsp70
MGKIIGIDLGTTNSCVAVMQGSQPTVIENSEGNRTTPSVVGFTKTGDRLV
GQAAKRQAITNPKNTIFSIKRFMGRRFDEVGEEKKMAPYELINDSGEARV
KINDKVYSPQEISAMVLQKMKQTAEDFLGEKVTEAVITVPAYFNDAQRQA
TKDAGRIAGLEVKRIINEPTAAALAYGLDKKNASEKVAVFDLGGGTFDIS
ILELGEGVFEVKSTDGDTHLGGDDFDQKIIDYIAEEFKKQEGIDLRKDAI
TLQRLKEAAEKAKIELSSRSDTEINLPFITATQEGPKHLVINLTRAKFEA
ISADLFNKVLDPCRRAVKNAKIEMREIDEVVLVGGSTRIPKIQALVKEFF
GKEPNKSVNPDEVVAIGAAIQGGVLKGDVTDVLLLDVTPLSLGIETLGGV
MTKLIEANTTIPTKKQEVFSTASDNQTSVEVHVLQGERPMAADNKTLGRF
HLGDIPPAPRGIPQVEVIFDIDANGILHVSAKDKATGKEQSIRIEASSKL
SDAEINKMKDDAKQHADEDKKRKEEIDIKNSADALIFSTEKQLTELGEKI
PTDKKSALEGSLDKLRDAYKNGTTESIKSAMDDLNSQWNSIASDLYQSGA
GAAQAQPEAPQNSGSSQSSGGDGAVNAEYEVINDDKK
>Cag_1907 ExbB/TolQ family protein
MSLDAPSGIFALVADAGAVVLVVLFTLLAFSVVSWAIIAYKAIGLRAARY
ESKLFMDAFFDVAPERLFAESERLSGAPLARVYRAGYIAFRALDGKNVTF
KQAQAVVARAIKRATNAETKQLASLVPFLATVGNTAPFIGLFGTVWGIMT
SFQAIGVTRSASLSAVAPGISEALVATAVGLAAAIPAVMGYNYLTQQVGL
LERDIEEFAPEFVTALINEP
>Cag_1425 Modification methylase HemK
MENQKEWQVIELLKTTTDFFAAKEMSEPRISAERLLGHVLQKSRLELYLH
HDAPISPSELEQFRSFCRQRLQGRPVQYITGEQYFYGAPFFVDERVLIPR
PETELLVERALEVSGVSALAGEVAVLDVGTGSGCIAVTLATLAPNLRIVA
VDLSPAALDVARLNAERHGVTNRMTFVQADMTSPYFAQQLPFATYQLIIS
NPPYIPKAEWATLEREVRDFEPELALTTPSGMECYQALISAAPTLLADGG
TLALELHADGARAVATMLESAGLQEVALMKDYGGFERIITARKSEKS
>Cag_1392 putative type II DNA modification enzyme (methyltransferase)
MNNLLIHGDNIAGLDYLLHQKQLKGKIDLVYIDPPFATGGNFTITNGRAS
TISNSRNGDIAYSDKLTGDDFINFLRKRILLLRELMSEKASIYVHIDYKI
GHYVKIMMDEVFGIDNFRNDITRIKCNPKNFTRIGYGNIKDLILFYTKSS
NPIWNEPTEKYSENDIVNLFPKITTNGRRYTTVPIHAPGETVNGKSNKPF
KGMLPPQGRHWRTDVITLEHWDKEGLIEWSSTGNPRKIIFADEREGKRVQ
DIWEFKDPQYPIYPTEKNSDLLDLIITTSSNPNSIVLDCFCGSGTTLKSA
HFLQRQWIGIDQSPHAIEATINKFSDIKADLFIESPQYDFIALTDELINQ
S
>Cag_1488 conserved hypothetical protein
MIVSFGSKECERIWDGFQVKSLPCEIQDIARRKLRMINNALTLVDLRIPP
ANRLEKLSGDLKDFYSIRINKQWRIIFRWHNGEASMVEIIDYH
>Cag_1779 Branched-chain amino acid aminotransferase I
MNTSEQIWMNGELVPWSEAKIHILAHVVHYGSSTFEGIRCYETTKGSAVL
LLDEHIRRLKDSSKIYRMEIPYSDAELKEAILATIRANNQKACYIRPLVY
RGQGALGVNPTRAAIEVAIATWEWGTYLGDDVLETGVDVRVSSWNRPAPN
THPTWAKAGGNYLNSQLIKMEALMDGYAEGLALDVNGYVSEGSGENIFVI
RDGIIYTPMTGQSILAGFTRYAIMHIAREQGYEVRETLIPREALYIADEV
FLTGTAAEITPVRSIDKYPIGNERRGPITEHLQRAYLNIVQTGEDPYNWL
TFI
>Cag_1421 conserved hypothetical protein
MKAIILYDSKSQGGSTDRLVDAIGVKLAEAGHYVEKARCKSNGDYSFVKE
FDMVIMGSPIYYLMVSTELLGSMFQSNLKSCVEGKQIGLFLLCGSPEIMG
NLLYLPQLKLHLLGQSLVAEKIFAPDQASNPEAISAYANKLLAALKNH
>Cag_1911 OmpA domain protein
MKTRKYAFAAFALLALAGCSSKSSVAPAPSASSSAPQALATPLAEPVVPP
PAIASEPLPMYTPPVTSQPLTPSATTTTLVPATVKGYGYDQWQKGPLGDI
FFEYDSATLDESAQMQLQQNAALLQQFIVESIQIEGHCDIRGTSEYNLAL
GERRATTAKEYLMRLGVPASRLETVSFGEERPFDNGNSEDAWAKNRRVHF
VLIKQ
>Cag_0983 Magnesium-chelatase, subunit H
MSSLRKIVAIVGLEQYNAALWQKVKELLAGDAALSQFSDVDLERQNPEAA
AAIVGADCIFVSMINFRDQVEWFKAQLARVQKEQTVFVFESMPEAMALTK
VGSYAVSDGKAGMPDVVKKVAKLLVKGRDEDAMYGYMKLMKIMRTMLPLV
PDKAKDFKHWLMVYSYWMQPTAENIANMFRLILREYCGEQVSVAAVVDVP
NMGLYHPDAPAFFTDVRNFKSWQKRRGINPDKQQKVGLLFFRKHLLQEKT
YIDNTIRVLEKDGLCLYPAFVMGIEGHVLVRDWLVKEKIDLLVNMMGFGL
VGGPAGSTKPGIAAEARHEILTKMDVPYMIAQPLLTQGFESWQELGVSPM
QVTFTYAIPEMDGAISPILLGALQDGRIETVPERLERLALLAKQWLRLRA
SANREKKVAFIVYDYPPGLGKKATAALLDVPQTLFAILQRLKKEGYNVGT
LPNSAEALFEALDRATDPQFVQHDSLVINHEDFKQVTSYRERERIDERWQ
QFPGDIVPLGEQELFVGGLRFGNIFIGVQPRIGVQGDPMRLLFDKANTPH
HQYMAFYRWISRTFQAHAMVHVGMHGSAEWMPGLQTGLTGDCWPDALLGE
VPHFYLYPINNPSESTIAKRRGLATMISHVVPPLSRAGLYKELPALRELL
LEVRESSLSNNSTISTLNDLAGLEEAIMQKAELLNLTDDCPRLHNEPLSD
YASRLYSYLSELENRLISNSLHLFGEAAPLESQLVTVTETLKNREENGLC
LPSLVLELRGATEAGVNYNEISSLARKGDETAIRLREQAEEACRQLIEQV
LFERKPLAATFAALCGGIAPSAEVQAFLEQLMREGAQLLAALRDNRGEME
ALLHALDGGYLPSGPGGDLVRDGVNVLPSGRNIHSIDPWRIPSTTAFRRG
SHIAEAILAKHLEEHDGIYPETIAQVLWGLDTIKTKGEAVAVVICLLGAE
PAYDAFGKISHYRLIPLEELQRPRIDVLMQLSPVFRDAFGLLMDQLDRLI
KDAAKADEPAEMNFVKKHVDEALAVGMSFESATARQFTQAPGAYGTYVDD
MIEDSAWQSENDLDDIFIRRNSNAYGGARKGENESAIFQKMLASVDRVVH
QVDSTEFGISDIDHYFSSSGSLQLAARRRNTRATDVKLNYVESYTSDIKV
DDAAKALRIEYRSKLLNPKWFEGMLKHGHSGAGEISNRVTYMLGWDAVTG
SVDDWVYKKTAETYALDSAMKERLAALNPQAMKNIVGRMLEAHGRGIWNA
DQEMIDQLQEIYADLEDRLERVDG
>Cag_0621 cytosolic long-chain acyl-CoA thioester hydrolase family protein
MQTTMETYKLVLPEHLNHYGFLFGGNLLKWIDEVSYIAVSLDYPGCNFVT
VGMDKVEFRKSIRSGTILCFVTEKSKIGATSVEYTVHVYKKSIESGERVL
AFSTHITFVCLDEEGKKKELCCGSYQSIDSNKNCK
>Cag_1296 conserved hypothetical protein
MIISASRRTDIPAFYGEWFINRLRVGEVLVRNPMQPKQVSHIALTPETID
ALVFWTKNPNPFFRYLAEIDAFGYPYYFLFTITPYDTTIEPHVPTLEKRI
AHFQYLAKRIGAERVVWRYDPILFTKTLSPTWHIAAFRHIANALSGYTKR
CIISFIDNYRKVRRNMASLPLITPNEGMITQLLQTFTNIAEQQQINLQVC
REEIDVTHYGIANGSCIDRSLVEQLCGRPLVGIGKDKNQRKTCGCIASRD
IGRYDTCLHGCRYCYAVSNHAKAAAAYKNFNPDTPLLCNELCGNETITCA
PKQNQSKLECLPLFEKT
>Cag_0057 FtsQ protein, putative
MPDDEYQEYYDPEEGELVEEEVLEEPLPAPTSGGGSFVLIVVALVLLLVA
GLASVALQWKQKVVVRNFIVEGESVLKEQEILAPIEFAKGHNLQLLEVGV
LKSQLLALPYVHDVVVRKEFNGTIRLRLHEREPVALTVHNGHIMVIDREG
FLLPWRNTVAQRYPKLLTVYGTERYAKSERGLQRLHERDVAVILEFIAAL
AESDYASLLIRELHLDATNTTWSKASQSSSHFIFGNDGRFNEKLKNFEIF
WQKVISKKGFTFYNIVDLRFKDRVFTIPSVISPSPQEITPL
>Cag_1452 hypothetical protein
MISIADIYDRLSTEPLQGNDLKKYYVDVFSGRGDNPMISLKRLLQNKPNG
KLQILFSGYRGCGKSTELNKLQQELSNDFIVLNFSVLDELDPVSLNYVEL
FIITMEKLFEVVGTYHININPQMFEIVRQWSSSKEIEDIRELTGEASLKV
GADVEVSAPLFARFFTKMRLGANASTSTKKTVIQNIEPRLSDLISHCNDL
IREIKLKLSFIGKKGLVIIIEDLDKLSVEKAEELFFNHSHILSSLQTHLI
FTFPISLRYHPKAIAIKGNFDEDYELPMLKVHDKAGNRFAGHNAMREIVT
CRIGENAFEPPALLDKFIAMSGGCLRDLFRMIRNAADSALNNEREIITEA
NYTKSFYRLRRDYENTIAEKRVNNELIISVPEYYETLKNLALSTTKKVDN
TDAVLDLRQNLCILGYNDEGWCDVHPVVRTILEERFPDMNK
>Cag_0588 conserved hypothetical protein
MIHSIRLDNLLSFASGNPTLPLQKLNVFIGTNGAGKSNLIEALDLVRATP
RSPSNNDFQRVISRGGTIMEWIWKGSPDTPATIELIMDNPYNSHTNEKQP
IRHLFSFKGEQQRVIFVDEIIENESPYHSNNEPYFYYRSYNGKPVINSAI
AGERKLQRDSINEELSILAQRRDPEQYPEITKLAEIYEEFRLYREWTFGR
NTIFRNPQRSDLRNDRLEEDFSNKGLFLNRLKTHKPKAKTAILEGLKDLY
QGIDDFNISIEGGTVQVFFTEGEFSIPATRLSDGTLRYLCLLALLCDPEP
PPLLCIEEPELGLHPDIIPKLADLLIDASQRTQIIVTTHSDILIDALTEI
PESVVVCEKNEGKTTMQRLNSNDLAEWLKHYRLGQLWTRGDIGGTRW
>Cag_1273 conserved hypothetical protein
MKPPFLITTLNKAHDRNAFYSGSEMLDRYLKQQVTQDIRRNLTACFVALN
NEKQIAGYYTLSSASIALDALPESLIKQLPRYTSLPAARMGRLAVAKTYQ
GMGLGATLLTDAIMRAKQLNREIGMYALLVDAKDEHAAMFYLHHGFIRFT
NSPQTLFLPLSQISLQN
>Cag_0553 hypothetical protein
MRELKNIEIQFIPLSKDEKKIRFFGTLIPAILLLKASPLRFLRFQFATLY
TPNVGVKEFIASDADDVARIGNMWVFVHRWDSMEFDPFAFARAKLFLKRI
TRLVKKAGYKAEPFDPLSPEMNLPQLGAEAGLGNLSPYGLLVHPKYGPRL
ILTGLKTEYNLECNMQPKSEGCTDCLLCLHECPQEPAKGGVIDLGKCQSC
TKCFEVCPIGREVCRTSH
>Cag_0695 Seryl-tRNA synthetase, class IIa
MLDIAYIRQNPDEVAAMLRHRQLASEEPKLQQLLECDRQRKELVQRSDEQ
KALRNKVSKEVAEIKKKGVGSPDELISQMKAVSDAITAMDSRLSELEAEM
ENLLLALPNKLHPSVPIGRSAEDNMVFGEPVHFEHSLNFPLKNHLELGKA
LRILDFERGAKVSGAGFPVYVGKGARLERALINFMLDMHTEQHGYTEVFP
PFLVNQESLRGTGQWPKFADQVYYIGEDDLYAIPTAEVPLTNLHRGEMVE
TNALPISYTAYTACFRREAGSYGKDTRGFLRVHQFNKVEMVKFTRPEDSY
TALEEIRHHAQAILEALKIPYRVLLLCSGDISANATKCYDIEVWSPAEEK
YLEASSCSNFEEYQARRSNIRFKPDSKSKPEFVHTLNGSGLATSRLMVSL
LEHYQTADGHIRVPNVLQRYTGFTEI
>Cag_0451 Tryptophan synthase, beta chain-like
MNADVTKILLSEEDMPRQWYNIQADLPTPMPPPLAPDGTPITPEQLAPVF
PMNLIEQEVSTERWITIPQEIQAILKIWRPSPLYRAHRLEAALQTPAKIF
YKNEGVSPAGSHKPNTAVAQAWYNKEFGIKHLITETGAGQWGSALAMSCK
LVGIDCKVFMVRISFDQKPFRKMMMNTWGAECIASPSMQTNIGRKILEET
PDTPGSLAIAISEAIELAVQRDDTRYALGSVLNHVMLHQTIIGLEARTQL
EKVNLYPDVVIGCAGGGSNFAGISFPFIGDKIHGRDVQIIAVEPEACPTL
TRAPYSYDSGDVAKMTPLLPMHSLGHGFIPPAIHAGGLRYHGMAPLVSHV
KQLGLIDAVALPQTECYEAALLFAHTEGFIPAPETSHAIAQTIREAKQAK
EEGKEKVILMNWSGHGLMDLQGYDAFLSGRISDYPLPEEYLLRSLAAIKD
HPQPPQA
>Cag_0125 hydroxyacylglutathione hydrolase, putative
MSASQLVVKQIRTGGDRNFAYIAACTFTQEAMVVDASYNPAMVATVAANE
GFTIRYIFSTHSHVDHTNGNAELSQLCGVPALLYGDMVPDLQRSVLDGTV
LPLGKLNIQILHTPGHTPDSISLYCDNALFTGDTLFVGKVGGTYSDEDAR
TEYESLWQKLMVLPDATMVYPGHDYGVAPTSTLAHERQTNPFLQQKSFND
FLSLKKNWAAYKKAHGIV
>Cag_0838 conserved hypothetical protein
MPFDYSTLNLLRQNHPAWRLLCAQHAPLVAGFLHRVFIVPNVRILSQADL
VEALEDELFALRQQLGADQFPHTAQSYLNEWAENDKGWLRKFYPDGTDEP
HFDLTASTEKALAWLESLTERAFVGTESRLLTLFELLRQMSTGSQTDPEV
RIAELQKRRDDIDAEIERIRAGEIELLDDTALKDRFQQFLQLARELLTDF
REVEHNFRTLDRRVRERIALWEGAKGALLEQIMGERDAIADSDQGKSFRA
FWDFLMSQSRQEELSLLLEEVLALPPILSMRPDNRLRRVHYDWLEAGEHT
QRTVARLSEQLRRFLDDKAWLENRRIMDILHNIETQALDLRDDFPSGGFM
PLNAASATIELPFERTLYRPPFKPLLAGVALDEGDAEIDTAALYAQVIID
KAELLRNIRFELQMRNQVTLAEVVERHPLRNGLAELVAYLQLAGEWQQST
VDEAVEEQVQWQSATGITRAATLPRIILLK
>Cag_1281 Membrane-bound metallopeptidase-like
MALFGCLRCLVVPLWLVGGLLVSFPLTIAEAAPSSSRHTATIARISKERA
AVEANLRSLKQQLQEYQTRLTQVSRKEAQSFKALEAIRGKITVLEKMITD
NQRYLAELDADIDHLQNELEGNRQVYGQLSDDFRRTAISVYKYGGNRDVE
HIFGASSVNDALVRAQYMGFFTRAVSHNVNELQAAAVRLEENREALEQTY
EQKMAMVKEQERQLQQWSASKKEKEQVLVTLKKNKQEYSKQLSIAQKKRQ
QLQARIEGLIIAEQRAIEAEMERQRKLAEARRVAAEKRARERAEAERRAA
AQREAERLAAERQAKARAAAEGKKASRSKQEKEEVALKPAPSVTKPLPAE
PQRREEPIVEEADELAAISVNFDKAYGSLPLPVQGGVVTRRFGSVHDKDL
NIVTTSNGIDISVPAGTPVRAVSGGKVVQIAFLPTFGNIVILRHTNSYLT
VYANLGSLQVAKNEVVKSQQQLGVVGKSSDGASMLHFEIWKGRTKQNPAK
WLR
>Cag_1207 conserved hypothetical protein
MLARIQSLYLFVVALLAVASMALPIWSFNATPQLIVRDLASAPLDNALYN
LASTAGMVLSPLTAIVAGAAIFLFTNRALQTKLIMLAMLLFAGDLVAALA
AAHMMNEHFVALGNVVVHQPQAGLFILLPEPLLLFLALKGVKTDDKIANA
YKRL
>Cag_1247 Nitrogenase molybdenum-iron protein alpha chain
MEEKLMTSDPAQVRETLIQKYPPKVAKKRAKSIVINDPEIVPEVQANVRT
VPGIITQRGCAYAGCKGVVLGPTRDIVNIVHGPIGCSFYAWLTRRNQTRP
ESPEHANYITYCFSTDMQEENVVFGGEKKLKVAIQEAYDLFHPKSIAIFS
TCPVGLIGDDVHAAAREMKEKFGDCNVFGFSCEGYRGVSQSAGHHVANNG
VFKHMVGRDNTVKPGKFKLNLLGEYNIGGDAFELERIFERVGITLVASFS
GNSTVGALENSHTADLNIIMCHRSINYMGDMMETKYGIPWMKVNFVGAQS
TAKSLRKIGEYFGDEELKARIEAVIAEEMPKVEAVINEIRPRTEGKTAML
FVGGSRAHHYQDLFTELGMTTIAAGYEFAHRDDYEGREVLPKIKIDADSK
NIEELKVEADPELYKQRKSEAELEELKAKGLEINGYEGMMKQMTKKSLVV
DDVSHYESEMLIEMYKPDIFCAGIKEKYVVQKMGVPLKQLHSYDYGGPYT
GFEGALNFYRDIDRMVNNPVWKLIKAPWEKAENGGVLEAAYVQG
>Cag_1343 conserved hypothetical protein
MKPIVYLESSVISYLTSRPHRDVVIAGRQAITQEWWEYQRHQFELRISIL
VEEEISRGDAEAAAQRLASIADIPSLTLSDDAVMIAHLLLAKCAIPKGSE
DDALHIGIAAAQGVDFLLTWNFKHINNAVTRGYVTHIVEACGYNCPQLCS
PEELMGRYYEYD
>Cag_0788 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFALPALTHSK
>Cag_1319 transcriptional modulator of MazE/toxin, MazF
MTILKRGAIIDVNLDPTQGSETGKVRPCVVVTNNIYNERVPVIQVVPITA
WSEKKARITTNVTILPTQQNGLLKPSVADCLQTRPIDHRYRLVNIRGTIS
NEELKKIDAALKAVFAL
>Cag_1895 hypothetical protein
MLLGEMMIQGVAMSAYSWSGYNGTGYESLLQQAVEVGATSVLLGSVSIID
LNNGAVSAWVRDDGFTTTASMGDVEAAIQQAQAHGLQVFLKPQIHSYNPA
SAAFGGNPYNNLINPDPSNPLIIPNLDLFFEGYKAYIVEWAELAERYQVP
LFSVGNEMVAVTSAEFTPYWEDIIASVRNVYHGQLTYAAMTDVKWDSNDE
VSHIEFWDKLDYVGVDMYPDFDTGATIPTTPTVEQLNDIWVEQKWQSYLS
AIAEATGKPLLFTETGVASFLGGANRSRYTDALISQMGTVRDDATQTNWF
QSFAETWMGENQPEWFGGMYFWNNDPPYNAGLQDITGYTFFGKPAEVVVS
SLFDAVNSLDFDQTLFLASDSDDRIALYKYIAEADANPLTRAQSYHSTVI
IELNGTILEGAEAVTPTIHFYLNGKDYGAVTLSNVESEYSINKGIEAASK
GEAYYPHSTLIPFLFEIDELEVRDIHIVRDSVQVENSEVYISRVTIVPDM
GAATVNTTVNSLQNAWLAFEEPSQAWGGATGYQFPNGAIPYDVPSVTIDT
SPYKKTLATMSGTPDNPITVKGYEGFDTVYLLGSPEQYTITLEGDMLMVA
ESSGLGQNSQLGGVERLLFAEADYALLFGGMGNDTLYGGAGNDRFNGGDG
NDVVLLSGYATEYEVSDNEASATYTITDSVAGRDGSYQLSNMEALQFGAS
PMQWNLEEFRAALAASQLLPQQEEPFNVTGSVTFWKNGAAISNVATTLSL
HSVTNNGEELLFQHLQHHADGGYSVEVWANATDALHSLQFEFQLPTNAQA
AWHFSEEVPQGWQTGVNNQGADALLIGGMGATALPSGLVQLGTLSFVAPT
DADRLEIALTRGELGKQWLVPATITLESNVLASNGGYQHNALWQGSYHLS
VQHESTEEPTNMVTMSDAYAALQIAAGHNPNESEAPLQSWQFLAADINRD
GKVRASDALTILKMALNYHDAPSEELIFLPEWVGKSEMTRSSVDWSATEI
MLDVENYQIVNLIGVIQGDVDGSFS
>Cag_0048 conserved hypothetical protein
MVLGLLQVYNALSINELAKKSEKLREQIRLNNSMITTQKLTADELQSIHN
IEQEALLLGLEASHEPPIEIERTIEP
>Cag_1896 hypothetical protein
MKYPSLFPFILSAATMVASTQSIQATETKRHADAKELPQVTIPDLPFATA
PDAPQALGLRQSSITGSAEHDYIKLYGNGGQARHPEPVRNPSLTRSNTFS
STVAIELNGTILAGTEAATPTVRFFINGKDVGVATLSTEQSAYSKKTGGV
PHSDLQRFTFHVDELAIREIKLVVESAPVPQSEVYIHRVNINAEVNLDQN
LEANALRGAAVNFATPSALWEGRNGYQLPQGAIPSDVRSVTIDTALYQTT
LQRAPGTPSNPIIVQGGGGEDTLYLLGSSVQYLLAVDEGGTLVIAESQGL
DQNALATNIARLEFADGSFFLAPQTVAGKAITLATGSEGSKQLRAQLPAG
MGITVRPLSSSAMQEQLRSLVGKSVQPTIEQRLSALFVNQLEPQLVLRGL
DFATLPMLRHAADVELNGSTKGKQWVMVESNALPDGAHVTVKDVDGLLLS
GSRDLLVQSKGKSVTIVAGDGNQELRSEGGNDLLFGGNGNDRLFAGAGND
ELCGGNGDNLLDGGAGSDVAFFSGNVAEYRMTHNAATNMTSVVDSIPNRD
GSNQLINIEQLRFADRTELIAQGK
>Cag_0988 ATPase
MSATSAATPVIAMRELCKYYVMGDETVRALDGITLEFKHNDYAAIMGPSG
SGKSTLMNILGCLDSPTSGTYELNGEQVADMDDDELARIRNRDIGFVFQT
FNLLPRLNCLRNVELPLIYAGVPPEEREARAVAALQQVGLAERIRHKPSE
LSGGQIQRVAIARALVNKPAIILADEPTGNLDTATSHDIMRMFSKLSEEG
NTIILITHEEDIARCTQRLIRLRDGKIESDVRG
>Cag_1455 conserved hypothetical protein
MSVKPVDLNALRATHGNLYETVVAMSRRARKLHEEERSELEERLLPYKEM
IRNPASEAESDKVFPEQIAISLDFEVRQKASHRAVADYFDGKYDYMVEKP
VEKKIVLPTNDDDEADGH
>Cag_1302 ATPase-like
MIKQLTLTNWKSFAEATLYIDPLTILIGTNASGKSNTLDALLFLQRVSSG
IPIFQAIAGDVNLTPLRGGMEWVCRKPFNTFTLTVLTDGLSKDEEYRYTL
TVQVNGTKAEILHEELTLLIYGTNRTTSKEKRLFKTELDEINHPSIPTYC
YTGTQGRGRRFDLLRSHTILNQTETLNVRKEVQEGAKLVMTQLQRIFVFD
PIPSHMRNYAPLAETLLSDGSNLAGVLAGLEPSRKIEVEKTLTTYLKALP
ERDIKRVWTEHVGKFQSDAMLYCEEGWSNETTQEIDARGMSDGTLRYLAI
VTALLTRQSGSLLVIEEVDNGLHPSRAHLLIRMLKELGKQRGIDLIITTH
NPALLDAAGNRMIPFITVVHRNSSTGTSSLTLLEDIEQLPKLIAQGTLGD
LTSDGRLEEALQQKRGNGE
>Cag_0144 ATPase
MITKIQIKNFRQIRDQTLELKQVAVVIGPNNGGKTTLLQAISLFALGLRA
WGMQRINKKSKAQKRTGVAITLEEVLNIPISDFKELWSDLNVREGIINEE
GKPTSKNIRIEIHAEGYTQNVFWKIGFEFDFGRDSLIYVRLTQDENGELY
DFPEVLLEEKIGYLPSVAGLKPIEDKLEIGSVLRNIGNGNTSDVLRNICY
ILYNASDKELWQNFVKQIDELFKIELNPPQYYSLTGLLKMSYNEGAKKHI
DLSSLGSGAKQAILLFAYILAFPNTVNLLDEPDAHLEVIRQSNIYDRISD
IAKKNNSQIIIASHSESVLNRAFTKDQVISGIFGEFEEVSNKKYITNALR
TYGYEEFIIARQRPYIFYFEGTTDLDFIKAFCKRLGRSDVFRFIEDHVYP
YPVANDVVRVRNHFDTLKKFIPTLRGFALFDNLHKNLESNQPDLLLRQWG
RNEIENYIPIPQTLFAFIESADYGELWKNRFKELVESNVPPAAFIDMNHS
FWKKTKMSDDFLTPLFEGFYAEAQMHKGLMDKSKFYQLVDYVDVALLQQE
VIDLISAMYVHFCVK
>Cag_0641 NADH dehydrogenase I subunit 4L
MMEQLTTIGLNHYLTISALLLAFGLFAVMTRKNAIVVLMGVELILNAANL
NFLAFSKYNGGMEGVMFALFVIVIAAAEAAIALAIVINIYKLFKSVDVSS
VDTMKE
>Cag_0288 NADH dehydrogenase
MYCNCANTTQGYFISIPMKKRVVIVGGGFTGINAAKILGNKPDLEIILID
RKNYHLFQPLLYQVAMTALGEGDIAAPLRNMLANYRNITVFKGIVERIDV
ANKTIITDFNTISYDYLILACGVQHHYFGHNEWEEFAPGLKTLAQAKEIR
RRVMEAYERAERTTDPVERKKLLTFVIVGGGPTGVELAGSIGEMSRYTLS
KFYRNIDPKLTRIFIVEAAPRILGSFSPEMASKATRALEKLGVQVWTSSM
VSDVDVNGVQIGNERIEAATVLWAAGVTATSISRNMDGVETDAIGRIVVE
NDLSIFGHPEIFVGGDLAHVEREDGTTLPGLAPVAQQQGRAIAHNILRDM
QAKARKPFRYTDKGQMATIGKNKAIVEMGRLKFDGIMAWLTWLMVHIYFL
TSFRHRVFVLMQWGWSYFTYSFGARLIVNREWRFYPDNKSPE
>Cag_1720 thioesterase, menaquinone synthesis protein
MIMQPPLHIEIIGNKALPKIVFLHGFLGSGRDWLPLAEMLTSHYCCVLVD
LPGHGSATLSASDEHHAYFTATVEALATVIQPISPEPCRLVGYSMGGRIA
LALMLTHPELFHQAVIVSASPGLPTEEERAKRRAGDEGIARKIERNFPDF
LEAWYQQPLFSTLKNHPLFQEIERKRAINNSESLAAALRLLGTGQQPSFW
DALSKCAVPTLFIAGEKDERYVAIARQMVKLAPHATLSIVPNCGHTLHIE
NKESFVEQLHTFFNQ
>Cag_0279 conserved hypothetical protein
MNNVQLYTEISLLPASLKQEVKDFVDFLKTKSQSKSKITEREFGCAKGLF
TIHDDFDEPLDDFKEYM
>Cag_0277 Sel1-like repeat
MKRKFFTSILCLLLFSGTVHAESPQEIQQLRIAAEQGNAAAQFNLGIKYQ
FGKGVRQDYVEVIKWFRLAAEQNHVYAQLMLGTMYRNGEGVRQDYIEAIK
WFRLAAEQRYADSQYSLGLMYAGGKGVSKDYVEAIKWFRLAAEQGNVEAQ
AMLGSIFYVGKNVQRDEFEAIKWFKLAAQQNYAYAQMMLGTMYATGEGVR
QDYVEAIKWYRFAAEQGNVEAQYDLGLLYLNGYGVRQNKAIAKEWFGKAC
DSGSQEGCNQYRALNIPQKHR
>Cag_0076 Cell wall hydrolase/autolysin
MHYFSVRYYRWLFSCLLLCAVVLLTPSSAFAASATNSGSLSLSVQRSTFS
YTVRVLTIREGEQQLVDLESMARALRLSFSREPEAIVLKEPFTNSNVRCM
VAAGNPFVAVQPASSGGNPLLVQLQATPVMRQSRLYLPVEQACRLFSLWL
ERDVRYQPSSGRIQAMLKGKAVVPTFLADAKQRQRRATSIAATSNASSRT
STVITGVSVDERANGAIIRFTASGPPATFSLAPPQPDSSGVVQLQFEQTT
PTSRLRFQRFNGALVRSITPQQKSGQPLHFTIVLDSRFQFVTPLEAQYDK
ARNRYELLVRTEANVEEILRREKEQHIAQTLSHDVAKWKLDTIVLDAGHG
GKDPGAIGLRGTQEKDVVLNIVRDLGNFIEQQWSDVRVVYTRSNDAFVPL
HERGRIANKSGGKLFISVHCNASVNRSARGSEVYILGAHKNSAALNVAMM
ENAVIRNEVDYQESYKGFSEEYLIMSSMVQSAFSRQSTLLAQQIIRPVAE
KQEGNNRGVRQAGFMVLWTPSMPSALVEVGYISHPAEELLLRDRQRQKAV
AYAIFKGIERYRKSYESNVMAALN
>Cag_0269 hypothetical protein
MERVALTTDEQNLWDQIYFSEKTIQIDHDKPRESIEPAYQLAQSLLKRKV
IPQIRMRYFTDPKLNIGGRDKSRKEVFERNGTSGDMILRHPHFHKYLRYF
VLGPDIPLTAINEFVVLANDCDPITSGDTKEFCNLAWKQIRNSGQDTKYA
AEEYFKLGLELELGEDVAYAIRDTIMRMR
>Cag_1215 conserved hypothetical protein
MEQILQKLGIELNEQTRLSLDSTFQFGCHSKLSCYNSCCSNLDIFLTPYD
VLRMKNKLGITSTQFLSEYVEPVIQQESKLPFLRLKFQEGGQCRFVAPEG
CTIYSDRPVACRYYPVGFGIHKSQNAHGNDFYFLVREDHCKGFEETQEWT
VREWRKNQGIDEYDDNNRIWMDIILHKKLVSPDLEPDEKSLKMFFMASYD
IDSFKEFMFESRFLEIFEVEEEELELLKSNEAELMLFAHKWLQYALFKQP
TMTVRQQPKA
>Cag_0342 Diaminopimelate epimerase
MNIPFKKLSGAGNDFIVIDNRHNSIEFSTETIKALCTRRTGIGADGLILI
EASLSADFTMKYYNSDGLLGSMCGNGGRCAAYFAYNSGVPTQKSNEYLFE
ANGNYYHAWIVDKEQVKLQMNPPHDFKENLQVEGLLCYFLNTGSPHVVTY
VDDVNIINVYEKGDAIRHRTDIFNGGTNVNFVQQTSANELIIRTFERGVE
DETLACGTGAVASALISHMLGKTNSTSLNVTVRSGDTLQISFTSAMENIY
LSGPAKIIFSGTFEKTSNYA
>Cag_1378 Anion-transporting ATPase
MRNIIFTGKGGVGKTSVAAATALRAADMGYKTLIMSTDPAHSLGDSLDVK
LGPSPVKVAENLWGQEVSVFGDLNLNWDVVREHFAHIMETRGIQGVYAEE
MGVLPGMEELFSLSYIKRYNEEQKDYDLLVVDCAPTGETLRLLSLPETFG
WFIKMIRNVEKFMVKPVIRPLSKKIKKIDDFVAPEEVYEKVDNLFSSTEG
IIDLLADGSKTTMRLVMNPEKMVIKESMRALTYLNLYGITVDRITINRVM
PDQSPDPYFQKWRSIQQKYIEQIEEAFAPIPIAEVPLFDDEVVGLAMLRR
VGEKVYGNSNPLDVFFKENPIDISKVSDGHYKVRVRLPFMENMGLEPKIL
KLGDDLTIRIGDYQKVVALPTFLAGMESTGASYEDKWLSIDFAKSH
>Cag_1826 30S ribosomal protein S11
MATVSRKKKKVKVTPEGAVHIKASFNNIMVTITDVQGNTVSWSSAGKNGF
KGSKKNTPYASQVTSEAAAKEAFDLGMRHVHVFIKGPGAGRDAAIRALQG
AGLEVRTIKDITPLPHNGCRPPKRRRV
>Cag_1295 MCBG protein (microcin resistance protein)-like
MFNHFIAMECYQQTFEKKDFYENPLTMGTYEECHFHGCTFINVDLSHYIF
INCTFDGCDLSMVKLNNTSLQEVLFCNCKLLGVPFSDCRQLLLSFRLERC
MARLALFCRLKLKGSLWSECMLQEADFSEADLSNALFERCEFPQATFFHT
NLEGADFRTSWHYSINPATNRVRKARFSLAGIAGLLESFDVVIE
>Cag_1902 Acetohydroxy acid isomeroreductase
MNVYYEKDADLAYLQGKKIAVLGYGSQGHAHSLNLHESGLNVRVGLRPES
ASCAKAREAGLEVTSVAEATKWADIVMVLLPDQNQKAVYEAEIAPNLEPG
NTLAFGHGFNIHYKQIVPASSVNVIMIAPKSPGHLVRRTYTEGNGVPCLI
AVHQDPTGEAKQQALAWAKALGGTKAGVIETNFKNETETDLFGEQAVLCG
GSAELIKAGFETLVEAGYPEELAYFECMHELKLIVDLYYEGGLSRMNYSV
SDTAEYGGMTRGPRLITPAVKAEMKKILEEVQDGRFAKEFIDECNGGYQN
LSKLRESNSNHAIEKVGAKLRNMMSWLIKK
>Cag_1201 hypothetical protein
MSFDATSIEYAFAKLIGNTTGARSTSHDADVFRGGNPKNLAKALSDAADA
LEEKVKSVPFAAADTEPGGAKARIDVAISRLHKIAESMSKSATVSREDYH
WEIIGCLVSTIADLLEKAKC
>Cag_0889 transporter, putative
MSSNNSYKLGPITLAPSVLPRHALTYLYAAFFSIGLVTFVSIGQTYILNE
HLKIPTSQQGAISGDLVFWTEVVTLLFFVPAGMLMDRIGRKPVYSAGFLL
VALSYALYPLSRSIEEMTIYRMIYALGIVALTSALSTVMIDYAAERSRGK
LIAITGFLNGIGIVVINSFFGGLPQKLMAQGFSGIEAGLYTHFGIAAIAV
VAAVVVGLGLKGGTEVRKEDRPPLRSLFTSGIKCAKNPRILLSYAAAFVA
RGDQSIIGTFVPLWGTTTGIALGMEPAEAVKQGMMMFIISQAAALLWAPV
IGPLIDRWNRVTALFVCMALASVGYLSLGFIGNPHDANAYIFFILLGIGQ
ISSFLGAQSLIGQEAPKAERGSVVGMFNISGAIGILIITTLGGRLFDSWS
PKAPFLVVGAINVLVMLAAIYVRIKAPGKNLHVAEEG
>Cag_0007 Peptidase M41, FtsH
MPRPKIPFFYSLIALALIIGVQLAFFWSGSTPEVPYSTFRSLIAENKVES
VKLAPEKIMVQLKQGVTVSVGAAEGDQPSVSAPKSSSKPSNELVVTPVRD
DKLIELLESKGVRYQGVQGNSWIGELLQWIIPFGLLLGMYFFVFRRMGGP
GSQFMNIGKNKAALYENFDEHTRITFKDVAGLDEAKAEVMEVVDFLKDPK
KYTTLGGKLPKGVLLVGPPGTGKTLLAKAVAGEADVPFFSLSGSDFVEMF
VGVGAARVRDLFKQAKEKAPCIIFIDEIDAVGRSRGKGVMMGANDERENT
LNQLLVEMDGFATDKGVILIAATNRPDVLDSALLRPGRFDRQIMVDKPDL
KGRVDILKVHTKSLSLGNDVNLKTLASQTPGFAGAEIANAANEAALLASR
RNKQTIDMKDFEDAIERVVAGLEKKNKVINPRERQIVAYHEAGHAIVSWM
MVENDPVQKISIVPRGMSALGYTMNIPLEDRYLMTRRELFARICGLLGGR
IAEQVIFGEISTGAQNDLEKVTSIAYNMVMVYGMSEKLGNISFYDSHNSG
YGMEKKYGEETARLIDQEVRHIIEEARVAVLELLAEHRDKLERLASELLQ
KEMLQQSQIEEILGKRPGGNLFPSSYDDVEEVEVDAAPSVTCEAAVVENG
TAVSATTDKAKSQLSPCEQKALEEAVARIQRAREQREAQQQQQNPPSEV
>Cag_1834 Ribosomal protein S5
MAKTSAKSIRPGELNLKEKLVHINRTAKVVKGGKRFGFNAIVVVGDKEGH
VGYGLGKANEVQDAIAKGIEDGKKNVIKVPIVKGTIPHQIVVKYGSAKVM
MKPATPGTGLIAGGAVRAVLEMAGIHDILTKSLGSSNPHNVVKAAIKGLQ
YISDAYDVGERRSKSLSDVFES
>Cag_0136 hypothetical protein
MDMLVDSELAQIINERAKEKKDAIEVNIDEL
>Cag_0697 conserved hypothetical protein
MKILLDTHYLIWSFTNPEKLPQGTAELLMAEENEIFFSQASLWEISIKFN
LGKLVLMGITPEKLYNEIKLSYFQCLPLQNEELISFHRLPIKHRDPFDRI
MIWQCICHNISFLTVDNVIPNYKQYGLKIVTSS
>Cag_1010 CRISPR-associated helicase Cas3, core
MDAFDNKLYAHTLEGVKNKSQWQTLREHALSTAHLASDYATSFGLAECGY
WLGLIHDLGKSLPQFQQRLEDDRVKADHKHAGGLFLWDKLNNGTKPSHLA
AQCLALCVISHHGGLVDCLNQLGEDNFINTIENKLYQANLKDSLENLQLD
TELEDNIKKISGARSLVQDEIDFFFQNILQQAKKYWPTEDDKNKRKKLEL
FRIGLMTKMLFSCLIDADHTDTANFHDEERKNKNLPHLPKWDELRDMVER
YLETLPQTSSVDIERKRISDKCIQASVCESGTYLLTVPTGGGKTLASMRF
ALHHAVNREPYIPFKRIIYVIPYTTIIEQNAQAIRKVFVSQLNEDVLNEM
ILESHSNVLPNEENRNNRVLAENWDAPIIFTTNVQFLEAFYGVGTRNARK
LHNLANSIIIFDEAQTLPVRCLHLFCHAVNFLVEHCNCTAILCTATQPLL
HEIPAEHGALWLSKNFQILPDKFRKDSADSLKRVTVIDECKPQGWRLEEV
ADKVSCIHKQGNSCIIILNTKADTRELYTILRKRHGEELTYHLSTAMCAA
HRMDILSEVKTLLRNNQPVICVSTQLIEAGIDIDFDTGIRALAGIDSIAQ
AAGRINRNGKKPADSALYIQNISGENLKNLQDIAVAQVEAQKVLREFKEN
PNEFGNSLLSEAVMKRYFKFYIFNRKDEMTYKIKSDNLVNLLSSNVNAVG
EYKRTHKNQPYPNILRQSFATAAREFKVINSDTQGIFVPYNDEARGLLNQ
LRNTKSSEFQRYLFRRLQRYTVNVYPYMLKKLTKIHALEPLCENSGILAL
YEIFYDSRFGVNINSTISPDMLIQ
>Cag_0875 glutamine amidotransferase, class I
MPDPALLLIKNAPHEGAGLLENVLHERTISYHTVELTEGQAIPNPRNFSG
VVVFGGPQSVNDATPSMQAELRALEQILADEIPYLGICLGMQALVNAAGG
VVLPCPIREVGFYDNNSKPYTVTLTEAGKNDPLFANLDHTFRVFQLHGET
VEMPASGVTLLGYGSQCPIQAVRAGDCAYGLQCHFELTSDMFAYWCQFDT
DLKRMDQAMLQQHFAEIQQTYVATGKTLLTNFLTIAHL
>Cag_1990 2,3,4,5-tetrahydropyridine-2-carboxylate N-succinyltransferase
MSITIQEVQAAIERYLPLTPQELQANNDVRLMFEAFKQLLNNGVIRSAEK
VGDAWQVNMWVKQGILLGMKLGRLQEMLLPFGGRSGFTFIEKDTWPLKEV
GIGHNVRIVPGGSSVRDGVYLAPGVVMMPPAYVNVGAYVDEGTMIDSHAL
VGSCAQVGKKVHLSAGVQVGGVLEPVGALPVIIEDEVMVGGNCGIYEGTI
VKERAVIGTGVILNGSTPVYDTVNNCVYRKSAEAPLIIPAGAVVVAGSRP
LKGDFAAEHGLAIYTPIIVKYRDSRTDSATALEEALR
>Cag_0806 RfaE bifunctional protein, domain I
MEFDKPRTMPLSALPPSLPEFEALLALFQGKRIAVVGDIMLDAYIFGHVS
RISPEYPVPVVDVTREEHRLGGAANVAQNTRAMGAETILFGVTGNDRNRD
TLVELFKQQGLTTNALICDPSRPTTCKTRILSQNHHITRVDFESRQEVSA
DIEAQIVHTFESMIASLDAVVLEDYNKGMLTAHLIERIIAISRKHKVPVL
VDPKHRNFFAYKGCTIFKPNLSEMATSLGIAIPNCNAEVEAACKILRDKL
EAETIVVTRSEQGMSIYNGNFTHIAASSLDVADVSGAGDTVIGMLALGAA
AGMDIVTNTSLANLAAGTVCQEVGAVPVKSEKLLKAYRDYLLQQ
>Cag_0273 conserved hypothetical protein
MKKIILRQAFNELNDAIAYYEEQQPGLGVKMKDEVDQHVHWILNHPLIPR
LRHGGYRRVNLKVFPYYIAYLVHQETLWILAIAHTHRKPKYWIKRKNKI
>Cag_1770 conserved hypothetical protein
MNVSLAIDFNQLKSLIAQCGIEEKTQIVQMLEKDTFPLRFNALLEKVKTD
QLTLHDITTEIETVRQQRYSAKR
>Cag_0713 Protein of unknown function UPF0004
MPTHSLFLLSLGCSKNTVDSERLLAQAAAAAIRSVERVDEADTILINTCA
FIEDAKKESIEEMLAALDKKREGVVKQVFVMGCLPELYRRELQEELPEVD
AFFGTRELPQILASLGARYRSELFDERLLLTPSHYAYLKISEGCNRICSF
CSIPKIRGRYQSQPLEQLLREATRLQQQGVQELNLIAQDISLFGYDTTGH
SQLNELLLRLSDMDFLWIRLLYAYPVNFPLEVIDTMRDRSNICNYLDIPL
QHCNDRILRAMKRGVTKADTIRLLHEMRQRNPNIRLRTTMLVGFPGETRA
EFEELLDFVEEQRFDRLGCFPYNHEEHAPSAMLEDLLSIEEKEERVSELM
ELQEAVAESLNREFEGKEIEVVVDSFVEEMAFCRSEYDAPEVDNECLLTF
GAQNIQAGNFYRALINDSSAHELYGEIVQERSAGNSPQ
>Cag_0592 DedA family
MTDAFLFLTDFILHIDSHLQTLAAEYGLWLYLVLFLIVFGETGLVVLPFL
PGDSLLFAAGSLASMPNSALDPNVLFVVFFAAAVLGDTLNYHIGNKFGNK
LVHGGYTRFFKAEHLEKTNAFFTKHGGKTIIIARFVPIIRTFAPFVAGIG
NMPYRTFLLFNVIGAFVWVGFFCYSGYYFGQLMFVQENFKLLIIAIIAIS
LLPPIIEFLKHKFSATKQR
>Cag_1539 PAS/PAC sensor signal transduction histidine kinase
MNTIIPDFQQPDAVVIALLDSFSDALFIMDADSIILHANEAFATQCGKKP
EECIGFNACSLIESVFVMPHFAYECSATTNEVLDSAKQQSFETTYLHQSW
KTTITPLHLKEYDQGATQLLVRIEDISEKKNLEQQSQVTDTLHKTLLETI
PGFAIILDASGHLMAWNNYTRTIIFGKTDNEMYDVDPFGFICPTERTSIR
KKFFKTLRAGCEKSAEIKIFPQKEKHAIWLLIHAKPVFIDNQMCVVAVGI
NITERKRVEAELLENKLRYNHAMEAARAGIWEWNVKTDKLTWSEQLWGLY
GLQVNSEPLTHQLCVTTVHPDDREMASQMIKSAVQQQIPASIEYRVLHPD
GSVHWLNSRGIPLHDDDGTLHHYIGTIIDITERKETEIVLLENKIRLKQA
LKATRAGVWEWELATNENSWSDEIWALYGIEQDHQQPSFDLWVNTVHPDD
RLLAIKAVTEATQQGAEINVEYRVCHHDGSEHWLMSRGKPILDPQGNTIR
YIGTALDITERKKMEIALGESQRRFTFALEATNAGVWEWDIKTDVITWTE
RVWTLYGLEPFSATPSHQLCATHVHHDDYEKIFQNIMVAVQRESNINVEY
RVCHPDGTTHWLMCRGMPLRNANGTVSCYMGTVMDITKRRELLEKLRRSE
QRYRLLFDNMMNGFAYCNMIYDGKKPIDFVFLNVNQTFKKFCKTKDVVGK
KASEVIPNFMTLANELFVVCARVAHSCQSEQFEYFFKPFNEWLFISISSP
QKGYFMVIFDIISERKRAEQLILDSKLKLEAALESMSDAIFISDTAGRFI
EFNNAFAAFHKFERKEKCFMTLTEYPAILDLFTDNNQLTDLAEWPVSRAL
QGETMTNAEYTLHRKDTNHTWIGSYNYAPIRNNEGEITGSVVTARDITEQ
KLIEKTLKESELKFRSIFDYSPVAIGIGNATNGMLIDVNTFWLELFGYTK
EEVVNQTVTDIGIYVKPQERNEILDILAAHGRVTNKPVRLRKKSGEYITI
LYSSVFMKLDHKTILLVIITDVTLQEIQQQNITLLEKAVAERTQQLQDEV
EQLQRFISMISHEYRTPLAIIRGNLDLIGLKNKTKKISNEAEINKINRAI
DRLVEVMEVSMQESRMIESKQETVMTTFQIAAIVTSQIETFRAMWTERLI
IYSETVENSLVLGEPSQLKLAIFNLLDNARKYSPTDSPIKVTCSLDADTV
MIRIRNYGTSITKDEEKTLFEKFQRGSNSMNTSGAGLGLWLVKSIINRHN
GQVSLTCIDSGVETLVRLPLHTT
>Cag_0250 chlorosome envelope protein E
MATNISGAFTNGAAAYGRFLEVFIDGHWWVVGDALENVGKTTKRLGANAY
PHLYGGGAGSAGSLRGSSPTVSGYAQPSKPTESRFND
>Cag_1336 conserved hypothetical protein
MINHLFLPNQRDDYDSPWKEAIERYFPEFMALYFPTAYAAIDWSKPYHFL
DQELRTIVPQAENGKRIVDKLVQVELLDGKESWLYIHIEVQGNRETDFEK
RMFTCNYRIFDKYGKPVASFAILTDKDCNWRPTSYSYAFAGCKLTLEFEV
AKLLDFEPRMEELLASNNAFGLVTAAHLLTQKTRENMLQRLDAKSQLIRL
LYNKQWTKERVRELFRVIDWFMELPQELEHQLQTEIYNIEEEQKMKYISS
IERYAMEKGWSEGIEKGIAEGLEIGMEKGIEKGKLEVAERLLGVGMNIEQ
VAELTGVSVAQLRNKR
>Cag_0370 ferredoxin, 4Fe-4S, putative
MSSATTNTSNSTNTPRKRKRLLAPREEIPWFPTLDLLVCNGCADCIVYCK
PGVFELDEKKGVKRPKVKITNPFKCLVLCTRCVPICTSGAIKLPNPKDFE
HFVEEYEE
>Cag_0255 iron-sulfur cluster-binding protein
MTTSATHFVRHIRAEAARIGFVAIGFAAPELSPMAMQRYMEMLHDKRHGE
MAYLANYTAERANPTLLLSDVKTVISVALSYNHPITYCNGFPKISRYALI
DDYHTVMKSKLEELHVAIERIMGEPIAAIAAVDSAPLLEKSWAEQAGIGK
VGKNSLLKIPSAGSFVFLGELLIDKEIRVTPLALPNYCGSCNACIEHCPT
GALLAPSKLDATKCISYLTIELKRDFTADEAAMIGEWLFGCDICQEVCPY
NRQASIVAHSAFTVREELLNVAVDDLLGLTKSSFRKLFYGTPVFRIGLRR
LKRNARAVEENVRRRGKNDG
>Cag_0539 Imidazole glycerol phosphate synthase, glutamine amidotransferase subunit
MVFIADYGAGNLRSVLKAFEFLGIKAIVSNDPRKMAGYRKVLLPGVGAFG
QAMQSLEALGFVSALLEHVDKGGHLLGICLGMQLLLSESEEMGTHKGLNL
VPGKVKHFVSSSDKIPQIGWNAVDFSKQSDLFRNVADHSFFYFVHSYYCE
TESVEAVAATTLFAGQNFCSAIEKNGIFAVQFHPEKSADAGLKVLANFAE
L
>Cag_1885 putative transcriptional regulator
MTQQQIITELGLGNGYAVEFIPAIKIAKLGTVACGYLNCGGGYVICGVNE
DGSVVGCNLEQLEQLEEALHNDIAPKALITIEQETLEGKPILLIEVPAGH
DVPYAFANTIYLRTANYTHNANVEAIRDLVMRRNVEPERWERRFSLAAIE
SDLDVDELYKTIADAKSANRTFFRNERDVVQTLEDLAVSRYGRLTQGGDV
LFSRNPALRYPQVRVRAISYASDKTGNTYRDIKSFEGPLHRVFVEAYTFI
LRNTPTVAHFSKASPIRQEQPMYPEFALREALINAFAHRDYATASGGVAI
HIYPHRLEIWNSGTLPDGVTPNNFTNGQLSILRNPDIAHVLYLRGLMEKA
GRGIVQIVKACLNAGLPAPQWASDAARGVTLTFIAGEVSGEVSGEVSGEV
SGEVSGEVMRLLQAISGEMKRTDMQEQLGLKHEDYFRKAYLLPALHNGFI
EMTIPDKPKSRFQKYRLTTKGQQLMEQGVQA
>Cag_1918 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_1093 putative exonuclease
MFTFLHVADLHLDSPLKGLEEYPDAPLKQLRHATRRAFDNVVQMALDERV
AFVVVAGDLYDTDWRDYNTGLFFVSRMAKLREAGIPVIIVSGNHDAASQI
TRSLRLPDNVKILSHTHPESYLLEPYNVAIHGQSFATRFVRDDLARNYPQ
ADPSLFTIGLLHTSLETSGDVYAPTTLDLLRSKGYNYWALGHMHRHEVVH
RNPWVVYTGNIQGRHIREGGAKGCMLVTVENDAVVQTEWRAVDVLRWARC
AVLLEGCDSMEQVYHLVRERMEELRQQAEGRPLALRVQLRGATPLHHTLH
TKIGHVMEEIRAIAVSFGDCWLEKVELELSAPHAKSDLLGAASPLASLLE
AVDALELPDGSLTSLLPDFEKLRHKLPHELISDGDPFAPPADELEILRDE
VKQLLSATIEETIGGTTERAIRGSMGGTIGGRNA
>Cag_1649 30S ribosomal protein S9
MKEVIDTVGRRKTSVARAFLTPGKGTVTVNKRPVEEYFKDEFKRQQALRP
LTLCEKAEEFDVKINVQGGGLSGQSGAVSLAIARALTESDEALRAVLRTE
RLLTRDPRMVERKKFGRKKARKRFQFSKR
>Cag_0281 conserved hypothetical protein
MSQFTNVAITKEANIYFDGNVTSRSVHFADGTKKTLGIMLPGDYEFNTGA
KELMEILSGDLEIQLVGEEWRKISAGESFEVPANSSFKLKIYKITDYCCS
FLG
>Cag_1206 Homoserine O-acetyltransferase
MMCMKALEALVSPQTLFYTSNEPFITEFGATLPELQVAYRMWGRLNADKS
NVIVICHALTGSADADDWWEGMFGTGKAFDPSEYCIICSNVLGSCYGTTG
PTSPNPATGNRYGADFPLITIRDMVRVQHRLLTALGIERIKLVVGASLGG
MQVLEWGFLYPEMAQALMAMGASGRHSAWCIGQSEAQRQAIYADRHWNGG
SYAPEQPPNHGLAAARMMAMCSYRSFENYQERFGRTMQQGRQGALFSIES
YLHHQGRKLVERFDANTYVTLTKAMDMHDVARGRGEYEEVLRSMTLPIEV
LSINSDILYPMEEQEELAELMPNASILYLDEPYGHDAFLIEVEKVNQMVN
DFLNKLE
>Cag_1756 conserved hypothetical protein
MPVRKLRDFLDRHKVHYFVVSHSPAYTAQEIAASAHVPGNELAKTVMVML
DGEFAMAVLPASRLLDLRLLQEVSGATHVALAKEDEFAELFPECEVGAMP
PFGNLYGMKVFVSEELEADDDIAFNAGGHRELLILSYKRYKELVQPIVAK
LS
>Cag_1158 conserved hypothetical protein
MKKDAQLALLEKTIKALGYKIRYEKGSFLGGACRIKEDKVVVVNKFLPIE
GKLYTLAQVIANLNLANLPMEVEKIISRHYRQQLLFPPSNEK
>Cag_0299 conserved hypothetical protein
MQIVVFEDKRVEEFQPLVLLKPLYALFVGFRSLREKLEYAVRGRATLTYH
IRRYLAPCYQEQHPELVVNRLDEDDILLVNGRLLGDEAVAQVIVEGACEV
GSAFMQNGTMLFARVHRHMLIGENGLLPDVIDTERLAEQLRVEEVNGFRL
LEHIWDIIALHPDELLRDAETLELGRIEGEVHHAAALVNRSNIYVGAGAV
VRAGAVLDADDGFVAIGAGAIVEPQAVLMQNVVLAPWARAKIGAKLYSNV
AIGMASKVGGEVEDSILEPFVNKQHDGFLGHSYLSSWCNLGAGTNTSDLK
NNYSEVPLRRNGELVTTGLQFLGLLMGDHSKCSINSMFNTGTIVGAGANI
FGGGFVPKEVPSFAWGGSHGFEHYDVEKAVETARKVMARRKVTMSASYET
MFRSVAGLSGNSLFI
>Cag_1861 hypothetical protein
MGLFIKEFKMDKAILFNELLDAVDHLSLDDQESLIDVVRHRIAECHRQEI
FSLISSARKEYQQSKLSPETPQDIMNSILS
>Cag_0350 oxidoreductase, short-chain dehydrogenase/reductase family
MKVALITGASMGIGEAFARSLAEQGRTMVLVARSVDALHRLATELEQRYR
VAVYVMPADLSQHESATAVYNYCRQQQLEVELLINCAGFSVAGAFADIPP
ERIAEMVQVNSTSLALLTRHFLPNMLQRKSGTIINVASLGGLQGVPGMGL
YSATKSFVITFSEALAHEVRPYGIKVVALCPGFIATGLMESAGQNTKAIR
LPISQTDVVVKAMQRAFVTRYVRLYPTWLDSLLAFSQRLVSRSLAVRLAA
FFAGVLKKG
>Cag_0832 transposase
MKDTVLFQQALCLPMPWFVKSSAFDIEQKRLTIQLDFQKGSTFSCPTCGQ
HDLKAYDTAEKQWRHLNFFQHECYLTARVPRISCPTCGVKAITDLPWARR
DSGFTLLFEAMIIALVPSMPCKTIANYVGEHDSRIWRIIHYYLDEALEQQ
DLSAVTKVGLDETASKRGHNYVTSFVDLESSKVLFVTEGKDATTVEKFHK
HLLAHKGKAENIKEICCDMSPAFIKGVTTNFPETHITFDKFHIIQVLTKA
VDEVRREEQKERPELAKSRYLWLKNQVHLNQSQQVKLEKLQLKKLNLKTA
RAYQVKLNFQEFFKQAPAYAQSFLNQWYYSASHSRLEPIKEAARTIKRHW
YGILRWFTSNITNGKLEGLNSMIQAAKARARGYRTTNNLIAMIYFIGSKF
EFTLPALTHSK
>Cag_1004 Protein of unknown function DUF196
MMVLVTYDVNTESPDGKRRLRRIAKTCQNYGQRVQFSVFECNVDPAQWTK
LRAKLLREMDPNRDSLRFYFLGSNWQNHIEHEGAKEPRDLEGVLIL
>Cag_0130 Uroporphyrinogen decarboxylase HemE
MLKNDLFLRALKRQPCSRTPIWVMRQAGRYLPEYRAVREKTDFLTLCKTP
ELATEVTIQPVELVGVDAAIIFSDILVVNEAMGQEVNIIETKGIKLAPPI
RSQADIDKLIVPDIDEKLGYVLDALRMTKKELDNRVPLIGFSGAAWTLFT
YAVEGGGSKNYAYAKQMMYREPQMAHSLLSKISQTITAYTLKQIEAGADA
IQIFDSWASALSEDDYREYALPYIKDTVQAIKAKHPETPVIVFSKDCNTI
LSDIADTGCDAVGLGWGIDISKARTELNDRVALQGNLDPTVLYGTQERIK
IEAGKILKSFGQHNHHSGHVFNLGHGILPDMDPDNLRCLVEFVKEESAKY
H
>Cag_0123 AcrB/AcrD/AcrF family protein
MNEGIAGKLAKQFINSKLTPLLMLASLLVGIMATFMTPREEEPQIVVPMV
DVYIPYPGATPSEVEARVARPLERAITEIPGVDYVYATSMNDMALVTVRY
KVGEDSERSMVKLWATMMKNMDKMPHGVQFPLLKKVSIDDVPVLNLTFWS
KTQDAYQLRQAAAEVADELKKISNIGDVELKGGLKRQVRVLLDKGKLSQY
NVSALQIARQIQAANSQMTSGELKQPTEDIILRTGKFLESAEEVGNIVVG
LYGATPIYLRDVATITDGPEEVSNYSYFGFAAASGKNAPKGEYPAVTLTV
AKRQGGDASVLAGKVVERLEEMRKTTIPKSITVTETRNYGETATEKVFTL
LEHLIIAVVAVTIVVGFFLGWRGALVVLASVPITFALTLLIYYLLDYSLN
RVTLFALIFVTGIVVDDSIIIAENIHRHFAMKRQPKLLAAITAISEVGNP
TILATFTVIAAVLPMVFVSGLMGPYMSPMPIGASLAMIFSLLVALIATPW
LSYRLLKVEGGHHEEYDIKKTGYYKMFDKVLTPLIDNAQKRWIAFGVVGL
MLFGAIMLVPLKAVQMKMLPFDNKNEFQVILDMPEGTSLEQTSKVAKEIS
AYLKTVPEVSSYQYYVGTNAPINFNGLVRHYYLRQKGNMADIQVNLVHKG
ERSEQSHDVAKKVRAGVQKIANRYNGNAKIVEIPPGPPVLSTLVAEIYGP
SQEQQIALAKEVKKVFAQTPGVVDVDWMVEDDQKVYNLVVNKERAAFYGV
SAEQIAHTLRVSIAGMDVGLLHTDREIDPVTIQVRLPRANRTSMQDLSGI
FVQSQGMGSLTTRLRAGEMIPLANLVTVQEKVDDKSIFHKDLRRVVYVTA
DVAGITESPVYAMLDMDKKIEQIKVPGGYKISPLYTSTPTAEDKLAMKWD
GEWQITFEVFRDLGTAFAVVLVIIYLLIIGWFQSFKTPLIMMIAIPLSLV
GIIPGHFIHGAFFTATSMIGMIALAGIMVRNSVLLIDFIQIRRAEGVGLK
QAVIESAAVRTRPIILTSGAVVIGSFVMLFDPIFQGLAISLIWGGVLSTV
LTLVVVPLVYYIAEKKSEKKSVTN
>Cag_1764 HPr kinase
MIFDQKGIRKRSITVAYFYNTISQKCCDLKLRRVNNVDEQKRRIYERDLH
RPGLALAGFTNLFTYKRVQIFGNTETRFLNHLDDEKERNRLFENFVRFKI
PCIILTSNNKLPESLIEMATRAEIPVYSTRCSSTKAIYNITDFLDDQFSI
YQQYHGSMVDVYGVGVLLVGKSGLGKSEVALDLIERGHGLVADDAVVIRR
KGESTTLMARRNNIIDHFMEIRGLGVVDMKANFGIRAIREQKEVQVVAEL
MHWDSDTEYERLGLDAKSTKILGVELPLVQLPILPGKNITVIIEVVALNF
LLKRYSNYVAAEALHSRISQVINSERTNFDGDE
>Cag_1390 hypothetical protein
MPTTLLYPIILASQSPRRRELLALTLLPFETMSVNTPETLNPTLSPEENV
LAIGAIIGTLIFVDLNRRGWNNLQ
>Cag_0862 Malate dehydrogenase, NAD-dependent
MKITVIGAGNVGATAALKIAEKQFANEVVLIDVVEGIPQGKALDMYESGA
VSLFDTRVIGSNDYKDSADSDIILITAGLARKPGMTREDLLMKNAAIIKE
VTSQVMKYSTNPIIVMVSNPVDIMTYVAHRVSGLPKERVIGMGGVLDTAR
YKNFIAETLNISMQDISALVLGGHGDAMVPVVNYTNVAGIPLTELLPLDI
IDGLVERTRNGGIEIVNYLKSGSAYYAPAASTVEMIEAIARDRKRILPCT
TLLDGQYGINSVFCGVPVKLGKNGIEQVLEINLSAPERSALQRSADIVEK
NCKMLESLFA
>Cag_1967 conserved hypothetical protein
MPTHPDAIGYIRDLGHSEGKLWFEMICDLAAYRTTNLSPTDFEILVQLFT
KQVSYLRQPPPIIATTVSVETVFPSFERLETIGPFNGFKRLGNSLVACFP
KRATIIFGANGSGKSSLCEALKILASNDAPRRPLHDVHCSTTVTPSFKFK
FTTDTTAQTWSSTDTYGTRLSVIKYFDTGIAIHNIKNSVQPGRIIELAPF
KLDVFEIAKNHCEVLRTERQKRQRENTDQLTIVIEQIRAKFESFEGTILA
GLQISAKSILEAEIKLGENYSEENGLDEKLKKKSDLEKATSEEGIKLLKG
EIAALKALNAEIDPILTASEKLVEIDPVAKSNSLKDKETELKVLAEALIP
SGATLDKLMELIRPANEISILNSSELEECPLCKQPLQTRELELFKQYYTL
LTGELDAAITELRKILKTSEKNLKVISDSTPDEWAKGSVLSQDLIDAIKD
SGRAIQKYFKFGENISQNCKDAAVLLRKFSIKLSIKTEEKETLIDKSGKD
REELLKQLTQISNECKKLLYAKCIADNMDLVKNAHGKMLNATFWDTNLPN
FSPVLRKITSTAKKAHKELVVEDFKNRLNAEYLALAEKDMSAFGVELKDV
GSDVAVTVDHHVAGQRIESVLSEGEQRIHALALFFAELETCEQQVIVFDD
PISSFDYNYIGNYCNRLRDLIQQYPDRQIIVLTHNWEFFVQIQTTLNTAR
LNQHMSVHVLESCVAIDEYSENIDELKTNIDAILLGSGEPTKQQKEAMAG
KMRRLIEAVVNTHVFNKQRHQFKQKNQQVSAFDDFTKVVPLLLSEAQTLR
DLFSKLSITEHDDPRNAYVNTDKSMFLTRYNAIKSIETAIIGRK
>Cag_0107 Carbamoyl-phosphate synthase, small subunit
MQPTRAKLLLENGSLYSGTLFGHIGEAIGEVVFNTSLTGYQEILTDPSYA
GQMVTMTYPLIGNYGITPNDNESQSIWASALIVHEVSRIHSNYEASGSLA
ARLREAQVTGLAGIDTRRLVRELREHGAMRAIIAPAEVSDDELRQKVAAA
PNMSGRDLVSTVTTQKSYSLETEGAQYHVVAFDYGMKTNILRLLQNAGCR
VTVLPAQATMNDVLALNPDGVFLSNGPGDPAAVGYAIETIRALVEYNRTT
KPLPMFGICLGHQLLSLACGGTTYKLKFGHHGGNHPVRNLATGQIEITSQ
NHGFAVAMEALPEELTMTHLNLYDHTVEGVRHKNLPCFSVQYHPEAAPGP
HDSHYLFDEFTSLMARS
>Cag_1065 Molybdopterin cofactor biosynthesis MoaC region
MTFTHLDAQGTVAMVDVSSKPGTFRMARASGYIAMQPSTIALLTADALPK
GNVLTTAKVAAVMAAKQTANLIPLCHPLLLSWVDVQFTVKSDRILIESIV
KTKESTGVEMEALTAVSVAALTIYDMCKAVDKEMEIGVIRLLDKVGGKSS
HQESYRPQTAILVLSDSVSAGKSVDRSGATLQQGFEAAGCTVCDMQVIAD
NQEEIIAMVEQWVAAGIELIVTTGGTGLGARDVTIDALLPRFTRRLTGVE
QALLQWGQGKTRRAMLSRLAAGTIGNSLVICLPGSTGAVRDALEVLIPTI
FHAFQMMHGEGHSA
>Cag_1145 hypothetical protein
MLKSLLIQRALPLSISYILLIIAGIALDYLLHVAHLVWIGRYFGIVGTLF
LALSFGYSARKQKLIKNGALKFFLKFHCYSGWVGTLMILVHSGIHFNALL
PWIATALMMVVTASGHVGQYLVKKLKEEMKQKMKQLGITTSVDNEFEQQH
FWDSLTVKALDQWRGLHMPLVSFLLALTTIHILAILFFWNWR
>Cag_1475 glycosyl transferase
MKILFLCSAKKWGGNEKWTLLAAQALATKHKVIVGYRSEVIGNRCTTASI
KLPFINEIDLITIGKLIKVIKKEKIDIIIPTKRKEYVIAGILSRVCHIKN
IIRLGIDRPLTNTFLHKLIYQILCDGIIVNANKTKATLAQSPWLDASKIK
VIYNGLDKENLLKEAKNSFTPPHPFLILSVGSLISRKGFDFLLRAFALFI
KRNKNTDVGLIIAGDGPLLNHLKELTKSLAIDNNVHFTGFIANPYPIMKA
SNIYITASQAEGISNALLESIALGCVPISTYSGGAEEIIQNNDNGFLVEY
NHEEKLALLIENLYNHQEVREKITTKAKHMVETTFSSERMAIEITTFCQS
IISKKHQ
>Cag_0703 ribonucleotide reductase family protein
MKIERLFTQAGNNVFDAFEYSLRSSVLRNTDGSVVFEMNDIEVPVQWSQV
ATDILAQKYFRKTGVQQRDAEGNVLLDSEGNPLLGSERSLKQVVHRLAGC
WREWGEKHGYFDTPDDAQAFYDEVAFMILAQMGAPNSPQWFNTGLNFAYG
TTGPAQGHFYVDPATGEVTESGDAYSRPQAHACFIQSVNDDLVNDGGIFD
LAIREARVFKFGSGSGTNYSSLRSSGEKLSGGGSSSGLMSFLKIFDSAAG
AIKSGGTTRRAAKMVIVDIDHPDVEKFIEWKAKEEDKVASLVAGSRICSR
FLQAIVEEALANGADRKTNPRLQQLIQNALSRGVPMSYVIRVLALVEQGY
TSLDFNEYDTHYESEAYQTVGGQNSNNTVRVTNAFMLAVANDEVWELRQR
TTGKVSRVVKARELWEKILLSAWKCADPGLQFDTTINEWHTCPQSGRINA
SNPCVTADTLIATDRGLERIGNIVGESRGIKSIDGKLHWVENIFPTGTKP
VYQLRTKSGYQLKLTGDHVVFTENRGDVKACELRKDDMVRLVGAPFGKET
TGSTDIAQLIGLLTGDGCITTANEIAASGEQRRTAFLTVSKAEQEIAEWA
NQFINTLRPELGEHNKSGSVTETATTARVAVGSPRILKQFEAFAVLDKGS
VHKLFTDKVFQLAQSEQAALLRGLFTADGTVANYSDKSQYIALDATSLEL
LQQVQLLLFNFGIKSKIYENRRVGELVSLLPDGKGGIKEYPVQQMHSLRI
SRSSRILFEQQIGFMAESKKYEALAELNRTVSTYRDSAYDAVASLTYSGE
EAVFDLTEPETDHFIANGIGVHNCSEYMFLDDTACNLASLNLIHFVDEAS
GTIKINELRHAASLWTVVLEISVLMAHFPSQTIARLSYDFRTLGLGFANL
GTVLMVLGIPYDSAEALAMAGAIASLMTGQAYVTSAEMARDLGTFKRYSD
NANDMLRVMRNHRRAAQNAATTDYEGLSVIPHGINAAHCQTALAEAAGAV
WNEVLQKGEAHGFRNAQVSVIAPTGTIGLVMDCDTTGIEPEFAIVKFKKL
AGGGYFKIVNQSVHKALLRLGYSAEQIEDIERYCKGHGTLEGCPAINGEW
LKCKGFTEEKIAAVESQLATVFDIRFAFNKWILGEEFCASLGFSEEQLDN
PDFDMLLELGATTAEAAAANDYICGTMMIEGAPHLQPEHLAVFDCASKCG
KKGKRYIRHQAHINMMAAVQPFISGAISKTVNMPATATTAEIGNVYRDAW
QQMVKAVTIYRDGSKLSQPLNSSSYNDLDEVIMLGTEETLDETKGPQEVQ
ERIIERVYYRSERRMLPKRRKGYIREAYVGGHKVFLRTGEYDDGTLGEIF
IDMYKEGASFKGLLNCFAVLASKALQYGMPLEELVDSFTFTRFDPAGTVQ
GHNVIKNSTSILDYVFRSIGYDYLGRKDFVHVKAVDEVPEGTLVPAEQKK
ASHHSAANHSTDGAGVVSNKSQIYQAKVQGYTGEQCENCGSMHVKQNGTC
KVCEDCGMTTGCS
>Cag_1513 DNA repair protein RadC
MKLHDIDPDNRPRERFLQHGAAALSPAELLALILRSGSQQYNILDTCHHI
INRFSLEKLSDVSLKELQQIKGIGESKAMQIVAIFELNRRLHYSRNQLRK
IMAAGDVFEYMSGRIPDESKEHLFVLHLNTKNQVIKNELISIGTLNTAVI
HPREIFKSAIRESAHSIIVVHNHPSGDVNPSNADKKITNELKQAGAFMQI
EMLDHVIMSKTEWYSFRERGLL
>Cag_1976 hypothetical protein
MKKWQQFLQDGSGVFSSTRLAFLLWVVGTLVVWIFGCIEVINAHATLDMA
GKIKAIQFPVIPENVMLIIGALMTGKVWQSFSENSADKSTTSLTVQQTSS
AQQGTSENVASETVAK
>Cag_1312 conserved hypothetical protein
MNKKRKAMYYQERITINPNICHGKPCIRGLRYPVEHILELLAGGMNIEEI
LADYEDLEREDILAALAYAARLAHIKSTKEIAA
>Cag_1588 glutamine synthetase
MNSDSKVPVSTYFGAFTFDHKAMRAKLPKEEFVALQETIKAGKKITAEIA
GVVAHGMKEWAMEHGATHYTHWFQPMTGSTAEKHDAFLSIDRDGTPIERF
SGEQLIQGEPDASSFPSGGMRTTFEARGYTAWDPSSPAFLMKGGNGLTLC
IPTVFISYHGAALDAKTPLLRSMDAASKSALRLLGVLGVEGVKRVKTYAG
CEQEYFLVDKKFYTARPDLVMCGRTLLGALPPKGQQLEDHYFGSIPDRVL
EFMQEVEHELFLLGIPAKTRHNEVAPHQFEIAPIFEEANIAVDHNMLVME
VMRKIADKRGFALLLAEKPFAGINGSGKHNNWSIGTDTGINLLDPGDTPE
KSILFLIILVSVLKAVHKRADLLRMSIASMGNDHRLGANEAPPAVVTVFL
GDLLERVLDAIESGQVDLKTEKQVLNLGLSQVPEVNKDYTDRNRTSPFAF
TGNKFEFRAVGSSQAPSVANMVLNTIMAEALDEMSEAIEAKIQAGSDKDV
AVLETIREQITVTKPIRYPGDNYSEALQVAAHERGLPNLKNTPHSLRILL
KKDVQDMFIKYGVLSHEEIDARLHIRLERYIKGIDIEARTLLLMLKTYVI
PNVSEYQGDLGNSFNSLFAVSEAIGLSDKALDSQAKHLKMVAENLATLLD
MTNELEEAIEQIESCKSEFDKADYCADKLLPFMEEVRVVADRLEQVVDRS
RWQLPTYLEMLFEH
>Cag_0369 Alanyl-tRNA synthetase, class IIc
MKSHEIRQSFLDFFAQKGHTIVRSAPVIPADDPTLLFTNAGMNQFKDVFL
DKGSRPYVRAADTQKCIRASGKHNDLEDVGRDTYHHTFFEMLGNWSFGDY
YKLEAITWAWELFTVVWKLPKERLYATVYQDDDESFQIWKEQTDINPDHI
LRFGDKDNFWEMGETGPCGPCSEIHIDLTPDGSGKELVNVGDYRVIELWN
LVFIQYNRQSDGHLEPLPKKHVDTGMGFERIVAVMQGKGSNYDSDVFQPL
FDKITEITGVRYGASLDAPNDIAMRVIADHARTLTFALSDGAMPSNEGRG
YVLRRILRRALRYSRNLGYHQPILHQLVGTLALAMGEVFPELVQRQEAVS
AIIKAEEESFIVTLDRGIDLFSELVEKLRASNGKVIAGSDAFRLYDTYGF
PFDLTRLMAADEGFEVDGEGFEHCMQEQKNRARSDRREKQRVDDDGAKWQ
WFSDVRASQFVGYDHLVHSATITGIRQGNGKLLLVLDATPFYAESGGQTG
DNGWLENSRYRLHVADTVKDGDAIVHVVSDAFDKVQDSAVQPEDVEIVEQ
ELAVAASVHRTARQDAERNHTATHLMHAALRRILGEHVQQKGSFVSPERL
RFDFSHFSKLTDEEIEAVEIAVNTQIREAEPVVKHADVLFDDAVAKGALA
FFGDKYADRVRVVEIPGMSVELCGGTHVDNIGRIGLFKIISEASVASGVR
RIEAVTGKAAEKLLWQEYRELQQIRQLLKAKGDEAVVEKVAGLMDSKKEL
EKELAESRAAALVEQLTAALQAAPEVSGCRVLAIEVKNNDGETMRQATMA
LRDKAPCAVGLLATVEAGKVVLVSFATDEAVRTCSLDAGALIREAAKQVQ
GGGGGKAEFATAGGKQPENLSKALDAFTAAVRAKLG
>Cag_0787 Alkaline phosphatase
MFSKKIIRDAGIGLFLLGLSNGAHAVTHLASVPKPKIGNVIFMHPDGTSV
SHWTAARMAIAGPDGTINWDRLPAMAVYRGHMKNSTGATSHGGATTHAFG
VKVDADSYGMDRDRALTALSGKPMSIMREALAAGLAVGVVNSGNIDEPGT
GVFLASSRSRKDGQAIAEQIIYSGAQVIMSGGERWLLPKGVKGRHGEGER
SDGKNLIDVAQTLGYTIVYNRDELANLPKDTKRVLGVFAHHHTFFDHPEE
ELQKQKLPLYVANAPTVGEMTDAALRILSNQGKRFLLVVEEEGTDNFGNV
NNAVGTLEALRRADAAYGVARSYVQRNPATLLLTAADSDASGLEVVSVAP
DVAMKPLNKTTHNGAPLDGRTGTQSLPFIAEADQFGNRLPFGIAWTDMND
GAGGILARAEGINSEALRSGSCDNSDVYRLMYLTLFGRSADSKK
>Cag_1922 sulfur oxidation protein SoxX
MTHLITLAVTTALLLPAIVNGAPRAESIAKGKQLSFDGSKGNCLACHLIA
DGEMAGNLGPPLIQMKQRYPKRSVLKAKLTDATASNPQTLMPPFGKHGIL
SNEELEQVVDYLYTL
>Cag_1722 hypothetical protein
MKLDELYQVFLPTFTMVEEEWNNFLAEKNTALSEAQNYLNAIFSITGYNL
PTYEEISSTRYLGVYENGAQVTFEGSGIDKLFGGDMTTGTLSLSRIALDS
YANNIHMAMLGYNNGITVDLNSGAVSGMFNEYSLTTPWFELSAVGTVAVS
GASILSEMNIEAELTELSFTYPDDGVIVKLLGDIDYYQDELGNMEYSGDV
YTAYFTGWGADITLNGDFQCDFDINNTFILFSDELTGELYVSELSLVIPS
QHVVANFYNSLTGLSYDITSGSLGGSFNSFHFGTSLFDVYAHGSIYVEPV
SSSSTLGIHGILDSIEITYPESTFAITVVGDVEYIQIDQGEYTYFGTMSE
VYLENPTTTVSVLGDFSGSYNDSNGLHLAGNLYEFHWQREEAFISFVGDI
VFGEDQLVVNEVTTLEVYGDGRYYDASSLSVMNIVTDVVGNELLALGEDA
NWDISAALDELLWDVINKLNGDAEGVSSVNFDSVPASSTPVNAEFLDFYL
DLSKVGEAGYYASFRVGHLYDTNGDGLPDYVDEIHDSPATITWNNGMFTV
LSLDDSSTRATGSLAYDGNGNAVGLYAFDRASGDSETTPPTLIAATPSDN
AMGIEVESDLSFIFSENVQFGNGTIEIHRGSATGEFVESYNIGTPLSTNL
NIVGSTLTINPTSDLASNTHYFVTFSEGSIRDLDGNNYVASQPYDFTTGA
DPYPTHTLTGNITFWKTGEAITDVTTTLTTLPTNGTHAIELKNIHVQANG
SHTIEVWATTPNSTTGSFECEFALPTGTSVTWQDAAKLPSGWMTTNNVIA
TGAFRVASIGTHALAEGAVQLGTLTISQSANPGTFELAMTHAQLGNNDVA
GYAISSVSSTTGSGNEYQYHSLTDGHYALTGDKAAGDAGSAVHANDALAA
LKMAVELNPNEANANGLLGPVSPFQYLAADINRDGKVRANDALNILKMAV
GIESAPTDEWIFVAESVTGKTMDRSHVDWSDISPIVDFNQTAIELDLIGI
VKGDVDGSWVMVG
>Cag_0153 hypothetical protein
MTVLPQIIANKIDVKRDFPLLSERELDYINEAKGLFDSGFYSYSLLAIWN
AAVNNLKRKVEAYGVELWSSVVKDESGRKKYDKDAETIAERWSNVDDLVL
ITGATRLGLLNPKAGKSLEMINWMRNHASPAHDSDNRVEMEDAVGLILLL
QKNLFEQPFPDPGHSVSAIFEPIKNKTHTPDELSILRDHISSYKNQDIRN
VFGFFMDLLTKGDEPAKTNVTELFPVVWEKANEDLRKTLGVKYHTFVIDP
DSDDSPDKGAKTRVFELLVKLDAVNYIPDGTRARVFRRAAEKLAEAKNTG
YGWRLEESASRNLAQLGISVPSIAFEYVYQEILAVWCGNYWGRSDSYVTL
RPFVDSLNTDQIRLVLKMFRENERVKDELSQSRPNKIAVSLLKEFETKLT
IEAHKQELRETIDIVKDI
>Cag_1187 cytochrome c551 peroxidase
MKKHQMLAATLLAVGVLTSCSKKSEPVPEPAPVPPPAPVLQPRTGEPAQP
IEAPTVADAGMVELGKKLFFDPRLSKSGFISCNSCHNLSMGGSDNLKSSI
GHKWQQGPINSPTVLNSSMNLAQFWDGRAKDLKEQAGGPIANPGEMAFTH
ELAINVLKTVPGYVDEFKKVFKSDSLSIDQVTQAIASFEETLVTPNSRFD
KWLKGDDAALTAEELAGYQLFKSSGCTACHNGVALGGNSFQKMGVVQPYR
STNKAAGRFAVTKDNADRFAFKVPTLRNVELTYPYFHDGAAPTLAKAVEI
MGQVQLGRTFTPEENGSIVAFLKTLTGDQPSFSLPQLPPSSDTTPAPQPF
GK
>Cag_1316 glycosyl transferase
MEPSPTVEIIIPHYRRRDMLERCLASLSRCPFYASPSLSILVICNGTADA
ALQKLIANYPTVKLLALAENRGYAGGCNAGLQQSNAEYLIFLNDDTEHEA
DWLEALLAIAQSNQNIAALQPKILSLEHAKQGKRRFEYAGAAGGMIDKLG
YPWCWGRTFFRVETDGGQYDKARNIFWASGVALCVRRSVALEVGGFDEDF
FMQMEEIDLCWRIQLAGYPIYSAPSSVVYHAGGASLAEGSAEKIYYNHRN
NITMLLKNRSLVALLWIMPIRMVMELGAALFYLTKADGLKKSGMVFRALR
DQLRAMPTTLRKRRTIQTNRTVSDRQLFHHQPFFHLLNHLIPQLYTFAQP
LKNQQHHQ
>Cag_1110 Ribosomal protein L33
MAKGKENRIVITLECTEAKKEGKPVSRYTTTKNKKNTTERLVLKKYNPNL
QRHTLHKEIK
>Cag_0753 cob(I)alamin adenosyltransferase
MHSRRILLFTGNGKGKSTAAFGLLARALGHGMKAKVIQFVKSDDAVGEQR
FFSQHPAVEWEQFGRGFLPRDPASPKMEEHRAAAREGLDAAITALASSEY
HVVLLDEVCFALGKELIPIEPLLEALRSADDGKIVVLTGRNAPEALVALA
DTVSDIQMVKHGYQQGIPAQKGVEE
>Cag_1616 RfaE bifunctional protein, domain II
MAAISPKILSWQNAAEQVQAWRAMGNKVVFTNGCFDILHAGHVHYLQAAK
SLGQRLCIGLNSDASVQRLKGPKRPICNEVDRATLLAALEVVDMVVLFDE
DTPERLIAALLPNVLVKGADWAVEQIAGAATVLQHGGEVLTVPLLDGRST
TGVIERIIERYSL
>Cag_0732 TPR repeat
MNPPSSSSQVIPLQVRQLFDHARQLRKQGMLNEAIEAFREVIELQPDYVA
AYNNLANALQAQGDSDGAEAVYQQALHYAPMLPVLHCNYGSLLLARQEYD
AAIKSYQKALTLQADFFLAYTNLAKAYSVRGNFFAALQTYKAALRLKPQD
AELYLDCGQLYQQYGFIPQAVKYYRRSLQLAASARGYNALGAALQDWGNL
KLARASYHRALKLQPDFDLPQYNLAQLYENLGELETARRYYEQTLTVDAE
NAKLLLHLEMIKRRQADWSNYTERVEQLRHALERHVENDKGEAVPMLSVL
SSSLSPALYRALAEQMARQLTRNAQALNATFTFPNNVAPERLKIGYLSPD
FRGHAVGTLIADLFQYHERPDFEVFAYSLLPHHDEWTERVKAGCDHFIDV
SHKSPLAIAQQIHADGIHILVDLAGYTSYARPLVLALKPAPIQLQYLGYP
GTLGAEYVPTIIADKHLIPENHQSYYTEQLCLLPHAWVAAPMQIASLSLT
RAEFGLPEKGMVYCCFNGVYKLEPHVFSLWMEILSKVPNSVLWLIDGEES
GSNERLRAVAQEAGIAPERLVFAKKRSHEEYLALYRLADLFLDTLSYNAG
ATAVGAFSAGLPLLTCQGEHYATRMGSALCYAVGLPELVAPTPADYVEFA
VQLGSSPKKRAALKRKLAKKLPTAPLFQPQQFVVALEQQYRSLWNNYCEL
TNNMLG
>Cag_1284 hypothetical protein
MVQNERYFLFCVREYNHDLKRRVTMMQGVKQFGKCQGNATKNGQVRRVGV
PQGVVAAQAGQGGFALNQAVNGGHQTLRGGCNGQGRSQGQGMGQGQGQRC
RCTAQNSVSL
>Cag_0209 Bacteriochlorophyll/chlorophyll synthetase
MAASASIGLTNKIRAHLELLDPVTWLSGFPCLAAGVMASGGMRGTLHDYL
LLLLVFLMYGPFGTGFSQSVNDVFDLELDRVNEPSRPIPSGRLSEKEGLW
NSIIVLLLAAAIGFGLWLHIGGMRGWIILISILSALFVAYIYSAPPLKLK
KNILASAPAVGFSYGFVTFLSANALFGDIRPEALWLAGLNFFMTVALIIL
NDFKSVEGDKEGGLKSLTVMIGARNTFLVSFFIIDAVFGVMAWLAYTWSF
VGLSYFILAGLAINMFIQMQLLRDPKGGLSFVNMAVDDGFGNAIGKSNVQ
EHNTFLRFQVVNNIIFMISNLVLAGMVGMQYIQP
>Cag_0212 glycosyl transferase
MLFLLFQVAIAFALLVFLAIGVANRYELGRLRHAALQSRVPFVSILVPAR
NEAHNIERCINSLLQQRYESFEVLVLDDGSTDATPTLLAELAQHAGGVLQ
VLQGDPLPQGWHGKAWACQQLGEAAHGDLLLFTDADTVHHPTALARSVAA
LQASQASMLSMTPLQTMHSWWEKIVVPLVYVVLMNFLPLRFVRTTSIPAF
SFANGQFILIERTMYRQLNGHAAVRQQLVEDVWLCMAVKKAGGRVVAING
VDLVSCRMYRSGKEVWEGFSKNIFAGLGYYHSALFGLLALIALFYIIPIA
LLTTSVVQANYSATHFWLPLVQVVLAFANRWLVAFTFHQSRFMVFFHPLT
MVAFFAIACNSWYWIVSGKGAGWKGRRYQFTE
>Cag_0068 putative lipoprotein
MMLCSFLFKNLSILAMTFSRFIASVCLIALPVSALSLSGCSSSRQPTTAS
EQVSDGYARAEALIKKGDYRSAVLVLEPILFTSRATALEDDVLFRLGQAY
YHTEQYLLAADMFTKVQQLPASPYAATAQFMVASSYEKMSPPFELDQAYT
QKAIEEFALYRELYPLTDSVRSAEQAAFWKEMLKVDAANETYKKNYAQAM
VGMSRSDSVRYAGKAITTLREKLAHNAYSVALHYQQLGKLKAATIFLDEV
IARYPDTSYYKLAMREKVDLLVKREKWYDAALALAQYQQLAPENGGALQS
LQEQIARNTKK
>Cag_2023 Aspartate kinase region
MAVMKFGGTSVGNARAMQQAIAIVANKEKSGAPLVVLSACSGITNKLIHI
ADAAGRSALAEAMVLAAEVRAFHLALAAELVTTPERLHTLTTSITELVDR
LEMLIKGVDIVGELTERSKDMFCSFGELFSTTIFAAAMQERGHNAAWVDV
RTVMITDDNFGFARPLDSVCEANALSIIRPLLEQGTTVVTQGYIGATRDG
RTTTLGRGGSDFSAALLGAWLDDSVIEIWTDVDGVMTCDPRLVPDARSIR
VMTFTEAAELAYLGAKVLHPDTIAPAVQKNIPVYVLNSIHPEAKGTLITN
DSEHLSGMSYEGLVKSIAVKKNQCIINVRSNRMMGRHGFMSELFELFAHY
GVSVEMISTSEVSVSLTVDDKCVTSELIQALGSLGDTEIEHNVATISVVG
DNLRMSRGVAGRIFSSLKEVNLRMISQGASEINVGFVVDEAEVATAVNTL
HKEFFSTPNDHAIFEKPAGSH
>Cag_0835 hypothetical protein
MKTEVKIALIGGTVTIIAAVLPIVLGWYEPNKQHGEAPPPPSQTAPAWGS
ATPVLSSSNSPASEPTDETSRLQALATIIRNRDVDAAKASPQFQCFVKEH
YGKELDAVSRRELPAFWVKMAKSLAAERRAAKDAATK
>Cag_0415 Molybdate ABC transporter, ATP-binding protein
MNLQLDIHKKLGNFSFEVQSLIQGERIGIFGASGSGKSTLVSLIAGLMQP
DKGQIVLDNTILYSSSKSINLPPEQRRIAIVFQHAMLFPHLSVKTNLLYG
FKRCKKKYQRIKPEALIELLELQPLLTRGVQHLSGGEKQRVALGRAILAN
PRLLIMDEPLSALDNKLRFQIIPYLKSVSEEFKIPYLFISHSTIEMRLMS
DLILTIENGHHTTTITPEQLARTSMADGINGYLNLLSLTNPRQDKRLLLF
DWGQNTLAIIGQVSNSTTLFELSSHDIILFKKHPGAISARNLLECTVKEI
FDGGGRVGVVLAIGTETLVAEIVCQAAHELEVSVGTKLCAAIKASSFRQL
V
>Cag_1657 beta-N-acetylglucosaminidase
MSSYHSLLFPFFSLKKKWSRKSRRLFMLFMVLTLSVGNHASARVGGESWR
AEQIFSKQSDELEEQLHAMSLADKVGQMIIADIEPTAFSPRNKKVLLLSR
LAQEGKIGGVIFMKGDAKSTGAVVNHLQALAPLPLLFSSDMERGVAMRIS
GTTEFPPNMALGATADPKLAEDMATAIAQEATLLGMHHNYAPTVDLNSNP
RNPVINTRAFGDTIPLTIVMANAIIKGLQSHGVLATAKHFPGHGNVTVDS
HVALPVLQATREQLEAYELIPFRAAIEQGVATIMVGHLAVPALTGNMEPA
TISPAIVTTLLRQELGFKGLIITDALNMKALYNGSNVATLSVRAVQAGND
LLLFSPDPEATHSAVVQAVEAGQIPLEQINASVRRILQAKQWLKLEKHRE
VDSEDIEEDANPASHRELARKIAEHAVTLVSDVERNVPLKKSEQLLHLIV
QDRVNYQTGRNYLRQLSERYPTITHLRINPKSDALDYAIATELAMNASSV
LVTSYVQSLSSNGELKLTAEQQNFLHLLPTVVQRGTPMVLLSLGTPYISN
YFPEFTSYLCTYSFDEESERAALQVLQGELTPRGVLPIVLGQ
>Cag_1824 DNA-directed RNA polymerase, alpha subunit
MIYQMQMPAKIELDESSHSDSFGKFIAQPLERGYGVTLGNLMRRVLLASL
PGTAITGIKIENVYHEFSTIQGVREDVPEIVLNLKKVRFRSQCKRSCKTT
VTLVGPMEFTAGVIQPQEGEFEVLNKDLHIATINAGTTVTLDIFIGRGRG
YVPAEENRAEGMPLGFIPIDSIFTPIRNVKFTVENTRVGQRTDYEKMILE
VETDGSITPDDSISLAGRVISDHVLLFADFSPAEEEYTEEEFKQQDDEFE
TMRRLLATKIEDLDLSVRSHNCLRLAEIDTLGELVSHKEDELLNYKNFGK
KSLTELKEQLDKFDLKFGMDITRYQMK
>Cag_0795 Prolipoprotein diacylglyceryl transferase
MSSFLHWWQILPFSMNPVIFSVGSFAVRWYGMMYIIAFAVVYLVVRYRLA
TEKLPFQTTFVGDALTWAMVGVVIGGRLGYIFFYGFADFLANPLQSFFPW
ICSPDGSCRFSGISGMSYHGGVLGVIGAMWLFTRSQQQNFFQVFDLFMPA
IPLGYTFGRLGNFINGELYGRVTEAAIGMYFPTAPTIALRHPSQLYEAFF
EGIVLFVVLWMLRKHSPFAGFLSALYLFGYGFVRFFIEFFREPDAHLGFV
FFSFSMGQVLCIAMMVAGIALFAVAKKLSNNTIKA
>Cag_1210 hypothetical protein
MIAISPTELKRNLYKYLEQAQSEQVIIQCKNAETYAIVPTGKTSETDRLF
LHQNIKDRLRHSLEQVKEGKTYQLTKAEINSFLGHYDDK
>Cag_0704 hypothetical protein
MFITNLQDTVCRQTQLEQYYAKDVLRDKHFVCRCFDKCRASHAGTYYEGQ
VHYVGSNYDILVGPQPLRVVVVGQEYGHGPALVDSLMRAKMFQDSAHKSR
GFLDRNPHMRGTTTALRILFGIEPGEDKAGEWLETSTGRIHLFDAFSLVN
FLMCSATDGSSKGKATSTMLSNCSKHFVKVLEILQPTVLVCQGKGFFTYL
AESLGVSKQQKEMLFHYRFNGVDGVGVCLNHPSTPRWDSGWAQLTQPYLT
SRVLPLLNDVRCELGLDQVIWNL
>Cag_0177 conserved hypothetical protein
MYWNLELARYIADAPWPVTKDELISYANRTGAPQQVIDNLEDLPDSDEMY
ESLDEVWPDYPTDEDFGYGDEDPLN
>Cag_1807 Histidinol-phosphate phosphatase
MSEQQRIKVLFLDRDGTINHDTGSYINSIEQFKLIERADEAIALAKQAGF
RIVIVSNQAGIARGIATFEQVDTVNSHLQSLLARKNATYDRCYYCPYHPN
YPHPEYDLLAGFRKPETGMVEQAVAEFAAEGLQVDRSASFFVGDKVLDVE
CGERAGLTSVLVRTGHNEEALCYERNIHPAFVADDLYQAITEFILPHSQE
>Cag_1715 Threonyl-tRNA synthetase, class IIa
MSDHKESTGAIALTLPDRSVRNVAMGSTGYDVALSIGRKLAQDALAIKLN
GVVCDLNTLINSDAAIEIITFTSPEGPEIFWHSSSHLMAQAIEELFAGSK
FGAGPAIEQGFYYDVSSEHRFREEDLRAIEARMLEISKRDSSVQRQEMSR
EEAIAFFTSVRNDPYKVEILTETLKNVERVSLYHQGDFTDLCTGPHLPST
GKIKAVLLTNISASYWRGDSNREQMQRIYGITFPSEKLLKEHVARIEEAK
RRDHRKLGAELELFLLSPEVGSGLPMWLPKGAIIRSELESFLREEQRKRG
YVPVYTPHIGNIELYKRSGHYPYYSDSQFPPLTYHDEDGKQEQYLLKPMN
CPHHHLIYSSKMRSYRDLPLRFTEFGTVYRHEQSGELNGLARARGFTQDD
SHIYCRPDQLVDEICSAIELTQFVFKTLGFAEVQTRLSLHDPANQAKYGG
TAEVWEQAEKDVQEAAERMGVDYFIGIGEASFYGPKIDFIVRDAIGRKWQ
LGTVQVDYVMPERFDLTYVGSDGQKHRPVVIHRAPFGSMERFIGLLIEHT
AGNFPLWLAPVQVAVLPIAEENHDYATTVYRRLLAAGIRAELDTRSEKIN
RKIRDAEMSKTPCMLVIGQKEQANGEVSLRRHRQGDAGRFATDELIETLK
QEIANRQ
>Cag_1617 conserved hypothetical protein
MKIVQFVAMLLLFAGVNGAVVYRLWRLMPPLRWFRGSVTLLLLAVIAAPF
VVMAWGNALPLPLVSLLYMVGLSWLILLIYLILLFLLIDGLSLLGFFGVQ
RLKSLKLWTHESWVGTVGVVVLMAVLAVYGNYNYHQKERVELTMRVEKAM
AQPLRIVVVSDLHLGYSIGREELERWVVLINREEPDVVLLVGDVIDTSLR
PLEVERMAEVLRRLSSRYGVYAVAGNHEHYATLAKSAPFFSDAGIRLLRD
EVLLIDNRCYFVGRDDYMNKQRKPLSVLLSGVDVAKPIVLLDHQPRALGE
ARAAGADIHFAGHTHRGQIWPISLLVEQMYEQAYGYRRFGAMQSYVSSGL
GIWGGKFRIGTKSEYVVVTLQGR
>Cag_0968 hypothetical protein
MGAAWSGAFSFNHYKRTFVMQLTGKLIAILPEQTGAGKNGPWKKQDIVLE
TSGQYPKKVCVSFWGDKLDRQMLQLGTMLSISFDVESREYNGKWYTDVKG
WKAEVAGRAESAPYGGGDDAGSWEPPAFEPTSSNEECPF
>Cag_0672 pyridoxal phosphate-dependent enzyme
MILMNDFKAEPPELREAMLGAAQRVIESGWYVLGNEVVSFEKQWAAICGV
DYGIGVGNGMDAIEIALCSLSIGVGDEVITTPMTAFATVLAILRSGAIPV
LADIDSDTGLLSIESVRRCISKKTKAILLVHLYGQVRDMDKWTALCKATD
LYLVEDCAQAHYAQWQGNVAGSFGIAGAYSFYPTKNLGAIGDAGMLITND
ADIADKAKRLRNYGQSTRYYHPELGMNSRLDELHAAMLSERVKWLHSFTE
RRWQIAEYYREHIDNPLINLLSAPEERTAHVYHLYVVTTAYRDALQVYLQ
ENQIQALIHYPIPVHFQDPCKNILRDPKGLAKSEYHAAQCLSLPCHPQMS
DADIEHVANTVNSFKVS
>Cag_1566 hydrogenase/sulfur reductase, beta subunit
MIYKVIAKKEFYAFVDALVKNNKAFAPQQVAIDSHSHPIYQFKSVHAVDA
ISFDYTATSSSAKHFFLPFREELSHFTFHGKEWDQHIAYETNPIVLIGLR
ACDISALNILDEVLLHSHFPSPYYLARRKNSFVIGMDHLPLPDCFCKSMN
HHVVTSGFNLFCSDIGDDYYLAINSSKAFNYLKSFETREPTYDDNCRLTE
RRKLIQRSFQTEIDVTALPSILDIEFDSPVWEKWGSKCLNCGTCAMVCPS
CYCYALSETFDLDLQGAKREKQLYSCNLVDFAAVAGGHNFRPKNGDRLKY
RYYHQHRGFAENGNQQICVGCNRCGRACLSGINPKDVINDLRMEKESCMI
CVSSSPSKI
>Cag_1550 glutathione-regulated potassium-efflux system protein KefC, putative
MHDFSFLGELVLIGAAAIAIILIFQRLKIPSVIGLIFTGILIGPSGIGAV
YDQKMISTLAELGVVLLLFTIGLEFSLEELKKLRKVVLVGGVAQIIATTL
VISALASWLMPIMGEPISVPTAFFLGSTFSVSSTAICLKILAEREELSLP
YGRIALGILIFQDLAIVPLMLGINMIAPGASSSFYAMMKEIGLVIVLAVA
IVLGFRLLMPKLVRFIVSLKAREVLVLGALVICFGAAYLTSLAGLSLALG
SFVAGMVIASTDESHQISVIIDPFREAFSSLFFISVGLLLDVNMENLPLF
IAIAVVVLLVKGLVVAGIGLFLGNSLRVSMMAGMALAQIGEFSFVLAETG
LHNGLISRDIFQAMLVIIVVTMIVTPAMIAVAPKVADQVVPAFGFIPLVG
RYTAPPAPTQAVRPPNSAIICAGEIHAAIIGFGVNGQNVAAVLHATNISY
SALEIDRYIVKTMRRKGEPIFYGDCTEKKSLLRAGIDHARAVVIGISDHT
AISKSIALIRELNEKAYIVVRARTLDTVGDFYRAGADVVVTEKFETSIQI
FSQLLNHFTVAPDLILEQQEIIRRECEKIFLK
>Cag_0676 teichuronic acid biosynthesis
MFTDNLISIITPVYNTYPFLFRLVQSVQMQKVNVEHIIIDDASTDHSYET
LIEYAKKYSNIRLIRLPLNRGPVVARNEGIKIAQGRFLAFLDADDLWLPN
KLEIQLSLMRKNNWSISFTDYRFISFDGTLVGKIVNGPNIVDRDLHFATR
SGMGCLTVMVDRQKFLNFSFLETDPITTRAEDFWAWAELLKTTVAHRVPY
DLARYTVVPGSRSSNPWYKAKVIWTIYREIEKMSFIKALLYYISFSISAT
KKRLLSTPRYKIEDIDGDKGKEWLDLVNLSSNKHS
>Cag_1850 50S ribosomal protein L4
MELKVLNTAGTETGEVVILRDDIFGIEISEHAMYLDVKSILANKRQGTHK
AKTRSEVRGGGKKPFRQKGTGNARQGSSRSPIHVGGGTIFGPQPHTYEQK
VNKKVKLLARRSALSAKAQAGKIVVVDDFRFDAIKTKPFADILKNLGLDA
KKTLLLMPEYDMVVNRSGRNIAKLEIMTADKASTYDILYSNTLLVQKSAL
KTIDETLG
>Cag_1526 Restriction endonuclease S subunits-like
MKKEALGKLVDIKTGKLDVNAGTEYGKYPFFTCAKTVYRINQYAFDNEAI
LVAGNGDLNVKYFKGKFNAYQRTYVIENKEVNLLSMKYLYYFMETYMIHL
RNGAIGGIIKYIKIDHLTKAEIPLPPLDDQKRIAHLLGKVERLIAQRKQH
LQQLDQLLKSVFLEMFGFFDKTYTNWTIDTLTSHTEIVSGITKGKKYKTD
ELIEVPYMRVANVQDEHFVLDEIKTISVTKNEIKQYRLLAGDLLLTEGGD
PDKLGRGAVWQNQIENCIHQNHIFRVRVNDKSRINPDYLSALIGSPYGKS
YFFRSAKQTTGIASINSTQLKKFPIVIPPIELQNRFATIVEKVESIKTHY
QQSLNNLETLYNALSQKAFKGELDLSRVAVLVDVTP
>Cag_1410 Isocitrate dehydrogenase NADP-dependent, monomeric type
MAQNATIIYTKIDEAPALATYSLLPILQAYFKGTGVTIESSDISLAGRII
ANFPENLTESQRIPDNLSALGQLALTPEANIIKLPNISASIPQLQAAIKE
LQEHGYNIPNYPEAPANDAEKELQVRFAKVLGSAVNPVLREGNSDRRAPL
SVKNYAKKNPHKMSAWSADSASHIAHMEGGDFYGSEKSVTVADATNVKIE
FVGNDGSSKVLKEKTALLAGEVIDTSVMNVRSLRAFFDQQITDAKENGLL
LSLHLKATMMKISDPIMFGHAVTVYYKDVFEKHAATIAELGVNVNNGLGD
LYAKIEKLPADKKAAIEADIQAVYANRPPLAMVDSDKGITNLHVPNDIIV
DASMPVVIRDGGKMWGPDGKLHDTKAMVPDRCYATMYQAMVEDCKKNGAF
NPSTIGSVPNVGLMAQKAEEYGSHDKTFTAPANGVIRVVDASGKTLLEQN
VETGDIFRMCQTKDAPIRDWVKLAVNRARITGAPAIFWLDSNRAHDAQII
NKVNEYLKEHDTNGLDIRIMTPVDAMRFSLERFRAGEDTISVTGNVLRDY
LTDLFPIIELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQFQKEGYL
RWDSLGEFSALAASLDHLSRVFNNPKAAVLAETLDEAIGKFLDTNKSPAR
KVGQIDNRGSHYYLAMYWAEALAAQSKDAELQACFASVAKALAENEEKIN
AELIGAQGSPVDMGGYYHPNDELTSKAMRPSATLNAIIDAI
>Cag_0557 cobalamin biosynthesis protein CbiM, putative
MHMSNVLLSPLVGSAFLTISGALAGYSSGKMKQEKQPALTPLMGTLGAFV
FAAQMINVSIPATGSSGHLGGGLLLAALLGLHRAFLAMSLLLTLQSVLFA
DGGLLALGCNIFNMAFIPAFIAYPLIFKTLTSNNNSPNRLAVASIIAALA
GVLMGALAVTLQTAVSGISALPFTTFAAHMLPIHAAIGVAEGVITWAVLT
AVRLQATNLPLHPNAAANIGSVKRAFVMAALLIGGVGSWFASAEHDGLEW
SIARTFGSELPENSSALHNYAAAIQEQTALFPDYAVPFAPSTTTVAGVAI
SLESTIAALIGIGVMLLLTARKQAKG
>Cag_1651 Small GTP-binding protein domain
MKPLLALVGRPNVGKSTLFNRILRQRSAIVDPTPGVTRDRHIAEGEWQGK
QFKLMDTGGYNTDGDVLSKAMLEQTLHALADADSILFITDARAGLSYEDL
ELARILQRSFQHKQLFFVVNKVESPQLVIEAESFIKTGFTTPYFVSAKDG
SGVADLLDDVLEALPEAPEGEVKGDTAVHLAIVGRPNVGKSSFVNALLGT
NRHIVSNIPGTTRDAIDSRLMRNQQEYLLIDTAGLRKRTKIDAGIEYYSS
LRSERAIERCEVAIVMLDAEQGIEKQDLKIINMAIERKKGVLLLVNKWDL
IEKDSKTSIRYEEQLRMAMGNLSYVPVLFVSAMTKKNLYRALDTALQISR
NRSQNVSTSQLNKFLEQTLAQVHPATKSGRELKIKYMTQLKSAWPVFGFF
CNDPLLVQSNFRKFLENKLREAYNFEGVPISLRFLHKNKVKED
>Cag_1754 aldehyde dehydrogenase
MRFISTPFVTIPILLSVFQSISIPQQLATLRETFQHGQTQQLYWRHAQLQ
ALRAFLVEREAAIAEALRADLRKSSAESFLYENKVVQGEIHYALRHLTAW
TTLRRPKVPLLYQPAKAQVVREPYGVALIMGAWNYPLQLCLAPLVSALAA
GNCAIIKPSEHAPHTSALLAQELRRYLDANAVVVVEGAVDVAKALLAERF
DVIFYTGSYAVGREVMAAAAHHVTPLTLELGGKCPCLVEQTSNYQIVARR
IVWAKFLNAGQTCLAPDYVLVHEHEEEALLQALAAAIHHFYGSDPSQSPN
YSRIINRHHTERLAALLADGTIYTGGQAAIDDCYLAPTILSKVHPESALL
CEEIFGPILPIIIYRTLNEALAIMRTHSEPLAVYLFSDNRQVQAEVVHRS
RSGGVCINDVLMHAALHSMPFGGLGSSGFGAYHGKAGFDAFSYERSILHR
SLHPDPTLRYPPYHGWRYKLLRWATEHLGG
>Cag_1148 hypothetical protein
MRHQGASILLYNQQHEVLLVLRDNLPFIACPNTWDAPGGHLDAHETPLHC
IVREMMEEMELDVSTCSHFKSYEFSNRTEHIFTMQTDVLNTATTPLHEGQ
MIRWFTVADALQLSLASDMEVVLHDVGIWLEQQNNGTEDCGNV
>Cag_0925 Thymidylate kinase
MLITFEGIDGAGKSTQVAKLVAYLKQQSTPLLSLREPGGTPTAESIRSLL
LDNRKSITPIAELLLFSASRAELVETLIRPALAEGKTVILDRFFDSTTAY
QGHGRGISLEQLHSIIALSTGDLLPDVTFYLDLEPEEALRRAFSRSGLPL
DFAAQSNESDRMEQAGIAFYHKVRNGYLTLMQQHPSRFVALDATQPPDTL
HQHIIAEVETRLKLHISPLVAP
>Cag_0521 conserved hypothetical protein
MEDNMSKEKAMVLDEYESEILEAFENGKLKPVKSKTDFQAIARDTMKKKR
EINIRISENDLSALQRRAEKDGIPYQTLIGSVLHKFACGFLKEA
>Cag_1882 Transcription-repair coupling factor
MKTQLASEESSVAVVRHNPSWLFDILRQSAPYNQLTSLLSKSNAQQKGDI
LLPLAGLYGSFSSLLAATLFADATAPLMVVCSSNSFERYENDLEVLLPKG
SLCNSADELSHTIEALATKRRSLLLSLFDDLDVPLCSPSEVESRMFHVTL
DATIGYDALKHFLTANGFEQRDFVEDEGEFSLRGSIIDVYPFGAAEPLRI
ELFGDTITSLRLFDSNSQLSGKNLQQATLTANFTTPNSPITLLDYLAPET
VVLVDDVAELIAQSDGKELLERLCYFRCLSINHAEVQALNFGGEAQQKLQ
GNFRTLATLLHTAHHEARQPLFAMSSKREIGELNDFLAQESSQEALPQSG
WLPVTLHSGFRFGSLDLYTESDIFGKMHTHKVHRKRKVRGISLKELQKLK
VGDYVVHEDYGIGIFKSLETITAGNSEQESVLIEYANGDQLFVNVQNIHL
LSKYTASENSSPTLSKLGSSKWAAKKEKVRKKIRDIAINLIKVYAQRKMQ
PGFAFAPDSIFMREFEAAFIFDETPDQLRAINDVKKDMQASHPMDRLICG
DAGFGKTEIAMRSAFKAVESKKQAAVLTPTTILAHQHADSFTRRFANFPI
SIAVLSRFVSRKEQLSLLKKIEEGKIDIVIGTHRLVSKDVHFKDLGLLVI
DEEQHFGVEVKEKLREQFPTIDTLTMSATPIPRTLQFSMLGARDLSIVST
PPKNRQPVETIITDFDAALIQSAIQRELQREGQVFFLHNRIAGLETIAES
LRELVPSARIVYAHGQMPTRELEKIMMDFMQQEVDVLISTTIIGSGLDIS
NANTIIINRADLFGLSDLYQLRGRVGRSERKAYCYLITPPMKTLKKDALQ
RLAVIESFTELGSGFNIALRDLDIRGAGNLLGAEQSGYIHELGFDLYQKM
LEETVAELKTNEFSHMFEEEGNKPLRQQKPCDLLFFFDALIPDYYVAATQ
ERFAFYNRIAKATRNEQLDAIASELCDRFGKLPEEVTNLLMITKLKLIGT
LLGLEKIDIQPQSTMLYLPDQASEHVAQRHYLQYLFTAVQAEWMAEYKPG
FKMEKKMKLQLHHPTHADTTSAGLMERYSALLHKVYEEAKSEVEAAMVG
>Cag_1855 Ribosomal protein S7
MSKLRSYKKIGGDYRYGDESVARFINAVMLDGKKAVATKIVYDAFSIISE
RNNGEDALEIYRKAISNIAPVVEVRSKRVGGATYQIPMEVKPSRRSALAF
RWLKIYAGKRGGKSMAEKLAAELMDAANDQGASVKKRDEVHRMAEANKAF
AHFRF
>Cag_0859 oxaloacetate decarboxylase, alpha subunit
MKKIRFMDVSFRDGFQSCYGARVRTEDFLPVLEAAVAAGTDNFEIGGGAR
FQSLYFYCEEDAFEMMDRAREVVGPDINLQTLSRGANVVGLISQSRDIID
LHARLFKKHGISTIRNFDALMDVRNLAYSGQCIHNAGLKHQVVIALMGLA
PGLKETYCHTPQFYLDKLQEILDTGIPFDSVAFKDASGTTTPATVYETVK
GARKMLPSETVIEFHTHDTAGMGVACNFAAIEAGADIIDLSMAPVSGGTA
EVDILTMWQRLRGTDYTLDIDQEKYLEVESFFTEQMYKYYMPPEAIAVNP
IITFSPMPGGALTANTQMMRDNNTLHLFPEVINNMREVVAKGGFGASVTP
VSQFYFQQAFANTVQGAWKKITDGYGKMVLGYFGHTPSEPDEEVVRIASE
QLGLPPTKEDVHDINDRNPELGVAYNRALLEKEGLPTTEENIFIAATCGV
KGIDFLKGNCSNGIRYKADVDLEIKAKPKEEVVAAYHEAHKHDVHQTEVN
HRLTELFSKTAPAAAAPVAPATSAAKVISMAGSFSIFVDGTPYSVTFAEG
SSINPQHVAPATPMVPMAQAAPVAAPSPPVGTPVPAVMPGNIFKLDVKVG
DEVREGDEVAVLEAMKMENPVKAPCSGKVLSIAVAKGDTVGMGQPIMYIG
>Cag_1230 Molybdenum-pterin binding protein
MEQMEEQYGGQIEKQKGDAIGLEGDVWFQKAESCFLGGDRIALLEKIAAL
GSITSAAKAVGISYKTAWQLVDMMNNLASRPLVERTTGGKGGGGTIVTNE
GRKVIEQFRVVQEEHRHFLQQLEVRLGESQNVCQLLKRIAMRISARNIFA
GTVEHLTKGAVNAEVVLRLTGGQRIVSIITNTSADNLGLHEGMSAYAIIK
ASSILIGQEISPASLSASNILQGTITRVVEGAVNSEVDVAIGGGNSISAI
VTQSSLQHLALCEGAQVSAIFKASSVIVGTQ
>Cag_0345 dolichol-phosphate mannosyltransferase
MFEQLKFSREIVASMLYRVEGIFTNGRAAALIIIPTYNESDNIRRLLEEL
TCCYAGIADILVIDDNSPDGTADCVRALQNTKGSLALLVRDAKLGLGTAY
ITGFSYALQHGYQYVIEMDADYSHDPASVVDLLTASSSADLVIGSRYVNN
TVNVVNWPLSRLILSKMASLYTRLITGLPIADPTSGFKCFRAEVLRSIAF
EHVQSQGYSFQIEMNVRAWKKGFVLKEIPIVFVDRTVGKSKMSRNNIREA
IWMVWWLKVQALLGRL