Downloads/STRING: functional protein association networks

STRING 9.1

Download Area

STRING uses a relational database system (PostgreSQL) to store primary data and precomputed predictions. For convenience, we provide selected data items as flatfiles below.
Please note: the complete STRING dataset is also available, but it requires signing a license agreement (free for academics, see here for details).

Files that do not require a separate license agreement are published under a Creative Commons Attribution 3.0 License or a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. For commercial use or customized versions, please contact biobyte solutions GmbH.

Protein mode (flatfiles)

File | Description | Access
protein.sequences.v9.1.fa.gz (1.2 GB) | sequences of all proteins in STRING |
protein.links.v9.1.txt.gz (4.1 GB) | protein network data (scored links between proteins) |
protein.links.detailed.v9.1.txt.gz (6.1 GB) | protein network data (incl. subscores per channel); commercial entities require a license |
protein.actions.v9.1.txt.gz (728.3 MB) | interaction types for protein links |
protein.actions.detailed.v9.1.txt.gz (835.8 MB) | interaction types for protein links (incl. subscores per type); commercial entities require a license |
protein.links.full.v9.1.txt.gz (6.6 GB) | protein network data (incl. distinction: direct vs. interologs); all users require a license | license required

COG mode (flatfiles)

File | Description | Access
COG.mappings.v9.1.txt.gz (83.7 MB) | orthologous groups (COGs, NOGs, KOGs, ...) and their proteins |
protein.sequences.v9.1.fa.gz (1.2 GB) | sequences of all proteins in STRING (can be used as a BLAST db) |
species.mappings.v9.1.txt.gz (11.2 MB) | presence / absence of orthologous groups in species |
COG.links.v9.1.txt.gz (131.6 MB) | association scores between orthologous groups |
COG.links.detailed.v9.1.txt.gz (203.4 MB) | association scores (incl. subscores per channel); commercial entities require a license |

General flatfiles & full database dumps

File | Description | Access
species.v9.1.txt (77.9 KB) | organisms in STRING |
species.tree.v9.1.txt (26.4 KB) | STRING tree of species |
database.schema.v9.1.pdf (119 KB) | STRING database schema |
protein.aliases.v9.1.txt.gz (445 MB) | aliases for STRING proteins: locus names, accessions, descriptions... |
mapping_files (FTP directory) | separate identifier mapping files, for several frequently used name_spaces... |
items_schema.v9.1.sql.gz (empty) | full database, part I: the players (proteins, species, COGs, ...) | license required
network_schema.v9.1.sql.gz (empty) | full database, part II: the networks (nodes, edges, scores, ...) | license required
evidence_schema.v9.1.sql.gz (empty) | full database, part III: interaction evidence (datasets, abstracts, predictions, ...) | license required
homology_schema.v9.1.sql.gz (empty) | full database, part IV: homology data (all-against-all BLAST searches) | license required

Please note: STRING is updated periodically, so check back on this page whenever you need the latest associations.
Each protein identifier in the above files consists of two parts separated by a dot: 'NNNNN.aaaaaa'. The first part is the NCBI taxonomy species identifier, and the second part is the RefSeq/Ensembl identifier of the protein.
Please note that some of the files are very large. You may experience problems downloading them, depending on your browser and/or operating system.
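
Because every protein identifier starts with the NCBI taxonomy identifier, a downloaded network file can be cut down to a single organism locally. The following Python sketch illustrates the idea; it is not an official STRING tool. It assumes that each data line of a links file is whitespace-separated and carries two protein identifiers in its first two fields, which should be checked against the actual file. The names StringId, parse_string_id, filter_links_by_taxon and filter_gzip_file are hypothetical helpers introduced for this example.

```python
import gzip
from typing import Iterable, Iterator, NamedTuple


class StringId(NamedTuple):
    taxon_id: int    # NCBI taxonomy species identifier
    protein_id: str  # RefSeq/Ensembl identifier of the protein


def parse_string_id(identifier: str) -> StringId:
    """Split 'NNNNN.aaaaaa' at the first dot only.

    The protein part may itself contain dots, so only the first one
    is treated as the separator.
    """
    taxon, sep, protein = identifier.partition(".")
    if not sep or not protein or not (taxon.isascii() and taxon.isdigit()):
        raise ValueError(f"not a STRING protein identifier: {identifier!r}")
    return StringId(int(taxon), protein)


def filter_links_by_taxon(lines: Iterable[str], taxon_id: int) -> Iterator[str]:
    """Yield the lines whose first two fields both belong to one species.

    ASSUMPTION: fields are whitespace-separated and the first two are
    protein identifiers. Lines that do not fit (for example a header
    line) are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            first = parse_string_id(fields[0])
            second = parse_string_id(fields[1])
        except ValueError:
            continue
        if first.taxon_id == taxon_id and second.taxon_id == taxon_id:
            yield line


def filter_gzip_file(path: str, taxon_id: int) -> Iterator[str]:
    """Stream a gzipped flatfile line by line without unpacking it to disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from filter_links_by_taxon(handle, taxon_id)
```

For example, iterating over filter_gzip_file("protein.links.v9.1.txt.gz", 9606) would yield only the lines in which both proteins carry the taxonomy identifier 9606 (human). Reading the compressed file as a stream keeps memory use low, which matters given the file sizes listed above.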