Biochemical Information

Binding Database
The BindingDB is a public database of measured binding affinities for biomolecules, genetically or chemically modified biomolecules, and synthetic compounds. The database currently contains data generated by isothermal titration calorimetry, enzyme inhibition, and receptor-ligand binding methods.

BioCyc Open Chemical Database
"The BioCyc Open Chemical Database (BOCD) is a collection of chemical compound data from the BioCyc databases. Most of the compounds act as substrates in enzyme-catalyzed metabolic reactions, but some compounds serve as enzyme activators, inhibitors, or cofactors. Chemical structures are provided for the majority of compounds."

Biological Macromolecule Crystallization Database (BMCD)
The Biological Macromolecule Crystallization Database (BMCD) contains crystal data and crystallization conditions compiled from the literature. The current version of the BMCD includes 5247 crystal entries from macromolecules for which diffraction-quality crystals have been obtained. These include proteins, protein:protein complexes, nucleic acids, nucleic acid:nucleic acid complexes, protein:nucleic acid complexes, and viruses.

BRENDA: Comprehensive Enzyme Information System
Free database containing information on over 3500 enzymes: nomenclature, EC and registry numbers, reaction and specificity, inhibitors, structure, isolation, literature references, etc. (Cologne University Bioinformatics Center)

Carcinogenic Potency Database
The Carcinogenic Potency Database (CPDB) is a unique and widely used international resource of results from 6153 chronic, long-term animal cancer tests on 1485 chemicals. CPDB provides a standardized and easily accessible database with qualitative and quantitative analyses of both positive and negative experiments that have been published in the general literature through 1997 and by the National Cancer Institute/National Toxicology Program through 1998. (UC Berkeley)

ChEBI: Chemical Entities of Biological Interest
Dictionary of small molecular entities that are natural or synthetic products used to intervene in the processes of living organisms. Data are drawn from public sources such as IntEnz and KEGG LIGAND.

ChemBank "includes freely available data derived from small molecules and small-molecule screens, and resources for studying the data so that biological and medical insights can be gained. ChemBank is intended to guide chemists synthesizing novel compounds or libraries, to assist biologists searching for small molecules that perturb specific biological pathways, and to catalyze the process by which drug hunters discover new and effective medicines." Free registration required. (Harvard Medical School)

"ChEMBL is a database of bioactive drug-like small molecules, it contains 2-D structures, calculated properties (e.g. logP, Molecular Weight, Lipinski Parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data)." (European Bioinformatics Institute)
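
The "Lipinski Parameters" mentioned above usually refer to Lipinski's rule of five, a common drug-likeness heuristic. As a minimal sketch, assuming the four property values have already been obtained (for example, from a database record), the check can be written as follows. The function name and the example values are illustrative assumptions, not part of ChEMBL.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Return the rule-of-five criteria that a compound violates.

    Hypothetical helper for illustration; the inputs are assumed to be
    precomputed properties, not values fetched from any database here.
    """
    checks = {
        "molecular weight > 500": mol_weight > 500,
        "logP > 5": logp > 5,
        "H-bond donors > 5": h_donors > 5,
        "H-bond acceptors > 10": h_acceptors > 10,
    }
    # Keep only the names of the criteria that failed.
    return [name for name, failed in checks.items() if failed]


# Approximate, illustrative values for a small aspirin-like molecule.
print(lipinski_violations(180.16, 1.2, 1, 4))

# A made-up larger, more lipophilic compound that breaks two criteria.
print(lipinski_violations(650.0, 6.1, 2, 8))
```

A compound is commonly treated as drug-like under this heuristic when it violates no more than one of the four criteria.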

ChemDB is a suite of chemical datasets and learning tools created by UC Irvine. Includes a chemical search feature for about 4 million compounds from vendor catalogs.

Collection of links to cheminformatics programs and QSAR datasets (with structures) in about 90 cheminformatics and CADD categories, with some similarity searching tools.

DrugBank
"Bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information." The database contains over 6700 drug entries, including FDA-approved small molecule drugs, FDA-approved biotech (protein/peptide) drugs, nutraceuticals, and experimental drugs. Additionally, non-redundant protein (i.e. drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. Each DrugCard entry contains drug/chemical data and drug target or protein data. (Univ. of Alberta)

ENZYME is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided. (ExPASy)

Enzyme Explorer
Search by name, application, specificity or EC number. The Enzyme Explorer also features detailed sites on metabolic pathways, protein kinases, protease specificity and inhibition, and glycoprotein analysis as well as new cell signaling, analytical and diagnostic enzymes and reagents. (Sigma-Aldrich)

Enzyme Nomenclature
Subtitle: "Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes by the Reactions they Catalyse." Browse and search for enzyme names using EC numbers.
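
Because several of the resources on this page are searched by EC number, a short sketch of how an EC number is structured may help. An EC number has four levels (class, subclass, sub-subclass, serial number), and the first digit identifies one of seven top-level enzyme classes. The parser below is an illustrative assumption rather than code from any of the listed sites; the function name is hypothetical, and preliminary serial numbers such as "n1" are not handled.

```python
import re

# Top-level enzyme classes defined by the IUBMB nomenclature.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}

# Optional "EC" prefix, then four dot-separated fields; "-" marks an
# unspecified lower level (as in "1.1.1.-").
_EC_PATTERN = re.compile(r"^(?:EC\s*)?(\d+)\.(\d+|-)\.(\d+|-)\.(\d+|-)$")


def parse_ec_number(text):
    """Split an EC number into its four levels and name its class."""
    match = _EC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable EC number: {text!r}")
    first = int(match.group(1))
    if first not in EC_CLASSES:
        raise ValueError(f"unknown enzyme class: {first}")
    rest = [None if g == "-" else int(g) for g in match.groups()[1:]]
    return {"levels": [first] + rest, "class_name": EC_CLASSES[first]}


print(parse_ec_number("EC 3.4.21.4"))  # trypsin
print(parse_ec_number("1.1.1.-"))      # partially specified entry
```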

ExPASy Proteomics Server
The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics is dedicated to the analysis of protein sequences and structures. It provides access to a number of databases and tools useful for the biological chemist, including UniProtKB and ENZYME, plus links to many others.

GenBank
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.

Genome Bioinformatics
Contains the reference sequence and working draft assemblies for a large collection of genomes. (UC Santa Cruz)

GenomeNet is a Japanese network of database and computational services for genome research and related research areas in biomedical sciences, operated by the Kyoto University Bioinformatics Center. It includes KEGG (Kyoto Encyclopedia of Genes and Genomes), and other biochemical tools such as KEGG LIGAND.

"The goal of the International HapMap Project is to develop a haplotype map of the human genome, the HapMap, which will describe the common patterns of human DNA sequence variation. The HapMap is expected to be a key resource for researchers to use to find genes affecting health, disease, and responses to drugs and environmental factors." (NIH)

Human Metabolome Database
"The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. It is intended to be used for applications in metabolomics, clinical chemistry, biomarker discovery and general education. The database is designed to contain or link three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data."

An NIH-funded consortium has developed a Lipid Metabolites and Pathways Strategy, termed LIPID MAPS, that applies a global, integrated approach to the study of lipidomics.

National Center for Biotechnology Information (NCBI)
Parent site of the major NLM databases, including PubMed, GenBank, Nucleotide and Protein Sequences, Protein Structures, Complete Genomes, Taxonomy, PubChem, and others. (NIH)

Nucleic Acid Database
Databases of 3D structural information on nucleic acids. (Rutgers University)

OSIRIS Property Explorer
Calculates drug-relevant properties (cLogP, solubility, MW) on the fly from a valid structure. Prediction results are valued and color coded. (Actelion)

Protein Data Bank
The single international repository for the processing and distribution of 3D structure data of biological macromolecules determined by X-ray crystallography and NMR. The Research Collaboratory for Structural Bioinformatics is a non-profit collaboration between Rutgers, SDSC and NIST.

Proteopedia
A wiki site that aims to collect, organize, and disseminate structural and functional knowledge about proteins, RNA, DNA, and other macromolecules, and their assemblies and interactions with small molecules.

PubChem
NIH cheminformatics database on the biological activities of small molecules, primarily of interest to pharmaceutical and chemical genomics researchers. PubChem includes substance information, compound structures, and BioActivity data in three primary databases. The Substance/Compound database, where possible, provides links to BioAssay descriptions, literature, references, and assay data points. PubChem is integrated with Entrez, NCBI's primary search engine, and also provides compound neighboring, sub/superstructure, structure similarity, BioActivity data, and other search features. Links from PubChem's chemical structure records to other Entrez databases provide information on biological properties. These include links to PubMed scientific literature and NCBI's protein 3D structure resource. Links to depositor web sites provide further information. (NIH does not take responsibility for the accuracy of the deposited data, and errors can be rapidly propagated across multiple online databases.)

SCOP - Structural Classification of Proteins
The SCOP database aims to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known, including all entries in the Protein Data Bank (PDB). It is available as a set of tightly linked hypertext documents which make the large database comprehensible and accessible. In addition, the hypertext pages offer many representations of proteins, including links to PDB entries, sequences, references, images and interactive display systems.

STITCH: Chemical-Protein Interactions
"STITCH is a resource to explore known and predicted interactions of chemicals and proteins. Chemicals are linked to other chemicals and proteins by evidence derived from experiments, databases and the literature. STITCH contains interactions for over 74,000 small molecules and over 2.5 million proteins in 630 organisms."

UniProt (Universal Protein Resource) is a central repository of protein sequence and function data, merging the information contained in Swiss-Prot, TrEMBL, and PIR. The UniProt Knowledgebase (UniProtKB) is the central access point for curated protein information, including function, classification, and cross-reference. The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. The UniProt Archive (UniParc) is a repository reflecting the history of all protein sequences.

ZINC
Database of commercially available compounds for virtual screening. ZINC contains over 8 million compounds in ready-to-dock, 3D formats. (UC San Francisco)