The PDB is a good resource for answering such questions, since it lets you filter results by many additional parameters. To count and extract 3D structures of human proteins:
- Open the Advanced search tab of the PDB website.
- Select Biology -> Source organism from the menu.
- Type Homo sapiens (human).
- You can reduce redundancy by checking Remove Similar Sequences at n% identity below.
- Submit the query.
To add further filters, click Refine Query with Advanced Search and check Add Search Criteria. There you can select structures by deposition date, quality (e.g. resolution or R-factors for structures solved by X-ray diffraction), ligands, enzyme classification, etc.
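The same filters can also be expressed as a programmatic query. The sketch below only builds a JSON payload and does not send it. It assumes the RCSB Search API endpoint and the attribute and operator names shown in the code; none of these appear in the steps above, so check them against the current API documentation before use. The helper names `terminal` and `build_query` are made up for this example.

```python
import json

# Assumed endpoint of the RCSB PDB Search API (not mentioned in the post).
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def terminal(attribute, operator, value):
    """Build one attribute filter node (hypothetical helper)."""
    return {
        "type": "terminal",
        "service": "text",
        "parameters": {"attribute": attribute, "operator": operator, "value": value},
    }


def build_query(organism="Homo sapiens", max_resolution=None):
    """Combine the organism filter with an optional X-ray resolution filter."""
    # Attribute names below are assumptions about the search schema.
    nodes = [
        terminal("rcsb_entity_source_organism.ncbi_scientific_name",
                 "exact_match", organism)
    ]
    if max_resolution is not None:
        nodes.append(terminal("exptl.method", "exact_match", "X-RAY DIFFRACTION"))
        nodes.append(terminal("rcsb_entry_info.resolution_combined",
                              "less", max_resolution))
    return {
        "query": {"type": "group", "logical_operator": "and", "nodes": nodes},
        "return_type": "entry",
    }


payload = build_query(max_resolution=2.5)
print(json.dumps(payload, indent=2))
```

The payload would be POSTed to `SEARCH_URL` as JSON. Redundancy removal is not covered by this sketch.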
A search for human proteins with homologues removed at a 90% identity cutoff returns 7117 structures. The number of good-quality X-ray protein structures (resolution < 2.5 Å) is currently 3964 (with the same identity cutoff).
You can then download the fetched list or create custom reports (menus below).
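Once a report is downloaded, it can be filtered further offline. The following is a minimal sketch, assuming the report is a CSV file with columns named `PDB ID` and `Resolution`. These column names, the function `filter_by_resolution`, and the sample rows (placeholder IDs, not real entries) are all assumptions for illustration.

```python
import csv
import io


def filter_by_resolution(report_text, max_resolution,
                         id_column="PDB ID", res_column="Resolution"):
    """Return IDs of rows whose resolution is below max_resolution."""
    kept = []
    for row in csv.DictReader(io.StringIO(report_text)):
        value = (row.get(res_column) or "").strip()
        if not value:
            # No resolution reported (e.g. an NMR structure); skip the row.
            continue
        if float(value) < max_resolution:
            kept.append(row[id_column])
    return kept


# Placeholder rows, not real PDB entries.
sample = "PDB ID,Resolution\nAAAA,1.80\nBBBB,3.10\nCCCC,\nDDDD,2.49\n"
print(filter_by_resolution(sample, 2.5))  # ['AAAA', 'DDDD']
```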
A good tool (also used by PDB) for generating non-redundant protein datasets is cd-hit.
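To illustrate what an identity cutoff does, here is a toy greedy filter: sequences are visited longest first, and each one is kept as a representative only if it falls below the cutoff against every representative kept so far. This is a sketch of the general idea, not cd-hit's actual implementation. The identity measure, built on Python's `difflib`, is a crude stand-in for a real alignment, and the sequences are invented.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Crude identity: matched characters divided by the shorter length."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_representatives(sequences, cutoff=0.9):
    """Keep a sequence only if it is below the cutoff to all kept ones."""
    representatives = []
    # Longest sequences are considered first.
    ordered = sorted(sequences.items(), key=lambda item: len(item[1]), reverse=True)
    for name, seq in ordered:
        if all(identity(seq, sequences[rep]) < cutoff for rep in representatives):
            representatives.append(name)
    return representatives


# Invented sequences for illustration only.
toy = {
    "seq_a": "MKTAYIAKQRQISFVKSHFSRQ",
    "seq_b": "MKTAYIAKQRQISFVKSHFSRA",  # differs from seq_a in one residue
    "seq_c": "GSHMLEDPVDAFKLNWQTRGYC",  # unrelated to seq_a
}
print(greedy_representatives(toy))  # ['seq_a', 'seq_c']
```

For real datasets, use cd-hit itself. Its alignment and speed heuristics are far more reliable than this toy measure.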