API Documentation

RCSB PDB Search API

class rcsbsearchapi.Attr(attribute: str, type: Optional[str] = 'text')

A search attribute, e.g. “rcsb_entry_container_identifiers.entry_id”

Terminals can be constructed from Attr objects using either a functional syntax, which mirrors the API operators, or Python operators.

Rather than their normal bool return values, operators return Terminals.
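The idea of comparison operators returning query nodes instead of booleans can be sketched with a toy class. This is a minimal illustration only: ToyAttr, ToyTerminal, and the attribute name "example.numeric_attribute" are hypothetical stand-ins, not the library's classes or schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToyTerminal:
    """Hypothetical stand-in for a Terminal: one attribute comparison."""
    attribute: str
    operator: str
    value: Any


class ToyAttr:
    """Hypothetical stand-in for Attr; not the library's class."""

    def __init__(self, attribute: str) -> None:
        self.attribute = attribute

    # Functional syntax: methods named after the API operators.
    def greater(self, value: Any) -> ToyTerminal:
        return ToyTerminal(self.attribute, "greater", value)

    def exact_match(self, value: str) -> ToyTerminal:
        return ToyTerminal(self.attribute, "exact_match", value)

    # Operator syntax: each operator returns a ToyTerminal, not a bool.
    def __gt__(self, value: Any) -> ToyTerminal:
        return self.greater(value)

    def __eq__(self, value: Any) -> ToyTerminal:  # type: ignore[override]
        return self.exact_match(value)

    # Defining __eq__ removes the default hash; restore identity hashing.
    __hash__ = object.__hash__


# "example.numeric_attribute" is a made-up name for illustration.
attr = ToyAttr("example.numeric_attribute")
via_method = attr.greater(2.0)
via_operator = attr > 2.0
```

Both spellings build the same node here, which is the property the functional and operator syntaxes are meant to share.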

Pre-instantiated attributes are available from the rcsbsearchapi.rcsb_attributes object. These are generally easier to use than constructing Attr objects by hand. A complete list of valid attributes is available in the schema.

The dictionary passed to range() requires the following keys:

  • “from” -> int

  • “to” -> int

  • “include_lower” -> bool

  • “include_upper” -> bool
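A dictionary with these four keys can be assembled with a small helper. The sketch below is an assumption-laden illustration: make_range and missing_range_keys are hypothetical helpers, and the defaults (lower bound included, upper bound excluded) are chosen only to mirror a half-open interval.

```python
from typing import Any, Dict, Set

REQUIRED_RANGE_KEYS = {"from", "to", "include_lower", "include_upper"}


def make_range(lower: int, upper: int,
               include_lower: bool = True,
               include_upper: bool = False) -> Dict[str, Any]:
    """Build a range dictionary containing the four required keys."""
    return {
        "from": lower,
        "to": upper,
        "include_lower": include_lower,
        "include_upper": include_upper,
    }


def missing_range_keys(value: Dict[str, Any]) -> Set[str]:
    """Return the required keys that are absent from value."""
    return REQUIRED_RANGE_KEYS - set(value)


half_open = make_range(1, 5)
```

Checking for missing keys before building a query is a cheap way to catch a malformed dictionary early.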

__contains__(value: Union[str, List[str], rcsbsearchapi.search.Value[str], rcsbsearchapi.search.Value[List[str]]]) -> rcsbsearchapi.search.Terminal

Maps to contains_words or contains_phrase depending on the value passed.

  • "value" in attr maps to attr.contains_phrase("value") for simple values.

  • ["value"] in attr maps to attr.contains_words(["value"]) for lists and tuples.
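The dispatch rule in the two bullets above can be sketched as a type check. ToyAttr here is a hypothetical stand-in that returns plain tuples rather than Terminals, "struct.title" is used only as an example name, and joining a word list with spaces is an assumption of this sketch.

```python
from typing import List, Tuple, Union


class ToyAttr:
    """Hypothetical stand-in for Attr; returns tuples, not Terminals."""

    def __init__(self, attribute: str) -> None:
        self.attribute = attribute

    def contains_phrase(self, value: str) -> Tuple[str, str, str]:
        return (self.attribute, "contains_phrase", value)

    def contains_words(self, value: Union[str, List[str]]) -> Tuple[str, str, str]:
        # Assumption: a list of words is joined into one whitespace-separated string.
        words = value if isinstance(value, str) else " ".join(value)
        return (self.attribute, "contains_words", words)

    def __contains__(self, value):
        # Lists and tuples go to contains_words; simple values to contains_phrase.
        if isinstance(value, (list, tuple)):
            return self.contains_words(list(value))
        return self.contains_phrase(value)


title = ToyAttr("struct.title")
phrase_query = title.__contains__("heat shock")
words_query = title.__contains__(["heat", "shock"])
```

In plain Python, the expression `x in y` truth-tests whatever `__contains__` returns, so this sketch calls `__contains__` directly to keep the returned object.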

__eq__(value: Attr) -> bool
__eq__(value: Union[str, int, float, datetime.date, Value[str], Value[int], Value[float], Value[date]]) -> rcsbsearchapi.search.Terminal

Return self==value.

__ge__(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.Terminal

Return self>=value.

__gt__(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.Terminal

Return self>value.

__le__(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.Terminal

Return self<=value.

__lt__(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.Terminal

Return self<value.

__ne__(value: Attr) -> bool
__ne__(value: Union[str, int, float, datetime.date, Value[str], Value[int], Value[float], Value[date]]) -> rcsbsearchapi.search.Terminal

Return self!=value.

__weakref__

list of weak references to the object (if defined)

contains_phrase(value: Union[str, rcsbsearchapi.search.Value[str]]) -> rcsbsearchapi.search.AttributeQuery

Match an exact phrase

contains_words(value: Union[str, rcsbsearchapi.search.Value[str], List[str], rcsbsearchapi.search.Value[List[str]]]) -> rcsbsearchapi.search.AttributeQuery

Match any word within the string.

Words are split at whitespace. All results which match any word are returned, with results matching more words sorted first.
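The "match any word, more matches first" behavior can be illustrated locally with a toy ranking function. This is not how the search service is implemented; rank_by_word_matches, the sample documents, and the case-insensitive comparison are all assumptions made for the sketch.

```python
from typing import Dict, List


def rank_by_word_matches(query: str, documents: Dict[str, str]) -> List[str]:
    """Return ids of documents matching any query word, most matches first."""
    words = set(query.lower().split())  # split the query at whitespace
    scores: Dict[str, int] = {}
    for doc_id, text in documents.items():
        matched = words & set(text.lower().split())
        if matched:  # keep documents that match at least one word
            scores[doc_id] = len(matched)
    # Sort by descending match count; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


docs = {
    "A": "heat shock protein",
    "B": "shock absorber",
    "C": "kinase domain",
}
ranked = rank_by_word_matches("heat shock", docs)
```

Document "A" matches both words and sorts ahead of "B", which matches one; "C" matches none and is dropped.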

equals(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.AttributeQuery

Attribute == value

exact_match(value: Union[str, rcsbsearchapi.search.Value[str]]) -> rcsbsearchapi.search.AttributeQuery

Exact match with the value

exists() -> rcsbsearchapi.search.AttributeQuery

Attribute is defined for the structure

greater(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.AttributeQuery

Attribute > value

greater_or_equal(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.AttributeQuery

Attribute >= value

in_(value: Union[List[str], List[int], List[float], List[datetime.date], Tuple[str, ], Tuple[int, ], Tuple[float, ], Tuple[datetime.date, ], rcsbsearchapi.search.Value[List[str]], rcsbsearchapi.search.Value[List[int]], rcsbsearchapi.search.Value[List[float]], rcsbsearchapi.search.Value[List[datetime.date]], rcsbsearchapi.search.Value[Tuple[str, ]], rcsbsearchapi.search.Value[Tuple[int, ]], rcsbsearchapi.search.Value[Tuple[float, ]], rcsbsearchapi.search.Value[Tuple[datetime.date, ]]]) -> rcsbsearchapi.search.AttributeQuery

Attribute is contained in the list of values

less(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.AttributeQuery

Attribute < value

less_or_equal(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) -> rcsbsearchapi.search.AttributeQuery

Attribute <= value

range(value: Dict[str, Any]) -> rcsbsearchapi.search.AttributeQuery

Attribute is within the specified half-open range

Parameters

value – lower and upper bounds [a, b)

class rcsbsearchapi.Group(operator: typing_extensions.Literal[and, or], nodes: Iterable[rcsbsearchapi.search.Query] = ())

AND and OR combinations of queries

__and__(other: rcsbsearchapi.search.Query) -> rcsbsearchapi.search.Query

Intersection: a & b

__invert__()

Negation: ~a

__or__(other: rcsbsearchapi.search.Query) -> rcsbsearchapi.search.Query

Union: a | b

_assign_ids(node_id=0) -> Tuple[rcsbsearchapi.search.Query, int]

Assign node_ids sequentially for all terminal nodes

This is a helper for the Query.assign_ids() method

Parameters

node_id – Id to assign to the first leaf of this query

Returns

A tuple (query, node_id), where query is the modified query with node_ids assigned and node_id is the next available node_id

Return type

Tuple[rcsbsearchapi.search.Query, int]

to_dict()

Get dictionary representing this query

class rcsbsearchapi.Query

Base class for all types of queries.

Queries can be combined using set operators:

  • q1 & q2: Intersection (AND)

  • q1 | q2: Union (OR)

  • ~q1: Negation (NOT)

  • q1 - q2: Difference (implemented as q1 & ~q2)

  • q1 ^ q2: Symmetric difference (XOR, implemented as (q1 & ~q2) | (~q1 & q2))

Note that only AND, OR, and negation of terminals are directly supported by the API, so other operations may be slower.
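The way difference and symmetric difference reduce to AND, OR, and NOT can be checked with a toy model in which each query is just the set of identifiers it matches. ToyQuery, UNIVERSE, and the identifiers are hypothetical; the real classes build query trees rather than evaluating sets.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical universe of identifiers that negation is taken against.
UNIVERSE: FrozenSet[str] = frozenset({"id1", "id2", "id3", "id4"})


@dataclass(frozen=True)
class ToyQuery:
    """Hypothetical query modeled as the set of ids it matches."""
    matches: FrozenSet[str]

    def __and__(self, other: "ToyQuery") -> "ToyQuery":
        return ToyQuery(self.matches & other.matches)

    def __or__(self, other: "ToyQuery") -> "ToyQuery":
        return ToyQuery(self.matches | other.matches)

    def __invert__(self) -> "ToyQuery":
        return ToyQuery(UNIVERSE - self.matches)

    # Derived operators, written only in terms of &, | and ~.
    def __sub__(self, other: "ToyQuery") -> "ToyQuery":
        return self & ~other

    def __xor__(self, other: "ToyQuery") -> "ToyQuery":
        return (self & ~other) | (~self & other)


q1 = ToyQuery(frozenset({"id1", "id2"}))
q2 = ToyQuery(frozenset({"id2", "id3"}))
difference = q1 - q2
symmetric = q1 ^ q2
```

Because every operator returns a new ToyQuery, the sketch also mirrors the immutability described for queries.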

Queries can be executed by calling them as functions (list(query())) or using the exec function.

Queries are immutable, and all modifying functions return new instances.

__and__(other: rcsbsearchapi.search.Query) -> rcsbsearchapi.search.Query

Intersection: a & b

__call__(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact') -> rcsbsearchapi.search.Session

Evaluate this query and return an iterator of all result IDs

abstract __invert__() -> rcsbsearchapi.search.Query

Negation: ~a

__or__(other: rcsbsearchapi.search.Query) -> rcsbsearchapi.search.Query

Union: a | b

__sub__(other: rcsbsearchapi.search.Query) -> rcsbsearchapi.search.Query

Difference: a - b

__weakref__

list of weak references to the object (if defined)

__xor__(other: rcsbsearchapi.search.Query) -> rcsbsearchapi.search.Query

Symmetric difference: a ^ b

abstract _assign_ids(node_id=0) -> Tuple[rcsbsearchapi.search.Query, int]

Assign node_ids sequentially for all terminal nodes

This is a helper for the Query.assign_ids() method

Parameters

node_id – Id to assign to the first leaf of this query

Returns

A tuple (query, node_id), where query is the modified query with node_ids assigned and node_id is the next available node_id

Return type

Tuple[rcsbsearchapi.search.Query, int]

and_(other: Query) -> Query
and_(other: Union[str, Attr]) -> PartialQuery

Extend this query with an additional attribute via an AND

assign_ids() -> rcsbsearchapi.search.Query

Assign node_ids sequentially for all terminal nodes

Returns

the modified query, with node_ids assigned sequentially from 0
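Sequential numbering of terminal nodes can be sketched as a depth-first walk that threads a counter through the tree. Leaf, Branch, and this standalone assign_ids function are hypothetical stand-ins for illustration, not the library's implementation.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """Hypothetical terminal node."""
    label: str
    node_id: int = 0


@dataclass(frozen=True)
class Branch:
    """Hypothetical group node holding child nodes."""
    operator: str
    nodes: Tuple["Node", ...]


Node = Union[Leaf, Branch]


def assign_ids(node: Node, node_id: int = 0) -> Tuple[Node, int]:
    """Return a renumbered copy of node and the next available id."""
    if isinstance(node, Leaf):
        return replace(node, node_id=node_id), node_id + 1
    children = []
    for child in node.nodes:
        # Each child consumes ids and hands back the next free one.
        child, node_id = assign_ids(child, node_id)
        children.append(child)
    return Branch(node.operator, tuple(children)), node_id


tree = Branch("and", (Leaf("a"), Branch("or", (Leaf("b"), Leaf("c")))))
numbered, next_id = assign_ids(tree)
```

The input tree is left untouched and a renumbered copy is returned, in keeping with queries being immutable.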

count(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental']) -> int

Get the number of results found by this query

exec(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact') -> rcsbsearchapi.search.Session

Evaluate this query and return an iterator of all result IDs

facets(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', facets: Optional[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet, List[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet]]]] = None) -> List

Perform a facets query and return the buckets

or_(other: Query) -> Query
or_(other: Union[str, Attr]) -> PartialQuery

Extend this query with an additional attribute via an OR

abstract to_dict() -> Dict

Get dictionary representing this query

to_json() -> str

Get JSON string of this query

class rcsbsearchapi.Session(query: rcsbsearchapi.search.Query, return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact', facets: Optional[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet, List[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet]]]] = None)

A single query session.

Handles paging the query and parsing results
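Paging through results can be sketched as a generator that requests fixed-size pages until a short page signals the end. This is a generic pattern, not the Session implementation: iter_pages, fake_fetch, and the stopping rule are assumptions, and no network request is made.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[str]], rows: int) -> Iterator[str]:
    """Yield identifiers page by page until a short or empty page arrives."""
    start = 0
    while True:
        page = fetch_page(start, rows)
        yield from page
        if len(page) < rows:  # a short page means no more results
            return
        start += rows


# Hypothetical in-memory "server" standing in for the search service.
ALL_IDS = [f"id{i}" for i in range(7)]
calls: List[int] = []


def fake_fetch(start: int, rows: int) -> List[str]:
    calls.append(start)  # record each page offset that was requested
    return ALL_IDS[start:start + rows]


results = list(iter_pages(fake_fetch, rows=3))
```

With seven identifiers and three rows per page, the sketch issues three requests at offsets 0, 3, and 6.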

__init__(query: rcsbsearchapi.search.Query, return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact', facets: Optional[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet, List[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet]]]] = None)

Initialize self. See help(type(self)) for accurate signature.

__iter__() -> Iterator[str]

Generator for all results as a list of identifiers

__weakref__

list of weak references to the object (if defined)

static _extract_identifiers(query_json: Optional[Dict]) -> List[str]

Extract identifiers from a JSON response

_make_params(start=0)

Generate GET parameters as a dict

_single_query(start=0) -> Optional[Dict]

Fires a single query

iquery(limit: Optional[int] = None) -> List[str]

Evaluate the query and display an interactive progress bar.

Requires tqdm.

static make_uuid() -> str

Create a new UUID to identify a query

rcsb_query_builder_url() -> str

URL to view this query on the RCSB PDB website query builder

rcsb_query_editor_url() -> str

URL to edit this query in the RCSB PDB query editor

class rcsbsearchapi.Terminal(service: str, params: Dict[str, Any], node_id: int = 0)

A terminal query node.

Used for doing various types of searches. Accepts a service type and a dictionary of parameters. The set of parameters differs for different search services.

A Terminal can be built directly from a service name and a parameter dictionary, but doing so is tedious. Typically, Terminals are built by child classes that each represent one type of search, which allows for more concise queries.

Examples

>>> Terminal("full_text", {"value": "protease"})
>>> Terminal("text", {"attribute": "rcsb_id", "operator": "in", "negation": False, "value": ["5T89", "1TIM"]})

__invert__()

Negation: ~a

_assign_ids(node_id=0) -> Tuple[rcsbsearchapi.search.Query, int]

Assign node_ids sequentially for all terminal nodes

This is a helper for the Query.assign_ids() method

Parameters

node_id – Id to assign to the first leaf of this query

Returns

A tuple (query, node_id), where query is the modified query with node_ids assigned and node_id is the next available node_id

Return type

Tuple[rcsbsearchapi.search.Query, int]

to_dict()

Get dictionary representing this query

class rcsbsearchapi.TextQuery(value: str)

Special case of a Terminal for free-text queries

__init__(value: str)

Search for the string value anywhere in the text

Parameters

value – free-text query

class rcsbsearchapi.Value(value: T)

Represents a value in a query.

In most cases values are unnecessary and can be replaced directly by the Python value.

Values can also be used if the Attr object appears on the right:

Value("4HHB") == Attr("rcsb_entry_container_identifiers.entry_id")
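Why a wrapper helps when the attribute is on the right can be shown with a toy pair of classes: the left operand's `__eq__` runs first, so the wrapper gets the chance to build the query. ToyValue and ToyAttr are hypothetical stand-ins, and the tuple result is a placeholder for a Terminal.

```python
from typing import Any, Tuple


class ToyAttr:
    """Hypothetical stand-in for Attr."""

    def __init__(self, attribute: str) -> None:
        self.attribute = attribute

    def exact_match(self, value: Any) -> Tuple[str, str, Any]:
        return (self.attribute, "exact_match", value)


class ToyValue:
    """Hypothetical stand-in for Value: wraps a plain Python value."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def __eq__(self, other: Any):  # type: ignore[override]
        if isinstance(other, ToyAttr):
            # Value on the left, attribute on the right: build a query.
            return other.exact_match(self.value)
        if isinstance(other, ToyValue):
            # Two wrapped values compare as ordinary values.
            return self.value == other.value
        return NotImplemented


query = ToyValue("4HHB") == ToyAttr("rcsb_entry_container_identifiers.entry_id")
```

Comparing two ToyValue objects still yields a plain bool, mirroring the two overloads listed for `__eq__`.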

__eq__(attr: Value) -> bool
__eq__(attr: rcsbsearchapi.search.Attr) -> rcsbsearchapi.search.Terminal

Return self==value.

__ge__(attr: rcsbsearchapi.search.Attr) -> rcsbsearchapi.search.Terminal

Return self>=value.

__gt__(attr: rcsbsearchapi.search.Attr) -> rcsbsearchapi.search.Terminal

Return self>value.

__le__(attr: rcsbsearchapi.search.Attr) -> rcsbsearchapi.search.Terminal

Return self<=value.

__lt__(attr: rcsbsearchapi.search.Attr) -> rcsbsearchapi.search.Terminal

Return self<value.

__ne__(attr: Value) -> bool
__ne__(attr: rcsbsearchapi.search.Attr) -> rcsbsearchapi.search.Terminal

Return self!=value.

__weakref__

list of weak references to the object (if defined)

rcsbsearchapi.rcsb_attributes: SchemaGroup = <rcsbsearchapi.schema.SchemaGroup object>

Object with all known RCSB PDB attributes.

This is provided to ease autocompletion as compared to creating Attr objects from strings. For example,

rcsb_attributes.rcsb_nonpolymer_instance_feature_summary.chem_id

is equivalent to

Attr('rcsb_nonpolymer_instance_feature_summary.chem_id')

All attributes in rcsb_attributes can be iterated over.

>>> [a for a in rcsb_attributes if "stoichiometry" in a.attribute]
[Attr(attribute='rcsb_struct_symmetry.stoichiometry')]

Attributes matching a regular expression can also be filtered:

>>> list(rcsb_attributes.search('rcsb.*stoichiometry'))
[Attr(attribute='rcsb_struct_symmetry.stoichiometry')]
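Filtering attribute names by a regular expression can be reproduced on a plain list of strings. The helper below is a sketch under assumptions: search_names is hypothetical, and whether the library applies the pattern with re.search or another matching mode is not confirmed here.

```python
import re
from typing import Iterable, Iterator


def search_names(names: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield the names in which the pattern is found anywhere."""
    compiled = re.compile(pattern)
    return (name for name in names if compiled.search(name))


# A small sample of attribute names taken from this page.
names = [
    "rcsb_struct_symmetry.stoichiometry",
    "rcsb_entry_container_identifiers.entry_id",
    "rcsb_nonpolymer_instance_feature_summary.chem_id",
]
matches = list(search_names(names, "rcsb.*stoichiometry"))
```

Returning a generator keeps the helper lazy, which is why the result is wrapped in list() before use, as in the doctest above.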