API Documentation
RCSB PDB Search API
-
class
rcsbsearchapi.
Attr
(attribute: str, type: Optional[str] = 'text')¶ A search attribute, e.g. “rcsb_entry_container_identifiers.entry_id”
Terminals can be constructed from Attr objects using either a functional syntax, which mirrors the API operators, or with Python operators.
Rather than their normal bool return values, operators return Terminals.
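The two syntaxes can be illustrated with a small stand-in class. This is only a sketch: SketchAttr is a hypothetical name, not part of rcsbsearchapi, and it returns plain dictionaries whose keys mirror the Terminal example in this documentation instead of real Terminal objects. Routing == to exact_match for strings and to equals for other values is an assumption based on the type signatures listed on this page.

```python
from typing import Any, Dict


class SketchAttr:
    """Hypothetical stand-in for Attr, for illustration only."""

    def __init__(self, attribute: str) -> None:
        self.attribute = attribute

    def _node(self, operator: str, value: Any) -> Dict[str, Any]:
        # Keys follow the parameter dictionary shown in the Terminal example.
        return {
            "attribute": self.attribute,
            "operator": operator,
            "negation": False,
            "value": value,
        }

    # Functional syntax, named after the API operators.
    def exact_match(self, value: str) -> Dict[str, Any]:
        return self._node("exact_match", value)

    def equals(self, value: Any) -> Dict[str, Any]:
        return self._node("equals", value)

    def greater(self, value: Any) -> Dict[str, Any]:
        return self._node("greater", value)

    # Operator syntax: a query description is returned instead of a bool.
    def __eq__(self, value: Any) -> Dict[str, Any]:  # type: ignore[override]
        # Assumption: strings use exact_match, other values use equals.
        if isinstance(value, str):
            return self.exact_match(value)
        return self.equals(value)

    def __gt__(self, value: Any) -> Dict[str, Any]:
        return self.greater(value)


entry_id = SketchAttr("rcsb_entry_container_identifiers.entry_id")
by_operator = entry_id == "4HHB"           # operator syntax
by_method = entry_id.exact_match("4HHB")   # functional syntax
```

In this sketch both spellings produce the same description, which is the property the operator overloads are meant to provide.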
Pre-instantiated attributes are available from the
rcsbsearchapi.rcsb_attributes
object. These are generally easier to use than constructing Attr objects by hand. A complete list of valid attributes is available in the schema.
The range dictionary passed to range() requires the following keys:
"from" -> int
"to" -> int
"include_lower" -> bool
"include_upper" -> bool
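A range dictionary with these four keys might be built and checked as follows. The helper check_range_dict is hypothetical and not part of rcsbsearchapi; it only verifies that the documented keys are present with the documented types. The bounds 100 and 200 are arbitrary example values.

```python
from typing import Any, Dict

# The four keys listed above, with their documented types.
REQUIRED_RANGE_KEYS = {
    "from": int,
    "to": int,
    "include_lower": bool,
    "include_upper": bool,
}


def check_range_dict(value: Dict[str, Any]) -> None:
    """Raise ValueError unless all documented keys are present and well typed."""
    missing = REQUIRED_RANGE_KEYS.keys() - value.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for key, expected in REQUIRED_RANGE_KEYS.items():
        item = value[key]
        is_bool = isinstance(item, bool)
        # bool is a subclass of int, so booleans are rejected for the int bounds.
        ok = is_bool if expected is bool else (isinstance(item, int) and not is_bool)
        if not ok:
            raise ValueError(f"{key!r} must be of type {expected.__name__}")


# Lower bound included, upper bound excluded.
example_range = {
    "from": 100,
    "to": 200,
    "include_lower": True,
    "include_upper": False,
}
check_range_dict(example_range)  # passes silently
```

A dictionary like example_range is the kind of value the range method's signature (value: Dict[str, Any]) accepts.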
-
__contains__
(value: Union[str, List[str], rcsbsearchapi.search.Value[str], rcsbsearchapi.search.Value[List[str]]]) → rcsbsearchapi.search.Terminal¶ Maps to contains_words or contains_phrase depending on the value passed.
"value" in attr maps to attr.contains_phrase("value") for simple values.
["value"] in attr maps to attr.contains_words(["value"]) for lists and tuples.
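The dispatch rule described above can be sketched as a plain function. The name contains_operator_for is hypothetical and not part of rcsbsearchapi. Python itself coerces the result of an in expression to bool, so the sketch exposes the rule as a function and does not go through __contains__.

```python
from typing import List, Tuple, Union


def contains_operator_for(value: Union[str, List[str], Tuple[str, ...]]) -> str:
    """Return the name of the method the documented rule would select."""
    if isinstance(value, str):
        # Simple values are matched as an exact phrase.
        return "contains_phrase"
    if isinstance(value, (list, tuple)):
        # Lists and tuples are matched word by word.
        return "contains_words"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


phrase_choice = contains_operator_for("value")
words_choice = contains_operator_for(["value"])
```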
-
__eq__
(value: Attr) → bool¶ -
__eq__
(value: Union[str, int, float, datetime.date, Value[str], Value[int], Value[float], Value[date]]) → rcsbsearchapi.search.Terminal Return self==value.
-
__ge__
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.Terminal¶ Return self>=value.
-
__gt__
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.Terminal¶ Return self>value.
-
__le__
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.Terminal¶ Return self<=value.
-
__lt__
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.Terminal¶ Return self<value.
-
__ne__
(value: Attr) → bool¶ -
__ne__
(value: Union[str, int, float, datetime.date, Value[str], Value[int], Value[float], Value[date]]) → rcsbsearchapi.search.Terminal Return self!=value.
-
__weakref__
¶ list of weak references to the object (if defined)
-
contains_phrase
(value: Union[str, rcsbsearchapi.search.Value[str]]) → rcsbsearchapi.search.AttributeQuery¶ Match an exact phrase
-
contains_words
(value: Union[str, rcsbsearchapi.search.Value[str], List[str], rcsbsearchapi.search.Value[List[str]]]) → rcsbsearchapi.search.AttributeQuery¶ Match any word within the string.
Words are split at whitespace. All results which match any word are returned, with results matching more words sorted first.
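The matching and ordering described above can be imitated locally on toy data. This sketch is not how the search service is implemented; the function rank_by_matched_words, the lower-casing, the tie-break by identifier, and the toy titles are all assumptions made for illustration.

```python
from typing import Dict, List


def rank_by_matched_words(query: str, documents: Dict[str, str]) -> List[str]:
    """Return ids of documents matching any query word, most matches first."""
    words = set(query.lower().split())  # words are split at whitespace
    scored = []
    for doc_id, text in documents.items():
        matched = len(words & set(text.lower().split()))
        if matched:  # keep anything that matches at least one word
            scored.append((matched, doc_id))
    # More matched words first; ties broken by id for a deterministic order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical titles keyed by made-up identifiers.
toy_titles = {
    "A": "crystal structure of hiv protease",
    "B": "protease inhibitor complex",
    "C": "hemoglobin oxygen transport",
}
ranked = rank_by_matched_words("hiv protease", toy_titles)
```

Here "A" matches both words and "B" matches one, so "A" is listed before "B", while "C" matches nothing and is left out.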
-
equals
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.AttributeQuery¶ Attribute == value
-
exact_match
(value: Union[str, rcsbsearchapi.search.Value[str]]) → rcsbsearchapi.search.AttributeQuery¶ Exact match with the value
-
exists
() → rcsbsearchapi.search.AttributeQuery¶ Attribute is defined for the structure
-
greater
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.AttributeQuery¶ Attribute > value
-
greater_or_equal
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.AttributeQuery¶ Attribute >= value
-
in_
(value: Union[List[str], List[int], List[float], List[datetime.date], Tuple[str, …], Tuple[int, …], Tuple[float, …], Tuple[datetime.date, …], rcsbsearchapi.search.Value[List[str]], rcsbsearchapi.search.Value[List[int]], rcsbsearchapi.search.Value[List[float]], rcsbsearchapi.search.Value[List[datetime.date]], rcsbsearchapi.search.Value[Tuple[str, …]], rcsbsearchapi.search.Value[Tuple[int, …]], rcsbsearchapi.search.Value[Tuple[float, …]], rcsbsearchapi.search.Value[Tuple[datetime.date, …]]]) → rcsbsearchapi.search.AttributeQuery¶ Attribute is contained in the list of values
-
less
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.AttributeQuery¶ Attribute < value
-
less_or_equal
(value: Union[int, float, datetime.date, rcsbsearchapi.search.Value[int], rcsbsearchapi.search.Value[float], rcsbsearchapi.search.Value[datetime.date]]) → rcsbsearchapi.search.AttributeQuery¶ Attribute <= value
-
range
(value: Dict[str, Any]) → rcsbsearchapi.search.AttributeQuery¶ Attribute is within the specified half-open range
- Parameters
value – lower and upper bounds [a, b)
-
class
rcsbsearchapi.
Group
(operator: typing_extensions.Literal[and, or], nodes: Iterable[rcsbsearchapi.search.Query] = ())¶ AND and OR combinations of queries
-
__and__
(other: rcsbsearchapi.search.Query) → rcsbsearchapi.search.Query¶ Intersection: a & b
-
__invert__
()¶ Negation: ~a
-
__or__
(other: rcsbsearchapi.search.Query) → rcsbsearchapi.search.Query¶ Union: a | b
-
_assign_ids
(node_id=0) → Tuple[rcsbsearchapi.search.Query, int]¶ Assign node_ids sequentially for all terminal nodes
This is a helper for the
Query.assign_ids()
method.
- Parameters
node_id – Id to assign to the first leaf of this query
- Returns
query: The modified query, with node_ids assigned
node_id: The next available node_id
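The sequential numbering described above can be sketched with nested dictionaries standing in for query nodes. The dictionary layout ("type", "nodes", "label") and the function name assign_ids_sketch are assumptions made for illustration, not the library's data model. As in the description, the function takes the id for the first leaf and returns the modified tree together with the next available id, leaving the input untouched.

```python
from typing import Any, Dict, Tuple

Node = Dict[str, Any]


def assign_ids_sketch(node: Node, node_id: int = 0) -> Tuple[Node, int]:
    """Return a copy of node with leaves numbered from node_id, plus the next free id."""
    if node["type"] == "terminal":
        # A leaf consumes exactly one id.
        return {**node, "node_id": node_id}, node_id + 1
    new_children = []
    for child in node["nodes"]:
        # Each child starts where the previous one stopped.
        new_child, node_id = assign_ids_sketch(child, node_id)
        new_children.append(new_child)
    return {**node, "nodes": new_children}, node_id


tree = {
    "type": "group",
    "operator": "and",
    "nodes": [
        {"type": "terminal", "label": "a"},
        {
            "type": "group",
            "operator": "or",
            "nodes": [
                {"type": "terminal", "label": "b"},
                {"type": "terminal", "label": "c"},
            ],
        },
    ],
}
numbered, next_id = assign_ids_sketch(tree)
```

Returning the next free id is what lets a group thread the counter through its children without any shared mutable state.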
-
to_dict
()¶ Get dictionary representing this query
-
-
class
rcsbsearchapi.
Query
¶ Base class for all types of queries.
Queries can be combined using set operators:
q1 & q2: Intersection (AND)
q1 | q2: Union (OR)
~q1: Negation (NOT)
q1 - q2: Difference (implemented as q1 & ~q2)
q1 ^ q2: Symmetric difference (XOR, implemented as (q1 & ~q2) | (~q1 & q2))
Note that only AND, OR, and negation of terminals are directly supported by the API, so other operations may be slower.
Queries can be executed by calling them as functions (list(query())) or by using the exec method.
Queries are immutable, and all modifying functions return new instances.
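The identities used for difference and symmetric difference can be checked with ordinary Python sets standing in for query results. This is a sketch of the set algebra only; it does not call the search API, and the universe of identifiers (including the placeholder "XXXX") is made up for the example.

```python
# Toy universe of identifiers; "XXXX" is a placeholder, not a real entry.
universe = {"1TIM", "4HHB", "5T89", "XXXX"}
q1 = {"1TIM", "4HHB"}
q2 = {"4HHB", "5T89"}


def negate(ids: set) -> set:
    """Negation relative to the toy universe, playing the role of ~q."""
    return universe - ids


# q1 - q2 expressed as q1 & ~q2
difference = q1 & negate(q2)

# q1 ^ q2 expressed as (q1 & ~q2) | (~q1 & q2)
symmetric_difference = (q1 & negate(q2)) | (negate(q1) & q2)
```

Both results agree with Python's built-in set operators q1 - q2 and q1 ^ q2, which is why the rewritten forms can be sent to an API that only supports AND, OR, and negation.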
-
__and__
(other: rcsbsearchapi.search.Query) → rcsbsearchapi.search.Query¶ Intersection: a & b
-
__call__
(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact') → rcsbsearchapi.search.Session¶ Evaluate this query and return an iterator of all result IDs
-
abstract
__invert__
() → rcsbsearchapi.search.Query¶ Negation: ~a
-
__or__
(other: rcsbsearchapi.search.Query) → rcsbsearchapi.search.Query¶ Union: a | b
-
__sub__
(other: rcsbsearchapi.search.Query) → rcsbsearchapi.search.Query¶ Difference: a - b
-
__weakref__
¶ list of weak references to the object (if defined)
-
__xor__
(other: rcsbsearchapi.search.Query) → rcsbsearchapi.search.Query¶ Symmetric difference: a ^ b
-
abstract
_assign_ids
(node_id=0) → Tuple[rcsbsearchapi.search.Query, int]¶ Assign node_ids sequentially for all terminal nodes
This is a helper for the
Query.assign_ids()
method.
- Parameters
node_id – Id to assign to the first leaf of this query
- Returns
query: The modified query, with node_ids assigned
node_id: The next available node_id
-
and_
(other: Query) → Query¶ -
and_
(other: Union[str, Attr]) → PartialQuery Extend this query with an additional attribute via an AND
-
assign_ids
() → rcsbsearchapi.search.Query¶ Assign node_ids sequentially for all terminal nodes
- Returns
the modified query, with node_ids assigned sequentially from 0
-
count
(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental']) → int¶ Get the number of results found by this query
-
exec
(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact') → rcsbsearchapi.search.Session¶ Evaluate this query and return an iterator of all result IDs
-
facets
(return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', facets: Optional[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet, List[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet]]]] = None) → List¶ Perform a facets query and return the buckets
-
or_
(other: Query) → Query¶ -
or_
(other: Union[str, Attr]) → PartialQuery Extend this query with an additional attribute via an OR
-
abstract
to_dict
() → Dict¶ Get dictionary representing this query
-
to_json
() → str¶ Get JSON string of this query
-
class
rcsbsearchapi.
Session
(query: rcsbsearchapi.search.Query, return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact', facets: Optional[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet, List[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet]]]] = None)¶ A single query session.
Handles paging the query and parsing results
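Paging can be sketched as a generator that requests one page at a time. This is not the Session implementation: iterate_pages and fake_fetch are hypothetical names, and the in-memory list replaces the HTTP request. The start offset and rows page size follow the parameter names that appear in the signatures on this page.

```python
from typing import Callable, Iterator, List


def iterate_pages(fetch_page: Callable[[int, int], List[str]], rows: int) -> Iterator[str]:
    """Yield identifiers page by page until a short or empty page is returned."""
    start = 0
    while True:
        page = fetch_page(start, rows)
        if not page:
            return
        yield from page
        if len(page) < rows:
            # A short page means there is nothing left to request.
            return
        start += rows


# Stand-in for the remote service: 25 made-up identifiers.
all_ids = [f"ID{i:02d}" for i in range(25)]
requested_starts: List[int] = []


def fake_fetch(start: int, rows: int) -> List[str]:
    requested_starts.append(start)
    return all_ids[start:start + rows]


collected = list(iterate_pages(fake_fetch, rows=10))
```

With 25 results and 10 rows per page, three requests are made, at offsets 0, 10, and 20.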
-
__init__
(query: rcsbsearchapi.search.Query, return_type: typing_extensions.Literal[entry, assembly, polymer_entity, non_polymer_entity, polymer_instance, mol_definition] = 'entry', rows: int = 10000, return_content_type: List[typing_extensions.Literal[experimental, computational]] = ['experimental'], results_verbosity: typing_extensions.Literal[compact, minimal, verbose] = 'compact', facets: Optional[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet, List[Union[rcsbsearchapi.search.Facet, rcsbsearchapi.search.FilterFacet]]]] = None)¶ Initialize self. See help(type(self)) for accurate signature.
-
__iter__
() → Iterator[str]¶ Generator yielding the identifiers of all results
-
__weakref__
¶ list of weak references to the object (if defined)
-
static
_extract_identifiers
(query_json: Optional[Dict]) → List[str]¶ Extract identifiers from a JSON response
-
_make_params
(start=0)¶ Generate GET parameters as a dict
-
_single_query
(start=0) → Optional[Dict]¶ Fires a single query
-
iquery
(limit: Optional[int] = None) → List[str]¶ Evaluate the query and display an interactive progress bar.
Requires tqdm.
-
static
make_uuid
() → str¶ Create a new UUID to identify a query
-
rcsb_query_builder_url
() → str¶ URL to view this query on the RCSB PDB website query builder
-
rcsb_query_editor_url
() → str¶ URL to edit this query in the RCSB PDB query editor
-
-
class
rcsbsearchapi.
Terminal
(service: str, params: Dict[str, Any], node_id: int = 0)¶ A terminal query node.
Used for doing various types of searches. Accepts a service type and a dictionary of parameters. The set of parameters differs for different search services.
A Terminal can be built directly from a service and a parameter dictionary, but doing so is tedious. Typically it is built by child classes that each represent one type of search, which allows for more concise queries.
Examples
>>> Terminal("full_text", {"value": "protease"})
>>> Terminal("text", {"attribute": "rcsb_id", "operator": "in", "negation": False, "value": ["5T89", "1TIM"]})
-
__invert__
()¶ Negation: ~a
-
_assign_ids
(node_id=0) → Tuple[rcsbsearchapi.search.Query, int]¶ Assign node_ids sequentially for all terminal nodes
This is a helper for the
Query.assign_ids()
method.
- Parameters
node_id – Id to assign to the first leaf of this query
- Returns
query: The modified query, with node_ids assigned
node_id: The next available node_id
-
to_dict
()¶ Get dictionary representing this query
-
-
class
rcsbsearchapi.
TextQuery
(value: str)¶ Special case of a Terminal for free-text queries
-
__init__
(value: str)¶ Search for the string value anywhere in the text
- Parameters
value – free-text query
-
-
class
rcsbsearchapi.
Value
(value: T)¶ Represents a value in a query.
In most cases values are unnecessary and can be replaced directly by the Python value.
Values can also be used if the Attr object appears on the right:
Value("4HHB") == Attr("rcsb_entry_container_identifiers.entry_id")
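How a value on the left-hand side can still produce a query is sketched below with two stand-in classes. SketchValue and SketchAttr are hypothetical names, the tuples they return replace real Terminal objects, and the attribute name "example.numeric_attribute" is invented. Mirroring the comparison (value < attr becomes attr greater than value) is an assumption about the intended semantics.

```python
from typing import Any, Tuple


class SketchAttr:
    """Hypothetical stand-in for Attr."""

    def __init__(self, attribute: str) -> None:
        self.attribute = attribute

    def compare(self, operator: str, value: Any) -> Tuple[str, str, Any]:
        return (self.attribute, operator, value)


class SketchValue:
    """Hypothetical stand-in for Value: wraps a plain Python value."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def __eq__(self, attr: Any) -> Any:  # type: ignore[override]
        return attr.compare("exact_match", self.value)

    def __lt__(self, attr: SketchAttr) -> Tuple[str, str, Any]:
        # value < attr is the same condition as attr > value
        return attr.compare("greater", self.value)

    def __gt__(self, attr: SketchAttr) -> Tuple[str, str, Any]:
        # value > attr is the same condition as attr < value
        return attr.compare("less", self.value)


entry_id = SketchAttr("rcsb_entry_container_identifiers.entry_id")
numeric = SketchAttr("example.numeric_attribute")  # invented attribute name

eq_query = SketchValue("4HHB") == entry_id
lt_query = SketchValue(2.0) < numeric
```

Because the wrapper defines the comparison methods itself, Python calls them first and the result is a query description, not a bool.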
-
__eq__
(attr: Value) → bool¶ -
__eq__
(attr: rcsbsearchapi.search.Attr) → rcsbsearchapi.search.Terminal Return self==value.
-
__ge__
(attr: rcsbsearchapi.search.Attr) → rcsbsearchapi.search.Terminal¶ Return self>=value.
-
__gt__
(attr: rcsbsearchapi.search.Attr) → rcsbsearchapi.search.Terminal¶ Return self>value.
-
__le__
(attr: rcsbsearchapi.search.Attr) → rcsbsearchapi.search.Terminal¶ Return self<=value.
-
__lt__
(attr: rcsbsearchapi.search.Attr) → rcsbsearchapi.search.Terminal¶ Return self<value.
-
__ne__
(attr: Value) → bool¶ -
__ne__
(attr: rcsbsearchapi.search.Attr) → rcsbsearchapi.search.Terminal Return self!=value.
-
__weakref__
¶ list of weak references to the object (if defined)
-
-
rcsbsearchapi.
rcsb_attributes
: SchemaGroup = <rcsbsearchapi.schema.SchemaGroup object>¶ Object with all known RCSB PDB attributes.
This is provided to ease autocompletion as compared to creating Attr objects from strings. For example,
rcsb_attributes.rcsb_nonpolymer_instance_feature_summary.chem_id
is equivalent to
Attr('rcsb_nonpolymer_instance_feature_summary.chem_id')
All attributes in rcsb_attributes can be iterated over.
>>> [a for a in rcsb_attributes if "stoichiometry" in a.attribute]
[Attr(attribute='rcsb_struct_symmetry.stoichiometry')]
Attributes matching a regular expression can also be filtered:
>>> list(rcsb_attributes.search('rcsb.*stoichiometry'))
[Attr(attribute='rcsb_struct_symmetry.stoichiometry')]
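Regular-expression filtering of this kind can be sketched over a plain list of attribute names. The function search_names is hypothetical, and using re.search (which matches anywhere in the name) is an assumption; the library's own search method may anchor the pattern differently. The three names are taken from this page.

```python
import re
from typing import Iterable, Iterator


def search_names(names: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield the names in which the pattern is found."""
    regex = re.compile(pattern)
    return (name for name in names if regex.search(name))


known_names = [
    "rcsb_entry_container_identifiers.entry_id",
    "rcsb_nonpolymer_instance_feature_summary.chem_id",
    "rcsb_struct_symmetry.stoichiometry",
]
matches = list(search_names(known_names, "rcsb.*stoichiometry"))
```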