Design Concepts¶
Search Concepts¶
The pyesgf.search interface to ESGF search reflects the typical workflow of a user navigating through the sets of facets categorising available data.
Keyword classification¶
The keyword arguments described in the ESGF Search API have a wide variety of roles within the search workflow. To reflect this, pyesgf.search classifies these keywords into system, spatiotemporal and facet keywords. Responsibility for these keywords is distributed across several classes.
System keywords¶
| API keyword | class | Notes |
|---|---|---|
| limit | SearchConnection | Set in |
| offset | SearchConnection | Set in |
| shards | SearchConnection | Set in constructor |
| distrib | SearchConnection | Set in constructor |
| latest | SearchContext | Set in constructor |
| facets | SearchContext | Set in constructor |
| fields | SearchContext | Set in constructor |
| replica | SearchContext | Set in constructor |
| type | SearchContext | Create contexts with the right type using |
| from | SearchContext | Set in constructor. Use “from_timestamp” in the context API. |
| to | SearchContext | Set in constructor. Use “to_timestamp” in the context API. |
| fields | n/a | Managed internally |
| format | n/a | Managed internally |
| id | n/a | Managed internally |
Temporal keywords¶
Temporal keywords are supported for Dataset search. The terms “from_timestamp” and “to_timestamp” should be used with values following the format “YYYY-MM-DDThh:mm:ssZ”.
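As a minimal sketch of producing values in this format, the helper below converts a datetime to UTC and renders it as “YYYY-MM-DDThh:mm:ssZ”. The name to_esgf_timestamp is hypothetical and not part of pyesgf; only the Python standard library is used, and naive datetimes are assumed to already be in UTC.

```python
from datetime import datetime, timezone


def to_esgf_timestamp(dt: datetime) -> str:
    """Format a datetime as YYYY-MM-DDThh:mm:ssZ in UTC.

    Hypothetical helper for illustration; not part of pyesgf.
    Naive datetimes are assumed to already be expressed in UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Keyword arguments in the shape the temporal keywords expect.
time_window = {
    "from_timestamp": to_esgf_timestamp(datetime(2100, 1, 1)),
    "to_timestamp": to_esgf_timestamp(datetime(2100, 12, 31, 23, 59, 59)),
}
```

A dictionary like time_window could then be unpacked as keyword arguments when a Dataset context is created, assuming the context accepts these two names as described above.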
Spatial keywords¶
Spatial keywords are not yet supported by pyesgf.search; however, the API does have placeholders for these keywords in anticipation of a future implementation.
Facet keywords¶
All other keywords are considered to be search facets. The keyword “query” is dealt with specially as a freetext facet.
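The classification described in this section can be sketched as a simple lookup. This is an illustration only, not how pyesgf.search implements it: the names SYSTEM_KEYWORDS, TEMPORAL_KEYWORDS and classify_keyword are hypothetical, the system set is taken from the table above, and spatial keywords are left out because their names are not listed here.

```python
# System keywords, as listed in the table of this section.
SYSTEM_KEYWORDS = frozenset({
    "limit", "offset", "shards", "distrib", "latest", "facets",
    "fields", "replica", "type", "from", "to", "format", "id",
})

# Temporal keywords as exposed in the context API.
TEMPORAL_KEYWORDS = frozenset({"from_timestamp", "to_timestamp"})


def classify_keyword(name: str) -> str:
    """Return 'system', 'temporal' or 'facet' for a search keyword.

    Hypothetical helper: anything that is not a known system or
    temporal keyword is treated as a facet, including the free-text
    keyword "query".
    """
    if name in SYSTEM_KEYWORDS:
        return "system"
    if name in TEMPORAL_KEYWORDS:
        return "temporal"
    return "facet"


def group_keywords(**kwargs: str) -> dict:
    """Split keyword arguments into groups keyed by their class."""
    groups: dict = {"system": {}, "temporal": {}, "facet": {}}
    for name, value in kwargs.items():
        groups[classify_keyword(name)][name] = value
    return groups


grouped = group_keywords(limit="10", institute="IPSL", query="humidity")
```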
Main Classes¶
SearchConnection¶
SearchConnection instances represent a connection to an ESGF Search web service. Each instance stores the service URL along with service-level parameters such as distrib and shards.
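To illustrate the idea of keeping service-level parameters on the connection, here is a small stand-in class. It is not the pyesgf SearchConnection: ToyConnection, its query_string method, the example URL and the comma-joined encoding of shards are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence
from urllib.parse import urlencode


@dataclass(frozen=True)
class ToyConnection:
    """Illustrative stand-in for a search connection (not the pyesgf class)."""

    url: str
    distrib: bool = True
    shards: Optional[Sequence[str]] = None

    def query_string(self, **params: str) -> str:
        """Merge service-level parameters with per-query parameters."""
        merged = {"distrib": "true" if self.distrib else "false"}
        if self.shards:
            # Assumption: shards are sent as one comma-separated value.
            merged["shards"] = ",".join(self.shards)
        merged.update(params)
        return urlencode(merged)


# Hypothetical service URL, used only for illustration.
conn = ToyConnection("https://esgf.example.org/esg-search", distrib=False)
query = conn.query_string(project="CMIP5", limit="10")
```

The point of the design is that callers set distrib and shards once, and every query built from the connection carries them without further effort.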
SearchContext¶
SearchContext represents the constraints on a given search. This includes the type of records you are searching for (File or Dataset), the list of possible facets with or without facet counts (depending on how the instance is created), and the currently selected facets and search terms. Instances can return the number of hits and the facet counts associated with the current search.
SearchContext objects can be created in several ways:
- From a SearchConnection object using the method SearchConnection.new_context().
- By further constraining an existing FacetContext object, e.g. new_context = context.constrain(institute='IPSL').
- From a Result object using one of its foo_context() methods to create a context for searching for results related to the Result.

Future development may implement a project-specific factory, e.g. CMIP5FacetContext().
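The constrain() example above suggests that narrowing a search yields a new context rather than modifying the existing one. A minimal sketch of that pattern follows, using a hypothetical ToyContext class that is not the pyesgf implementation.

```python
from typing import Dict, Optional


class ToyContext:
    """Illustrative search context (not the pyesgf class)."""

    def __init__(self, search_type: str = "Dataset",
                 constraints: Optional[Dict[str, str]] = None) -> None:
        self.search_type = search_type
        # Copy so that later changes to the caller's dict have no effect.
        self.constraints = dict(constraints or {})

    def constrain(self, **facets: str) -> "ToyContext":
        """Return a new context with the extra facet constraints applied."""
        merged = {**self.constraints, **facets}
        return ToyContext(self.search_type, merged)


base = ToyContext(constraints={"project": "CMIP5"})
narrowed = base.constrain(institute="IPSL")
```

Because base is left untouched, a user can branch from the same starting point into several narrower searches, which matches the facet-navigation workflow described at the top of this page.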
ResultSet¶
ResultSet instances are returned by the SearchContext.search() method and represent the results from a query. They support transparent paging of results with a client-side cache.
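Transparent paging with a client-side cache can be sketched as a sequence that fetches a page only when an index on that page is first requested. The class ToyResultSet and the fake_fetch backend below are hypothetical and stand in for the real service; the (offset, limit) call shape mirrors the limit and offset system keywords.

```python
from typing import Callable, Dict, List


class ToyResultSet:
    """Illustrative lazily paged sequence (not the pyesgf class)."""

    def __init__(self, fetch_page: Callable[[int, int], List[str]],
                 total: int, page_size: int = 3) -> None:
        self._fetch_page = fetch_page  # called as fetch_page(offset, limit)
        self._total = total
        self._page_size = page_size
        self._cache: Dict[int, List[str]] = {}  # page number -> records

    def __len__(self) -> int:
        return self._total

    def __getitem__(self, index: int) -> str:
        if not 0 <= index < self._total:
            raise IndexError(index)
        page, position = divmod(index, self._page_size)
        if page not in self._cache:
            # First access to this page: fetch it once and remember it.
            offset = page * self._page_size
            self._cache[page] = self._fetch_page(offset, self._page_size)
        return self._cache[page][position]


# Fake backend standing in for the search service; it logs each request.
records = [f"record-{i}" for i in range(7)]
requests_made = []


def fake_fetch(offset: int, limit: int) -> List[str]:
    requests_made.append((offset, limit))
    return records[offset:offset + limit]


results = ToyResultSet(fake_fetch, total=len(records))
first_pass = list(results)   # triggers one fetch per page
second_pass = list(results)  # served entirely from the cache
```

After both passes, requests_made holds one entry per page, which shows that the second iteration did not contact the backend again.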
Result¶
Result instances represent a single result record in the Solr response. They are subclassed to represent records of different types: FileResult and DatasetResult. Results have various properties exposing information about the objects they represent, e.g. dataset_id, checksum, filename and size.
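One way to picture the subclassing is a small factory that picks a wrapper class from the record's type and exposes fields as properties. Everything in this sketch is hypothetical: the Toy classes, the wrap_record function and the record keys ("type", "id", "filename", "size") are chosen for illustration and do not describe the actual Solr schema or the pyesgf classes.

```python
from typing import Any, Dict


class ToyResult:
    """Base wrapper around one result record (illustrative only)."""

    def __init__(self, record: Dict[str, Any]) -> None:
        self.record = record


class ToyDatasetResult(ToyResult):
    @property
    def dataset_id(self) -> str:
        return self.record["id"]


class ToyFileResult(ToyResult):
    @property
    def filename(self) -> str:
        return self.record["filename"]

    @property
    def size(self) -> int:
        # Assumed to arrive as text; converted to an integer byte count.
        return int(self.record["size"])


RESULT_CLASSES = {"Dataset": ToyDatasetResult, "File": ToyFileResult}


def wrap_record(record: Dict[str, Any]) -> ToyResult:
    """Pick the wrapper class from the record's (assumed) 'type' key."""
    return RESULT_CLASSES[record["type"]](record)


file_result = wrap_record({"type": "File", "filename": "tas.nc", "size": "1024"})
dataset_result = wrap_record({"type": "Dataset", "id": "example.dataset.v1"})
```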