Design Concepts¶
Search Concepts¶
The pyesgf.search interface to ESGF search reflects the typical workflow of a user navigating through the sets of facets categorising available data.
Keyword classification¶
The keyword arguments described in the ESGF Search API have a wide variety of roles within the search workflow. To reflect this, pyesgf.search classifies these keywords into system, spatiotemporal and facet keywords. Responsibility for these keywords is distributed across several classes.
System keywords¶
API keyword | Class | Notes |
---|---|---|
limit | SearchConnection | Set in |
offset | SearchConnection | Set in |
shards | SearchConnection | Set in constructor |
distrib | SearchConnection | Set in constructor |
latest | SearchContext | Set in constructor |
facets | SearchContext | Set in constructor |
fields | SearchContext | Set in constructor |
replica | SearchContext | Set in constructor |
type | SearchContext | Create contexts with the right type using |
from | SearchContext | Set in constructor. Use “from_timestamp” in the context API. |
to | SearchContext | Set in constructor. Use “to_timestamp” in the context API. |
fields | n/a | Managed internally |
format | n/a | Managed internally |
id | n/a | Managed internally |
Temporal keywords¶
Temporal keywords are supported for Dataset search. The terms “from_timestamp” and “to_timestamp” should be used with values following the format “YYYY-MM-DDThh:mm:ssZ”.
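As a small illustration of the timestamp format, the sketch below formats Python datetime objects as “YYYY-MM-DDThh:mm:ssZ”. The helper name to_esgf_timestamp is hypothetical and not part of pyesgf.search, and the sketch assumes values should be expressed in UTC.

```python
from datetime import datetime, timezone

def to_esgf_timestamp(moment):
    """Format a datetime as YYYY-MM-DDThh:mm:ssZ (hypothetical helper, not part of pyesgf)."""
    # Naive datetimes are assumed to be in UTC already; aware ones are converted.
    if moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

from_timestamp = to_esgf_timestamp(datetime(2100, 12, 30, 23, 23, 59))
to_timestamp = to_esgf_timestamp(datetime(2200, 1, 1, 0, 0, 0, tzinfo=timezone.utc))
```

Strings produced this way could then be supplied as the “from_timestamp” and “to_timestamp” terms of a Dataset search.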
Spatial keywords¶
Spatial keywords are not yet supported by pyesgf.search; however, the API does have placeholders for these keywords, anticipating future implementation.
Facet keywords¶
All other keywords are considered to be search facets. The keyword “query” is dealt with specially as a freetext facet.
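The classification described in this section can be sketched as a simple partition of keyword arguments. This is an illustration only: classify_keywords and the two sets are hypothetical names, the set contents are taken from the table and text above, and spatial keywords are left out because they are not yet supported. It is not how pyesgf.search is implemented internally.

```python
# Keyword groups taken from the documentation above; the function itself is a
# hypothetical illustration, not pyesgf.search code.
SYSTEM_KEYWORDS = {
    "limit", "offset", "shards", "distrib", "latest", "facets",
    "fields", "replica", "type", "from", "to", "format", "id",
}
TEMPORAL_KEYWORDS = {"from_timestamp", "to_timestamp"}

def classify_keywords(**keywords):
    """Split keyword arguments into system, temporal and facet groups."""
    groups = {"system": {}, "temporal": {}, "facet": {}}
    for name, value in keywords.items():
        if name in SYSTEM_KEYWORDS:
            groups["system"][name] = value
        elif name in TEMPORAL_KEYWORDS:
            groups["temporal"][name] = value
        else:
            # Everything else, including the free-text "query", is a facet.
            groups["facet"][name] = value
    return groups

groups = classify_keywords(
    distrib=True,
    latest=True,
    from_timestamp="2100-12-30T23:23:59Z",
    project="CMIP5",
    query="humidity",
)
```

Here groups["facet"] ends up holding project and query, while distrib and latest land in the system group.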
Main Classes¶
SearchConnection¶
SearchConnection instances represent a connection to an ESGF Search web service. Each instance stores the service URL and service-level parameters such as distrib and shards.
SearchContext¶
SearchContext represents the constraints on a given search. This includes the type of records you are searching for (File or Dataset), the list of possible facets with or without facet counts (depending on how the instance is created), and the currently selected facets or search terms. Instances can return the number of hits and the facet counts associated with the current search.
SearchContext objects can be created in several ways:
From a SearchConnection object using the method SearchConnection.new_context().
By further constraining an existing SearchContext object, e.g. new_context = context.constrain(institute='IPSL').
From a Result object using one of its foo_context() methods to create a context for searching for results related to the Result.
Future development may implement project-specific factories, e.g. CMIP5FacetContext().
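The constrain pattern mentioned above can be pictured with a minimal stand-in class. This is a sketch under stated assumptions: ToyContext and its attributes are hypothetical, make no network calls, and are not the pyesgf class. It only illustrates how constraining a context could yield a new, narrower context; leaving the original unchanged is an assumption made for the sketch.

```python
class ToyContext:
    """Hypothetical stand-in for a search context; not the pyesgf class."""

    def __init__(self, search_type="Dataset", constraints=None):
        self.search_type = search_type
        self.constraints = dict(constraints or {})

    def constrain(self, **facets):
        # Return a new context with the extra facets merged in, leaving
        # this instance unchanged (an assumption made for this sketch).
        merged = dict(self.constraints)
        merged.update(facets)
        return ToyContext(self.search_type, merged)

context = ToyContext(constraints={"project": "CMIP5"})
new_context = context.constrain(institute="IPSL")
```

Because each call returns a fresh object, a user could branch from the same starting context to explore several facet selections side by side.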
ResultSet¶
ResultSet instances are returned by the SearchContext.search() method and represent the results from a query. They support transparent paging of results with a client-side cache.
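To make the idea of transparent paging with a client-side cache concrete, here is a minimal sketch. PagedResults and fake_fetch are hypothetical names, and the fetch function simulates a service rather than contacting one; the actual ResultSet implementation may differ.

```python
class PagedResults:
    """Hypothetical sequence that fetches pages lazily and caches them."""

    def __init__(self, fetch_page, total, page_size=10):
        self._fetch_page = fetch_page   # callable(offset, limit) -> list
        self._total = total
        self._page_size = page_size
        self._cache = {}                # page number -> list of records

    def __len__(self):
        return self._total

    def __getitem__(self, index):
        if not 0 <= index < self._total:
            raise IndexError(index)
        page, position = divmod(index, self._page_size)
        if page not in self._cache:
            # Only contact the (simulated) service on a cache miss.
            offset = page * self._page_size
            self._cache[page] = self._fetch_page(offset, self._page_size)
        return self._cache[page][position]

requests_made = []

def fake_fetch(offset, limit):
    # Stand-in for a search request using the offset and limit keywords.
    requests_made.append((offset, limit))
    return ["record-%d" % i for i in range(offset, min(offset + limit, 25))]

results = PagedResults(fake_fetch, total=25, page_size=10)
first, same_page, last = results[0], results[3], results[24]
```

In this sketch, indexing into the sequence triggers a request only for pages that have not been seen before, so the second lookup on the first page is served from the cache.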
Result¶
Result instances represent the result records in the Solr response. They are subclassed to represent records of different types: FileResult and DatasetResult. Results have various properties exposing information about the objects they represent, e.g. dataset_id, checksum, filename and size.