ElasticSearch: search box (DRAFT under development)

This documentation applies to the general Indicia ElasticSearch (ES) search box that can currently be found on the new verifiers' page, and will shortly be available on other pages.

It does not apply to the filter boxes that appear at the tops of the columns when looking at a grid of records.

Searching text

By default, the Indicia ES search operates across all fields in the ElasticSearch index. Consider this search:

Crow

That will retrieve records of Crows, records made by Fred Crow, records made at Crow Point and many more.

To restrict searches to particular fields, qualify the search term with the name of the ES field. For example to restrict the search to records made by people called Crow, use this:

event.recorded_by:Crow

That will retrieve records by anyone with the word 'Crow' in their name. (Only the whole word – it wouldn't retrieve records for someone with the surname 'Crowmore'.)

To further restrict records to those made by Fred Crow, you need to quote the search string:

event.recorded_by:"Fred Crow"

But beware, this will not retrieve records where the name is represented as "Crow, Fred" for example. When you enclose a string in quotes, only exact matches are retrieved.
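
If you build search terms in a script rather than typing them, the quoting rule above can be sketched in a few lines of Python. This is only an illustration: the function name field_term is made up for this page and is not part of Indicia or ElasticSearch, and values that themselves contain quotes or other special characters are not handled.

```python
def field_term(field: str, value: str) -> str:
    """Return a field-qualified search term such as event.recorded_by:Crow.

    Hypothetical helper: values containing whitespace are wrapped in
    double quotes so that they are searched as a phrase.
    """
    if any(ch.isspace() for ch in value):
        return f'{field}:"{value}"'
    return f"{field}:{value}"


print(field_term("event.recorded_by", "Crow"))
print(field_term("event.recorded_by", "Fred Crow"))
```

The second call produces the same quoted term as the example above, which is the form to paste into the search box.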

If more than one word is used in a search term and these are not quoted, then an implicit OR is used. Consider this search:

Smithills Estate

That will retrieve all records where either the word 'Smithills' or the word 'Estate' appears in a field.

To retrieve only those with the phrase 'Smithills Estate' in a field, use this search:

"Smithills Estate"

And to be absolutely sure that the search was restricted to the Location field, you could use this:

location.name:"Smithills Estate"

For information on using explicit OR and AND operators, see 'Combining search terms' below.
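
The difference between whole-word, implicit OR and quoted phrase matching described in this section can be modelled with a small Python sketch. It is a toy model only, not how ElasticSearch is implemented: the function names and the sample field values are invented for illustration, and treating text as lowercase words split on punctuation is an assumption of the model.

```python
import re


def tokens(text: str) -> list:
    """Split text into lowercase whole words (an assumption of this toy model)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_any_word(query: str, field_value: str) -> bool:
    """Unquoted search: true if ANY query word appears as a whole word."""
    words = set(tokens(field_value))
    return any(q in words for q in tokens(query))


def matches_phrase(phrase: str, field_value: str) -> bool:
    """Quoted search: true if the words appear together and in the same order."""
    want, have = tokens(phrase), tokens(field_value)
    return any(have[i:i + len(want)] == want
               for i in range(len(have) - len(want) + 1))


# Whole words only: 'Crow' matches 'Fred Crow' but not 'Crowmore'.
print(matches_any_word("Crow", "Fred Crow"))
print(matches_any_word("Crow", "Jo Crowmore"))
# Implicit OR: one matching word is enough.
print(matches_any_word("Smithills Estate", "Country Estate"))
# Quoted phrase: word order matters, so 'Crow, Fred' is not matched.
print(matches_phrase("Fred Crow", "Crow, Fred"))
```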

Searching dates

To search for records from 2020 you could enter this search term:

2020

That will indeed retrieve all records created in 2020, but it will also retrieve all records last updated in 2020, even where the actual record was made many years before. In fact it will retrieve records where '2020' appears in any field - date or otherwise. So it is good practice to qualify searches on dates with the name of a field, like this for example:

event.date_start:2020

That search will only retrieve records for the year 2020. To specify a particular month in a year, you must specify a field like this:

event.date_start:2019-05

That will retrieve all records for May 2019. (Note that the search term '2019-05' on its own - i.e. without the field name - will fail.) A full date is specified similarly:

event.date_start:2019-05-11

That will only retrieve records for 11th May 2019. The search terms above that specify only a year or a month within a year are implicit date ranges, since all records with a value for event.date_start that falls within that year (or month) will be retrieved. But date ranges can also be specified explicitly. For example:

event.date_start:[2018-12 TO 2019-02]

That search will retrieve any records made between December 2018 and February 2019 inclusive. Note the uppercase 'TO' - lowercase won't do. Consider another:

event.date_start:[2018-12-03 TO 2018-12-05]

That search will retrieve any records made between 3rd and 5th December 2018 inclusive.
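
To see which days a partial date covers, the implicit ranges described above can be worked out with a short Python sketch. The function name implied_range is hypothetical; it simply mirrors the behaviour described on this page and is not taken from the Indicia code.

```python
import calendar
from datetime import date


def implied_range(term: str) -> tuple:
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' to (first_day, last_day)."""
    parts = [int(p) for p in term.split("-")]
    if len(parts) == 1:
        # A year on its own covers 1st January to 31st December.
        year = parts[0]
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        # A year and month covers the first to the last day of that month.
        year, month = parts
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        # A full date covers just that one day.
        day = date(*parts)
        return day, day
    raise ValueError(f"unrecognised date term: {term!r}")


print(implied_range("2019-05"))
# An explicit range such as [2018-12 TO 2019-02] runs from the first day
# of the lower term to the last day of the upper term.
print(implied_range("2018-12")[0], implied_range("2019-02")[1])
```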

Remember that iRecord/Indicia actually stores both a start date and an end date for records. For records that specify a single day - the majority of records - the start date and end date are the same. Consider this search:

event.date_end:[2018-12-03 TO 2018-12-05]

Notice that it searches on the field event.date_end, unlike the previous search on event.date_start. The results are quite likely to be the same in this case: a search over such a short time range is likely to retrieve only records that specify a single day, and for these the start and end dates are the same.

But sometimes you might want to retrieve records that were entered with a date range. In those cases you will likely want to distinguish carefully between start and end date, and probably use a combined search term that specifies both, e.g.:

event.date_start:2018-01-01 AND event.date_end:2018-12-31

That search will retrieve all records for which the date was specified as the range 1st Jan 2018 to 31st December 2018 - in other words the year 2018. (For more on the logical AND operator see the next section.) Note the difference between this and a search like 'event.date_start:2018': whilst the latter will retrieve any record whose start date is somewhere in the year 2018, the search above will only retrieve records where the start date is 1st Jan 2018 and the end date 31st December 2018.
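
The difference between the two searches can be illustrated with a few made-up records in Python. The records and function names below are invented for this sketch; it models the behaviour described above rather than querying a real index.

```python
from datetime import date

# Hypothetical records: (label, date_start, date_end).
records = [
    ("single day", date(2018, 6, 14), date(2018, 6, 14)),
    ("whole year", date(2018, 1, 1), date(2018, 12, 31)),
    ("one month", date(2018, 5, 1), date(2018, 5, 31)),
]


def starts_in_year(record, year):
    """Models event.date_start:2018 (start date anywhere in the year)."""
    return record[1].year == year


def is_exact_range(record, start, end):
    """Models event.date_start:2018-01-01 AND event.date_end:2018-12-31."""
    return record[1] == start and record[2] == end


by_year = [r[0] for r in records if starts_in_year(r, 2018)]
exact = [r[0] for r in records
         if is_exact_range(r, date(2018, 1, 1), date(2018, 12, 31))]

print(by_year)  # all three records start in 2018
print(exact)    # only the record entered as the whole year
```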

You can use a wildcard in a date range. For example:

event.date_start:[* TO 2016]

That would retrieve all records up to and including 2016.
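
If you generate range searches from a script, a small Python helper along these lines keeps the brackets and the uppercase 'TO' consistent. The name date_range is hypothetical and the helper does no validation of the dates; it only assembles the text of the search term.

```python
from typing import Optional


def date_range(field: str, start: Optional[str] = None,
               end: Optional[str] = None) -> str:
    """Build an inclusive range term; a missing bound becomes the * wildcard."""
    lower = start if start is not None else "*"
    upper = end if end is not None else "*"
    # 'TO' must be uppercase for the range to be recognised.
    return f"{field}:[{lower} TO {upper}]"


print(date_range("event.date_start", "2018-12", "2019-02"))
print(date_range("event.date_start", end="2016"))
```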

Combining search terms

Search terms can be combined with the logical operators AND and OR (they must be specified in uppercase). For example:

event.recorded_by:"Fred Crow" AND event.date_start:2019

That search will retrieve all records made by Fred Crow in 2019 (where the name is specified as 'Fred Crow'). OR works like this:

event.recorded_by:("Fred Crow" OR "Crow, Fred")

That will retrieve records by Fred Crow whether the name is specified as 'Fred Crow' or 'Crow, Fred' in the ES index. You can combine ORs and ANDs:

event.recorded_by:("Fred Crow" OR "Crow, Fred") AND event.date_start:2019

That will retrieve all records made by Fred Crow in 2019, whether the name is specified as 'Fred Crow' or 'Crow, Fred'.
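
Combined searches like these can also be assembled in a script. The Python sketch below uses two hypothetical helpers, any_of and all_of, invented for this page. It assumes each term passed to all_of is either a single field term or a bracketed group, so no extra brackets are added.

```python
def any_of(field: str, values: list) -> str:
    """Group alternative phrases for one field with OR, inside brackets."""
    quoted = " OR ".join(f'"{v}"' for v in values)
    return f"{field}:({quoted})"


def all_of(*terms: str) -> str:
    """Join search terms with AND (operators must be uppercase)."""
    return " AND ".join(terms)


recorder = any_of("event.recorded_by", ["Fred Crow", "Crow, Fred"])
query = all_of(recorder, "event.date_start:2019")
print(query)
```

The printed text is the same combined search as the example above.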

Available search fields

A full list and description of the available search fields can be found here: https://github.com/Indicia-Team/support_files/blob/master/Elasticsearch/.... There are a large number of them, derived from all sorts of information about the record, including the survey it belongs to, the website or app it came from etc.

Below are just a few of them:

  • taxon.family
  • taxon.genus
  • taxon.species
  • taxon.accepted_name
  • taxon.vernacular_name
  • event.recorded_by
  • event.date_start
  • event.date_end
  • event.day_of_year
  • event.month
  • identification.identified_by
  • identification.verifier.name
  • location.name
  • location.output_sref (use this for grid reference)
  • identification.verification_status

Further information

If you are really curious about the ElasticSearch search syntax, you can have a look at this: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-ds.... Note, however, that this is a general ElasticSearch help page and is not geared to our iRecord/Indicia implementation. Even so, you might find some useful information there if you want to do a search that is beyond the scope of this help page.