Automated REtrieval from TExt [R package arete version 0.2]

arete: Automated REtrieval from TExt

A Python-based pipeline for extracting species occurrence data using large language models (LLMs). Includes validation tools designed to handle model hallucinations, supporting scientifically rigorous use of LLMs. Currently supports GPT models, with more planned, including local and non-proprietary models. For more details on the methodology, please consult the references listed under each function, such as Kent, A. et al. (1955) <doi:10.1002/asi.5090060209>, van Rijsbergen, C.J. (1979, ISBN:978-0408709293), Levenshtein, V.I. (1966) <https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf> and Krippendorff, K. (2011) <https://repository.upenn.edu/handle/20.500.14332/2089>.
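
The cited references correspond to standard evaluation measures: precision and recall (Kent et al.), the F-measure (van Rijsbergen) and edit distance (Levenshtein). The sketch below illustrates how such measures could be applied when comparing LLM-extracted records against a human-annotated reference. It is not the arete API: the function names `levenshtein` and `precision_recall_f1` and the example records are hypothetical, and the package's own implementation may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def precision_recall_f1(extracted: set, reference: set) -> tuple:
    """Precision, recall and F1 of extracted records against a reference set."""
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical (species, locality) records, for illustration only.
reference = {("Araneus diadematus", "Helsinki"),
             ("Argiope bruennichi", "Lisbon"),
             ("Lycosa tarantula", "Taranto")}
extracted = {("Araneus diadematus", "Helsinki"),
             ("Argiope bruennichi", "Porto"),
             ("Lycosa tarantula", "Taranto")}

precision, recall, f1 = precision_recall_f1(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Edit distance can flag near-matches, such as a misspelled locality.
print(levenshtein("Lisbon", "Lisboa"))
```

Exact set matching is strict: a record with one wrong field counts as both a false positive and a false negative. An edit-distance threshold is one way to tolerate minor spelling differences before scoring; whether arete does this is not stated here.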

Version:	0.2
Depends:	R (≥ 4.3.0)
Imports:	terra, cld2, stringr, reticulate, pdftools, fedmatch, kableExtra, dplyr, gecko, methods, ggplot2, jsonlite, googledrive, irr, rmarkdown
Suggests:	knitr
Published:	2026-05-11
DOI:	10.32614/CRAN.package.arete
Author:	Vasco V. Branco [cre, aut], Vaughn Shirey [ctb], Thomas Merrien [ctb], Pedro Cardoso [aut]
Maintainer:	Vasco V. Branco <vasco.branco at helsinki.fi>
License:	GPL-3
NeedsCompilation:	no
CRAN checks:	arete results

Documentation:

Reference manual:	arete.html, arete.pdf
Vignettes:	Package workflow (source, R code)

Downloads:

Package source:	arete_0.2.tar.gz
Windows binaries:	r-devel: arete_0.2.zip, r-release: arete_0.2.zip, r-oldrel: arete_0.2.zip
macOS binaries:	r-release (arm64): arete_0.2.tgz, r-oldrel (arm64): arete_0.2.tgz, r-release (x86_64): arete_0.2.tgz, r-oldrel (x86_64): arete_0.2.tgz
Old sources:	arete archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=arete to link to this page.