Open
Description
I ran into an undocumented behavior of the package that I wasn't expecting.
I understand why, from an API standpoint, it is important to avoid unnecessary duplicate queries and save resources.
However, TNRS doesn't document that identical input names are lumped together in the query, with their IDs merged into a single comma-separated value.
It would be nice to document this behavior to avoid bad surprises when using the ID column to join back to the original data after matching names.
reprex:
# Test the same name twice with different IDs
taxa_frame = data.frame(
ID = paste0("test-", 1:2),
name = c("Helianthus", "Helianthus")
)
matched = TNRS::TNRS(taxa_frame)
# IDs are mixed
matched[, 1:5]
#> ID Name_submitted Overall_score Name_matched_id Name_matched
#> 1 test-2,test-1 Helianthus 1 668749 Helianthus
# It's the same for sequential match
seq_match = TNRS::TNRS(taxa_frame$name)
seq_match[, 1:5]
#> ID Name_submitted Overall_score Name_matched_id Name_matched
#> 1 2,1 Helianthus 1 668749 Helianthus
Created on 2023-02-14 with reprex v2.0.2
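For anyone hitting the same thing, here is a rough workaround sketch in base R that splits the merged ID column back into one row per submitted ID, so that a join on ID works again. `expand_ids` is a hypothetical helper of mine, not part of TNRS. It assumes the merged IDs are separated by commas, as in the output above, and that the original IDs contain no commas themselves. The `matched` data frame is a hand-built mock holding a few of the columns shown above, so the snippet runs without calling the API.

```r
# Hand-built mock of the merged result shown above (subset of columns),
# so this sketch does not need to query the TNRS API.
matched <- data.frame(
  ID = "test-2,test-1",
  Name_submitted = "Helianthus",
  Name_matched = "Helianthus",
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a TNRS function): return one row per original ID.
# Assumes IDs are joined with `sep` and never contain `sep` themselves.
expand_ids <- function(df, id_col = "ID", sep = ",") {
  # Split each merged ID string into its individual IDs
  ids <- strsplit(as.character(df[[id_col]]), sep, fixed = TRUE)
  # Repeat each row once per ID it contains
  out <- df[rep(seq_len(nrow(df)), lengths(ids)), , drop = FALSE]
  # Overwrite the merged IDs with the individual ones
  out[[id_col]] <- trimws(unlist(ids))
  rownames(out) <- NULL
  out
}

expanded <- expand_ids(matched)

# The original input can now be joined back on ID
taxa_frame <- data.frame(
  ID = paste0("test-", 1:2),
  name = c("Helianthus", "Helianthus"),
  stringsAsFactors = FALSE
)
joined <- merge(taxa_frame, expanded, by = "ID")
```

With this, `joined` has one row per input ID. It does not help if an ID already contains a comma, which is one more reason the behavior would be worth documenting.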