Accuracy of Artificial Intelligence Platforms on Equine Topics
Category: Oral Presentation
Author(s): Sonya Aldworth-Yang
Presenter(s): Sonya Aldworth-Yang
Mentor(s): Devan Catalano
Artificial intelligence (AI) is becoming an increasingly popular resource for equine-related information. However, AI models draw on varied sources and do not always distinguish fact from opinion. This study evaluated the accuracy of AI-generated answers on equine topics from three AI platforms. We hypothesized that AI would answer basic questions well but would struggle with more complex topics. The AI platforms (P) evaluated were ChatGPT (CGPT), Microsoft Co-Pilot (MicCP), and Extension Bot (ExtBot). Researchers asked 40 questions covering horse care, facilities management, nutrition, genetics, and reproduction (topics, T) at four levels (L): beginner (beg.), intermediate (int.), advanced (adv.), and “hot topics” (HT). Answers were scored on accuracy, relevance, thoroughness, and source quality (10 pts each; 40 pts total). Data were analyzed using PROC GLM in SAS (v. 9.4). Both CGPT and MicCP answered 40/40 questions, while ExtBot answered 33/40. Total score was not affected by P (p=0.197) or T (p=0.536) but was affected by L (p=0.002), with beg. and int. questions scoring higher than adv. or HT. Accuracy varied by P (p<0.001), L (p<0.001), and T (p=0.015), with ExtBot scoring lower than CGPT and MicCP. Relevance was affected by P (p=0.042) and L (p<0.001), with CGPT providing more irrelevant detail. Thoroughness differed by P (p<0.001) and L (p=0.002), with CGPT ranking highest. Source quality was affected by P (p=0.037), with ExtBot using the best sources. Overall, AI struggled with complex equine topics, highlighting the continued value of Equine Extension specialists.