Data Validation in ChatGPT

Thoughts on data validation while using LLMs.

Another one of those Dan's New Leaf Posts, meant to inspire thought about IT Governance . . . .

Misinformation, hallucinations, inaccuracies, and contextual irrelevance have become pervasive issues in today’s digital landscape. We all like to blame social media, and don’t get me wrong: I do place most of the blame there. But overreliance on LLMs like ChatGPT, or more importantly, under-reliance on a process I call “humanization of results,” is a growing risk. (I have been running into cyber blog posts where the information is simply not accurate.)

A 2024 report by Statista highlights that a substantial portion of the global population encounters false information online, contributing to widespread public concern. “In the United States, 44% of news consumers express strong concern about fake news, underscoring the critical need for effective strategies to combat misinformation.”

Data validation serves as a crucial process in ensuring the accuracy, reliability, and integrity of information. By systematically verifying data against predefined standards, organizations can prevent errors and inconsistencies that may lead to misguided decisions. Implementing robust data validation techniques not only enhances data quality but also streamlines operations, ultimately supporting informed decision-making and maintaining trust in data-driven processes.
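As a small illustration of verifying data against predefined standards, here is a minimal sketch in Python. The record layout (a claim with "text", "sources", and "inferred" fields) and the function name validate_claim are hypothetical, chosen only for this example; they are not part of any tool described in this post.

```python
from urllib.parse import urlparse


def validate_claim(claim: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes.

    The field names ("text", "sources", "inferred") are assumptions
    made for this sketch, not an established schema.
    """
    problems = []

    # Rule 1: the claim itself must not be empty.
    if not str(claim.get("text", "")).strip():
        problems.append("missing claim text")

    # Rule 2: every listed source must look like a full http(s) URL.
    sources = claim.get("sources", [])
    for url in sources:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"malformed source URL: {url}")

    # Rule 3: a claim with no source must be explicitly marked as inferred.
    if not sources and not claim.get("inferred", False):
        problems.append("no source given and not marked as inferred")

    return problems
```

A check like this only confirms that a record has the right shape. It does not confirm that a source actually supports the claim, which is where a human reviewer still comes in.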

Misinformation Controls in ChatGPT

Given that I write a lot, I’m monitoring the risk of presenting information that is not properly vetted. We have plenty of controls in place in our second-set-of-eyes review process, but they are not going to catch hallucinations, inaccuracies, irrelevance, or misinformation.

To mitigate this risk, I have implemented a series of practices designed to limit misinformation and reduce the likelihood of LLM hallucinations. These practices focus on ensuring source quality, promoting transparency, encouraging cross-verification, and leveraging reliable tools for accuracy.

Misinformation Controls

I asked ChatGPT to add the following instructions to its memory:

Include links to sources that prove what you are saying in your responses, whenever possible.
When reformatting responses in plain text, please include the links at the end of the response, like a bibliography.
Prioritize FFIEC.gov, ISACA.org, .edu or .gov domains, then mainstream websites dedicated to the topic in question.
Prioritize cross-verification by relying on multiple corroborating sources for complex or disputed topics when possible. Include links to both sources.
When I ask you to format in plain text, Meta information about your response should be in all caps.
In the Meta information about your response, indicate when no reliable source exists, and specify if information is inferred or a best guess based on related facts.
Reference established databases such as JSTOR or PubMed for academic or medical information.
Encourage follow-up validation using tools like Snopes, FactCheck.org, or PolitiFact for contentious claims.
Use up-to-date sources to avoid reliance on outdated or obsolete information.
Indicate when your response is based on opinions that are controversial.
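The source-priority and cross-verification instructions above can also be spot-checked on the links ChatGPT returns. The following is a minimal sketch in Python, not something built into ChatGPT. The function names (source_tier, rank_sources, needs_more_corroboration) and the three-tier scheme are assumptions made for illustration, and "two distinct hosts" is just one possible reading of "multiple corroborating sources."

```python
from urllib.parse import urlparse

# Tiers follow the priority instruction above; lower number = preferred.
PREFERRED_HOSTS = ("ffiec.gov", "isaca.org")   # tier 0
PREFERRED_SUFFIXES = (".edu", ".gov")          # tier 1


def _host(url: str) -> str:
    # URLs without a scheme (e.g. "ffiec.gov/page") yield an empty host.
    return (urlparse(url).hostname or "").lower()


def source_tier(url: str) -> int:
    """Return 0 for FFIEC/ISACA, 1 for other .edu/.gov, 2 for everything else."""
    host = _host(url)
    if any(host == h or host.endswith("." + h) for h in PREFERRED_HOSTS):
        return 0
    if host.endswith(PREFERRED_SUFFIXES):
        return 1
    return 2


def rank_sources(urls: list[str]) -> list[str]:
    """Sort links so preferred domains come first (stable within a tier)."""
    return sorted(urls, key=source_tier)


def needs_more_corroboration(urls: list[str], minimum: int = 2) -> bool:
    """True when the links come from fewer than `minimum` distinct hosts."""
    hosts = {_host(u) for u in urls}
    hosts.discard("")
    return len(hosts) < minimum
```

For example, passing the links from a response's bibliography to needs_more_corroboration flags answers that lean on a single website, which is a cue to go verify the claim by hand.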

Summary

These controls work together to promote accuracy and reduce errors in information. By ensuring transparency when sources are unavailable, encouraging cross-verification, and leveraging established databases and fact-checking tools, this approach enhances reliability. It focuses on both the quality of sources and the integrity of the information, building trust in the responses provided.

We’d love to hear what other tactics you may have come across. Please feel free to comment.

Original article by Dan Hadaway CRISC CISA CISM. Founder and Information Architect, infotex

“Dan’s New Leaf” – a fun blog to inspire thought in IT Governance.

To see more content like this in your inbox, sign up for our newsletter here!

General: (800) 466-9939
