Learning Systems' rollout of AI, a learning lesson from Workday (what not to do); plus why beta testing isn't a panacea
I rarely talk about or dive into HCMs or any HR solution. It's just not my scene. Nevertheless, several readers of the blog contacted me about AI and a system they use that is rolling out a few AI features in two weeks. The vendor? Workday.
Workday is rolling out AI in two weeks (this would be around 9-23). A few readers said two weeks; others said within a week and a half. Anyway, the rollout covers Search for tasks and reports, Job Description generation, and, separately, the Workday Assistant (it exists already, but the new update will add AI to it).
For starters, they have yet to receive any information from Workday about the AI implications – nothing on hallucinations, potential issues, what to look for, how they can see the outcomes from their end users. Nothing at all.
If they weren't readers of this blog – and thus hadn't seen the extensive coverage of AI and gen AI provided here – they would have no idea about any of it.
To me, Workday landed a colossal fail.
It would be easy to send all the leaders who oversee the HCM information about hallucinations (fake and false information), how to offset the potential issues if an end user receives such information, a reminder that the output is not 100% accurate, and so on.
Workday plans further rollouts of AI (generative AI) to its customer base, allowing customers to opt in or out. I can't wait to see how well that works, knowing Workday's typical MO for sending updates to its customers.
Workday likely tested Gen AI with a group of customers.
Now, I am unsure of that—i.e., there is no verification—but this is a common MO for any vendor, regardless of system, target markets, or anything else, when they are going to launch something to the greater audience.
Switch to Learning Systems, Learning Tech, Content Creator Tools, et al. & Beta Testers
The vendor may refer to it as a beta test (a common term). The bigger piece here is the selected group of customers.
For the most part, vendors in the learning system space, learning tech, and content creator tools (AKA authoring tools) do the same thing. With learning system vendors, it is a given (albeit a few just roll it out).
Their methodology for selecting customers for this select group is all over the place, but it is always only a subset of customers.
I like to think of it as the early days of kickball in elementary school when there were two teams, and two people got to be the heads of the teams (I did once—I felt total power!).
These two people then pick the players for their team. You never know why they chose you, but if you go early, you are in the “group of talent.” The downer is that there are always two people who are picked last.
I surmise it isn’t a good feeling – who wants to be picked last?
No, there isn't a pick-last angle here. Instead, the team leaders (to use the kickball analogy) decide who joins the team to test the feature, the UI/UX (yep, that is common), and so forth.
Based on my research, whenever a vendor I have spoken with is rolling something out or planning to, I always ask about their methodology and why they picked those customers over others.
The Common Ones
- The customers that use the system the most
- A select group of customers that are the biggest clients (i.e. by users)
- Well-known customers who use the system (what – Mike's Beef and Cheese isn't in there?)
The Less Common
- A selection of customers from large enterprise/enterprise, a group from mid-market, and a group from small business
- Customers we have selected in the past who provided positive feedback
Take a look at the common ones. One vendor told me they go with the biggest clients they have – the ones in the F500 – by size (one of the worst ideas ever). Number of users and usage shouldn't be the methodology here.
First, it is difficult to ascertain how many people are beta-testing the system. Is it 20? 10? 50? 200?
Next, if you have 75 F500 clients but only 10 are using the system heavily, and of those 10, nine have over 100,000 users – do they get to be in your select group or not?
Finally, the biggest miss, and the one I would zero in on, is the customers who are not using the system – or have the lowest usage rate.
I would include those customers who have contacted support the most.
Why the relevance of the support thing?
These are the customers who are likely to bolt at the end of your contract—and who knows, they could be Big Name, with 200,000 end users.
For me, that’s more relevant than Sammy, over at WidgetLand, who has 50,000 users and is on the F50 side.
What is the Beta Group?
It's part focus group, part people playing around with the system – people who may not be experts in the system to begin with (those normally testing are department heads, such as the head of L&D, a CLO, or the head of training), nor may they be wizards in tech.
Sure, they know how to do a bunch of things in the system, but all of it?
Even if it is the admin, you cannot assume that said admin knows the entire system inside and out – even before you roll in Gen AI.
With Gen AI going into these systems, the idea of a beta group testing it out and providing feedback doesn't hold up as well as it sounds.
You are dealing with folks (as one would expect) who overwhelmingly do not know the pluses and minuses of Gen AI. Hallucinations? What’s that?
If I put in only my own content, the AI will always be accurate, right? (Wrong.) If the vendor lacks a feedback loop, how many beta testers will ask about it? How many even know what one is?
Are they aware of how AI learns? Do they know anything about the LLM(s) you are using, and specifically the strengths and weaknesses of the LLM (because all of them have strengths and weaknesses – even if it is looking only at your own content)?
Do they understand how token fees work? Sure, the cost is minuscule these days, but I want to know if I get charged for whatever touches the Gen AI.
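For anyone fuzzy on the math, here is a rough, back-of-the-envelope sketch in Python. The per-token prices and usage numbers are made up for illustration – they are not any vendor's or provider's actual rates – but the pattern (you pay for what goes in and what comes out, per call) is how most gen AI billing works.
```python
# Rough, hypothetical token-cost math – the per-1,000-token prices below are
# made up for illustration, not any provider's actual rates.
# Most gen AI APIs bill separately for input (prompt) and output (completion) tokens.

PRICE_PER_1K_INPUT = 0.0005   # assumed $ per 1,000 input tokens
PRICE_PER_1K_OUTPUT = 0.0015  # assumed $ per 1,000 output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single gen AI call."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# One learner question: ~300 tokens of prompt/context in, a ~500-token answer out.
per_query = estimate_cost(300, 500)
print(f"One query: ${per_query:.4f}")   # about $0.0009 – a fraction of a cent

# The "minuscule" cost adds up: 50,000 end users asking 20 questions a month.
monthly = per_query * 50_000 * 20
print(f"50,000 users x 20 queries/month: ${monthly:,.2f}")   # about $900.00
```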
Content creator tools are proof that whatever group got to see them in the beta test was unaware of feedback loops. How do I know this?
Because every vendor with an AI content creator that I have seen (and I have seen a lot of systems that have one) lacks a feedback loop. It's bonkers.
Sure, you can edit the content and click save, but that isn't a feedback loop, and the AI has no idea that what it pushed out was wrong (if that did occur). It just keeps learning from itself.
The same thing applies when the learner can ask questions of the Gen AI offering in the system.
Of the systems I have seen that have this capability, a whopping two have a feedback loop. One of them only offers it when the learner enters one of its sims.
A couple of vendors track this data behind the scenes – i.e., what was wrong, and how many times people selected thumbs up or thumbs down (assuming the system has that, and again, it is very rare) – and count it. Big deal.
Do they do anything else? Nope. Then what’s the importance of that?
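To make the point concrete, here is a minimal sketch of what a real feedback loop could capture – entirely hypothetical, not any vendor's actual feature or API. The difference from merely counting thumbs is that the wrong answer, the correction, and the specific generated response are tied together and land in a queue somebody (and ideally the model pipeline) actually acts on.
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class FeedbackEvent:
    """One piece of learner/admin feedback tied to a specific AI response."""
    response_id: str          # which generated answer this refers to
    rating: str               # "up" or "down"
    correction: str = ""      # what the right answer should have been
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class FeedbackLoop:
    """Collects feedback and surfaces the negative items for review.

    Counting thumbs up/down (what the rare vendor does today) stops at report();
    a real loop also routes review_queue() back into prompts, source content,
    or model tuning."""

    def __init__(self):
        self.events: list[FeedbackEvent] = []

    def record(self, event: FeedbackEvent) -> None:
        self.events.append(event)

    def report(self) -> dict:
        ups = sum(1 for e in self.events if e.rating == "up")
        downs = sum(1 for e in self.events if e.rating == "down")
        return {"thumbs_up": ups, "thumbs_down": downs}

    def review_queue(self) -> list[FeedbackEvent]:
        # The part that usually goes missing: wrong answers, with corrections,
        # queued for a human to act on.
        return [e for e in self.events if e.rating == "down" and e.correction]

loop = FeedbackLoop()
loop.record(FeedbackEvent("resp-123", "down",
                          correction="Course completions sync nightly, not hourly."))
print(loop.report())         # {'thumbs_up': 0, 'thumbs_down': 1}
print(loop.review_queue())   # the items a human should actually look at
```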
The people overseeing your learning system, learning tech, content creator, e-learning tool
How many of them are going to know everything they need to know about Gen AI, and how many will send out information and train their end users on it?
If they don’t know, how will the others know?
Even if they have some knowledge, as I have said before, they need to keep up, because the LLM the vendor is using is at some point going to get an update, and then the change occurs within the system itself – at some level.
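As a purely hypothetical illustration (not Workday's or any vendor's actual setup), this is why it matters whether a vendor pins a dated model version or rides a floating "latest" alias – with the latter, the behavior under the hood can shift without anyone on the customer side touching a thing.
```python
# Hypothetical vendor-side config, for illustration only – the model names are made up.

llm_config_floating = {
    "model": "provider-model-latest",      # silently changes when the provider updates
    "temperature": 0.2,
}

llm_config_pinned = {
    "model": "provider-model-2024-06-01",  # behavior shifts only when this string is changed
    "temperature": 0.2,
}

# With the floating alias, the admin overseeing the learning system may wake up
# to different answers one morning and have no idea why – which is exactly the
# "keep up" problem described above.
```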
The feedback I heard from the readers using Workday and learning about the Gen AI rollout included questions such as: will they get to access the AI to test it out, and why didn't Workday explain more about Gen AI and the potential for fake and false information when you conduct a search for tasks/reports or use Workday Assistant?
Needless to say, people will find out the hard way about AI the moment they ask Workday Assistant a question and it pushes out information that is wrong, and that person doesn't know it's wrong. In that case – that is a big, big, Godzilla problem.
And who exactly will find out this is happening? Even if Workday is watching from behind the scenes, that isn't going to help the person who just received wrong or inaccurate responses mixed in with some correct information.
They will go back and try to do it, find out it doesn't work, try it again, find it still doesn't work, then get irritated and be left talking to Workday Assistant – which is where they were asking those questions in the first place.
Anyone who has used a chatbot knows it is awful. I mean, is it as awful as walking into a hotel room that is freezing (when it is warm outside) and you can't change the dial because the hotel sets it? YES.
Is it as awful as grabbing the wrong shoes after you go through the airport security line? Well, it is if you are the one who did it – walking back will make you feel awful. (Side note – I was at an airport where that actually happened, and everyone was watching to see if the person came back.)
Is it as awful as you shanking your golf ball into the woods, where crocodiles are waiting to eat you? Yes.
Because the scenario of receiving such erroneous information without knowing it, and assuming it is right, will never produce the right outcome. And in this case, it could have real ramifications for the rest of the company that relies on that HCM (including the HRIS and other ilk).
I want to make it clear that this AI rollout from Workday is across the entire system, and not just learning.
Bottom Line
When I hear of beta testers, I immediately wonder: are there testers whose responses weigh more heavily than others'? My gut says yes.
The others are added for their feedback too, but if the KingKONG client, who is generating a lot of revenue, says this needs to be changed, and the vendor doesn't see the request as unreasonable – or it confirms something they were already debating – then that one client changes everything.
I hear about it with roadmaps, so why wouldn’t it occur with beta testers?
I've been a beta tester before. It is interesting to take a look and provide feedback, but it comes back to how relevant that input is compared to the group as a whole – and to, say, a favorite client or two.
I always find it interesting that vendors rely so heavily on beta testers when it comes to, say, UI/UX. My assumption is that the people who designed it were hired for the role because they had the knowledge and experience.
Then my assumption is that someone – or multiple people, including the CPO (Chief Product Officer) – reviewed it, provided feedback (or not), and at some point green-lighted it.
I would then assume that these are the experts. Otherwise, what is the point in hiring them?
When you can have your biggest and best customer give you the feedback and knowledge instead.
E-Learning 24/7