VC firms are pioneering a new investment strategy: acquiring established businesses and optimizing them with AI to boost efficiency and customer reach.
This is because autoregressive LLMs work on high-level “tokens”, not letters. There are experimental LLMs that read raw bytes instead, and those can answer such questions correctly.
Also, they don’t want to support you, omegalul. Do you really think call centers are hired to give a fuck about you? This is intentional.
I don’t think that’s the full explanation though, because there are examples of models that will correctly spell out the word first (i.e., the model knows the component letter tokens) and still miscount the letters after doing so.
No, this literally is the explanation. The model understands the concept of “Strawberry”. It can output that concept (and that itself is very complicated) in English as Strawberry, in Persian as توت فرنگی, and so on.
But the model does not understand how many Rs exist in Strawberry or how many ت exist in توت فرنگی.
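For contrast, here is a minimal sketch of what a byte-level view of the same two strings looks like. It is not a model, just the raw input such an experiment would read; the helper name `count_letter_bytes` is made up for illustration.

```python
def count_letter_bytes(text: str, letter: str) -> int:
    """Count occurrences of a letter by scanning the raw UTF-8 bytes.

    UTF-8 is self-synchronizing, so the byte pattern of one character
    can only match at a real character boundary.
    """
    return text.encode("utf-8").count(letter.encode("utf-8"))


english = "strawberry"
persian = "توت فرنگی"

# At the byte level the letters are plainly there to be counted.
r_count = count_letter_bytes(english, "r")
t_count = count_letter_bytes(persian, "ت")
print(list(english.encode("utf-8")))
print(r_count, t_count)
```

A tokenized model never sees that byte list; it sees a handful of token IDs, which is why the count is not directly readable from its input.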
I’m talking about models printing out the component letters first not just printing out the full word. As in “S - T - R - A - W - B - E - R - R - Y” then getting the answer wrong. You’re absolutely right that it reads in words at a time encoded to vectors, but if it’s holding a relationship from that coding to the component spelling, which it seems it must be given it is outputting the letters individually, then something else is wrong. I’m not saying all models fail this way, and I’m sure many fail in exactly the way you describe, but I have seen this failure mode (which is what I was trying to describe) and in that case an alternate explanation would be necessary.
The model ISN’T outputting the letters individually. Byte-level models (as I mentioned) do; standard tokenized transformers don’t.
The model output is more like:
Strawberry
<S-T-R><A-W-B>
<S-T-R-A-W-B><E-R-R>
<S-T-R-A-W-B-E-R-R-Y>
Tokens can be a letter, part of a word, any single lexeme, any word, or even multiple words (“let be”)
Okay, I did a shit job demonstrating the time axis. The model doesn’t know the underlying letters of the previous tokens, and this process moves forward in time, one token at a time.
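A rough sketch of the time axis, assuming a toy greedy longest-match tokenizer. The vocabulary, the IDs, and both function names are invented for illustration; real tokenizers split text differently.

```python
# Invented toy vocabulary: single letters, word pieces, and a multi-word token.
TOY_VOCAB = {"s": 1, "t": 2, "r": 3, " ": 4, "straw": 10, "berry": 11, "let be": 20}


def toy_tokenize(text: str) -> list[int]:
    """Split text into toy token IDs, always taking the longest matching piece."""
    ids = []
    pos = 0
    while pos < len(text):
        candidates = [piece for piece in TOY_VOCAB if text.startswith(piece, pos)]
        if not candidates:
            raise ValueError(f"no toy token covers {text[pos:]!r}")
        piece = max(candidates, key=len)
        ids.append(TOY_VOCAB[piece])
        pos += len(piece)
    return ids


def contexts_over_time(ids: list[int]) -> list[list[int]]:
    """Return the ID prefix the model can see at each generation step."""
    return [ids[: step + 1] for step in range(len(ids))]


word_ids = toy_tokenize("strawberry")   # two opaque IDs, no letters in sight
phrase_ids = toy_tokenize("let be")     # several words collapsed into one ID
spelled_ids = toy_tokenize("str")       # single-letter tokens
steps = contexts_over_time(word_ids)
print(word_ids, phrase_ids, spelled_ids, steps)
```

At every step the context is a list of integers. Nothing in `[10, 11]` says how many Rs are inside, which is the point being made above.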
For usage like that you’d wire an LLM into a tool-use workflow with whatever accounting software you have. The LLM would make queries to the rigid, non-hallucinating accounting system.
I still don’t think it would be anywhere close to a good idea, because you’d need a lot of safeguards. Also, fuck up your accounting and you’ll have some unpleasant meetings with the local equivalent of the IRS.
“The LLM would make queries to the rigid, non-hallucinating accounting system.”
And then it sometimes adds a hallucination before returning an answer, particularly when it encounters anything it wasn’t trained on, like the important moments when business leaders should be taking a closer look.
There’s not enough popcorn in the world for the shitshow that is coming.
You’re misunderstanding tool use. The LLM only requests that something be done, then the actual system does it and returns the result. You can also have the LLM summarize the result, and hallucinations in that workload are remarkably low (though without tuning it can drop important information from the response).
The place where it can hallucinate is generating steps for your natural language query, or the entry stage. That’s why you need to safeguard like your ass depends on it. (Which it does, if your boss is stupid enough)
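A minimal sketch of that kind of safeguard, under loud assumptions: `LEDGER`, `get_balance`, `ALLOWED_TOOLS`, and the JSON call shape are all hypothetical stand-ins, not any real accounting system or LLM API. The model only proposes a call as text; the call is validated against an allow-list, and the figure comes from the deterministic system.

```python
import json

# Hypothetical stand-in for the accounting system; accounts and figures are made up.
LEDGER = {"accounts_receivable": 1250.00, "accounts_payable": 830.50}


def get_balance(account: str) -> float:
    """Deterministic lookup: the number comes from the ledger, never from the model."""
    return LEDGER[account]


# Allow-list of read-only tools the model is permitted to request.
ALLOWED_TOOLS = {"get_balance": get_balance}


def run_tool_call(raw_call: str) -> float:
    """Validate a model-proposed call (JSON text) before executing anything."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError as exc:
        raise ValueError("tool call is not valid JSON") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    tool = call.get("tool")
    args = call.get("args")
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        raise ValueError(f"tool {tool!r} is not allowed")
    if not isinstance(args, dict) or set(args) != {"account"}:
        raise ValueError("expected exactly one argument: account")
    account = args["account"]
    if not isinstance(account, str) or account not in LEDGER:
        raise ValueError(f"unknown account {account!r}")
    return ALLOWED_TOOLS[tool](account)


# Stand-in for what a model might emit at the entry stage.
proposed = '{"tool": "get_balance", "args": {"account": "accounts_payable"}}'
result = run_tool_call(proposed)
print(result)
```

Rejecting anything off the allow-list is the design choice here: a hallucinated tool name or account fails loudly at the entry stage instead of quietly reaching the books.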
lol accounting….
How easy will it be to fool the AI into getting the company in legal trouble? Oh well.
Some would call it effortless, even.
But ERP is not a cool buzzword, hence it can fuck off; we’re in 2025.
Hey boss. Think they’re using chatgpt for that?