Google AI Errors

On a Bayes probability question, Google/Gemini spat out this whopper:

P(double-headed | 20 heads) =

= 1 / (1 + (9999 x 0.0000009537) )
= 1 / (1 + 0.009536)
=~0.5061

In other words, it asserted that 1 / (1+0.009536) was approximately equal to 0.50 (i.e. 1/1.0095 =~ 0.50), when in fact the correct answer is closer to 0.99.
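The arithmetic is easy to check. Here is a minimal Python sketch that recomputes the last two steps from the numbers in the calculation above; the variable names are mine, not Gemini's.

```python
# Recompute the last two steps of the posterior calculation.
p_fair_20_heads = 0.5 ** 20          # P(20 heads | fair), about 9.537e-7
odds_term = 9999 * p_fair_20_heads   # the term added to 1 in the denominator
posterior = 1 / (1 + odds_term)      # P(double-headed | 20 heads)

print(f"{odds_term:.6f}")   # 0.009536
print(f"{posterior:.4f}")   # 0.9906
```

The intermediate value 0.009536 matches Gemini's own line; only the final division goes wrong.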

Original as-is copy from Gemini:

Observing 20 heads in a row is strong evidence that significantly shifts the probability compared to the initial 1-in-10000 chance, making it slightly more likely than not that you picked the double-headed coin.

Bayes’ theorem calculation
The probability is calculated using Bayes’ theorem:

Prior probabilities:
- P(double-headed) = 1/10000
- P(fair) = 9999/10000
Likelihoods:
- P(20 heads | double-headed) = 1
- P(20 heads | fair) = (1/2)^20 ≈ 9.537×10^-7
Posterior probability: The updated probability is calculated as:
- P(double-headed | 20 heads) = (P(20 heads | double-headed) × P(double-headed)) / P(20 heads)
- P(double-headed | 20 heads) = (1 × (1/10000)) / ((1 × 1/10000) + ((1/2)^20 × 9999/10000))
- P(double-headed | 20 heads) = 1 / (1 + (9999 × (1/2)^20)) ≈ 1 / (1 + (9999 × 0.0000009537))
- P(double-headed | 20 heads) ≈ 1 / (1 + 0.009536) ≈ 0.5061
– Gemini

This gives a final probability of approximately 50.6%.

When I asked it about this in the same session, it explained that the _previous_ problem's answer had been 50.6%, and that it had updated the formula for the new problem but carried over the previous answer as-is.

Final quote:

I mistakenly used the parameters from the previous problem iteration (10 flips with 1000 coins) while using the number of coins and flips from the current problem (20 flips with 10000 coins) in the denominator calculation.
My original reasoning failed spectacularly at the next line of arithmetic. I performed the division 1/(1+0.009536) incorrectly in my head or by hallucinating a similar previous result, arriving at 0.50610 instead of the correct 0.990558.
I apologize for that substantial arithmetic error. Thank you again for providing the correct calculation.
– Gemini
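Gemini's explanation can be checked too. The sketch below assumes the previous problem had the same shape (exactly one double-headed coin among otherwise fair coins), which is my reading of "10 flips with 1000 coins"; the function name is hypothetical. It uses exact fractions so no rounding creeps in.

```python
from fractions import Fraction

def posterior_double_headed(n_coins: int, n_flips: int) -> Fraction:
    """P(double-headed | n_flips heads in a row).

    Assumption: exactly one double-headed coin among n_coins,
    all the others fair.
    """
    prior = Fraction(1, n_coins)
    like_fair = Fraction(1, 2) ** n_flips           # P(all heads | fair)
    evidence = prior * 1 + (1 - prior) * like_fair  # P(all heads)
    return prior / evidence

previous = posterior_double_headed(1000, 10)    # 1024/2023, about 0.506
current = posterior_double_headed(10000, 20)    # about 0.9906

print(previous, float(previous))
print(current, float(current))
```

Under that assumption the earlier problem does come out at roughly 0.506, which is consistent with the stale answer having been carried over.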

As an aside, my session with ChatGPT started out rough – it was told to use a library, and the first call threw a ValueError Invalid Key found in step_kwargs. Its answer:

Thanks — that error is caused by how PyMC tried to auto-select samplers (NUTS) even though the model contains a discrete latent (the Bernoulli is_double). NUTS can’t handle discrete variables, and passing NUTS-related kwargs caused the Invalid key 'nuts' complaint.
You must remove target_accept=0.95, because PyMC tries to run NUTS, but NUTS cannot be used for discrete variables.
– ChatGPT

Then there was the call to pz.Bernouli().plot() (which doesn’t exist); it fell back to matplotlib code that was 15 lines longer.

Overall, exactly what I’ve come to expect from AI: it works, as long as you already know the answer.
