There's no way for a single user to do it by herself in that case. The best approach is to collect aggregate data. Since unopened loot boxes are fungible (i.e. all loot boxes are functionally equivalent until opened), it doesn't matter if one person opens a thousand loot boxes or a thousand people open one loot box apiece. Thanks to the combined powers of statistics and math, we can calculate the sample size we would need to estimate a drop rate at a given confidence level and margin of error.
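For the curious, here is a minimal sketch of that sample-size calculation. It assumes the standard normal-approximation formula for a proportion, n = z^2 * p * (1 - p) / e^2, with an effectively unlimited pool of loot boxes and a worst-case drop rate of p = 0.5. The function name `required_sample_size` is my own for illustration, not something from any particular stats package.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence: float, margin: float, p: float = 0.5) -> int:
    """Draws needed to estimate a drop rate to within +/- margin.

    Assumes the normal approximation and an effectively unlimited
    population. p = 0.5 is the worst case (largest required sample).
    """
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


print(required_sample_size(0.95, 0.05))  # 385
print(required_sample_size(0.99, 0.05))  # 664
print(required_sample_size(0.95, 0.01))  # 9604
```

Note how quickly the requirement grows: shrinking the margin from 5 points to 1 point multiplies the sample size by 25, because the margin is squared in the denominator.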
Mathematically, I believe we need a sample of roughly 385 draws for a 95% confidence level with a margin of error of +/- 5 percentage points, assuming the worst case of a drop rate near 50%. The more precise we want to be, the larger the necessary sample becomes (e.g. a 99% confidence level with the same +/- 5 point margin requires around 664 draws, and tightening the margin to +/- 1 point at 95% confidence requires roughly 9,600). If players tracked and aggregated all of their combined results from the same loot boxes, they could run statistical analysis on them and see whether the results lined up with the posted odds.
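As a rough illustration of what that analysis could look like, the sketch below builds a Wilson score confidence interval around the observed drop rate and checks whether the posted rate falls inside it. This is one reasonable method among several, not the only way to do it. The function names and the example numbers (12 or 40 legendary drops in 1,000 pooled draws against a posted 1% rate) are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(hits: int, draws: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for the true drop rate, given pooled results."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = hits / draws  # observed drop rate
    denom = 1 + z * z / draws
    center = (p_hat + z * z / (2 * draws)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / draws + z * z / (4 * draws * draws)
    ) / denom
    return center - half_width, center + half_width


def consistent_with_posted(hits: int, draws: int, posted_rate: float,
                           confidence: float = 0.95) -> bool:
    """True if the posted rate lies inside the interval around the observed rate."""
    low, high = wilson_interval(hits, draws, confidence)
    return low <= posted_rate <= high


# Hypothetical pooled data: 1,000 draws against a posted 1% legendary rate.
print(consistent_with_posted(12, 1000, 0.01))  # True: 1.2% observed is plausible
print(consistent_with_posted(40, 1000, 0.01))  # False: 4% observed is not
```

A result of `False` doesn't prove the posted odds are wrong, only that the pooled data would be unusual if they were right. More draws narrow the interval and make the check more sensitive.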