It’s a combination of two primary reasons. The first is that there’s no inherent physicality or collision in video games. Animating physical action without being able to touch anything is like miming - and believable miming is really quite difficult. The second is that we humans are intimately familiar with physical touch. We notice visual oddities, even if only subconsciously, which sets the bar of acceptability incredibly high.
When we render 3D models, there’s nothing that can physically stop them from interpenetrating each other. The only way to prevent it is for the program to do the math to figure out whether the polygons actually penetrate at that moment, and too many of those calculations can get really expensive. For animations of two or more characters physically touching for longer than a single frame, we cannot depend on procedural collision detection. The animators have to make the characters look right by hand, much like making shadow puppets “physically” interact with each other with no real physicality. Try making two shadow puppets touch each other believably (e.g. have one punch another) and you’ll see what I mean.
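To give a rough feel for why that math adds up, here is a minimal sketch in Python. It is not how any particular engine does it: the function names are made up for illustration, each mesh is assumed to be a plain list of triangles (three `(x, y, z)` points each), and the check only compares bounding boxes, which is a cheap first pass rather than an exact triangle intersection test. The point is the counting: checking every triangle of one model against every triangle of another means the number of tests is the product of the two triangle counts.

```python
from itertools import product


def aabb(tri):
    """Axis-aligned bounding box of one triangle given as three (x, y, z) points."""
    xs, ys, zs = zip(*tri)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def boxes_overlap(box_a, box_b):
    """True if two bounding boxes overlap (or touch) on all three axes."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))


def count_pair_tests(mesh_a, mesh_b):
    """Brute-force check of every triangle in mesh_a against every one in mesh_b.

    Returns (tests, hits): how many box comparisons were made and how many
    pairs overlapped and would still need an exact intersection test.
    """
    boxes_a = [aabb(tri) for tri in mesh_a]
    boxes_b = [aabb(tri) for tri in mesh_b]
    tests = 0
    hits = 0
    for box_a, box_b in product(boxes_a, boxes_b):
        tests += 1
        if boxes_overlap(box_a, box_b):
            hits += 1
    return tests, hits
```

With this naive approach, two characters of 10,000 triangles each (an illustrative number, not a measurement) would need 100,000,000 box comparisons for a single frame, before any exact intersection math on the pairs that overlap. Spatial partitioning can cut that down a lot, but it still has to be paid for every frame the characters are in contact.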
The other issue is that humans in general are extremely familiar with what touching things and people looks like. We’ve seen it constantly for our entire lives. We’ve seen our own bodies touching things and we’ve seen other people touching things. We know how clothing is supposed to hang on people’s bodies. We understand how skin deforms when pressed. We know which joints can bend in which direction. Anything that doesn’t obey all of these little rules at all times will trigger alarm bells in the viewer’s subconscious. You might not be able to explain exactly why, but the motions will seem awkward, stilted, and just… off somehow. This is a variation on the uncanny valley effect that we see quite often with CG in movies and television. We’re so used to seeing natural motion that small errors become magnified. Even if the image looks OK when frozen, it’s the motion that gives it away.
Motion capture sidesteps both of these problems to some extent. The performers really do touch each other and their surroundings during the capture, so the physicality is built into the recorded data, and the capture also picks up all of the motion involved in the action, including the small secondary and tertiary movements. It isn’t a perfect fix - as the various off-putting CG built from motion capture can attest - but it can get us closer to our goal.