This is due to the level of probable term sequences boosts, plus the designs that notify benefits turn into weaker. By weighting terms within a nonlinear, distributed way, this model can "study" to approximate words and phrases rather than be misled by any mysterious values. Its "understanding" of a provided phrase just isn't as tightly tethered on