One of the key requirements for the semantic annotation service was that it be client driven and thus self-learning. We did not want it to be fundamentally rules-based, which would require indefinite ongoing maintenance of knowledge rules to keep F1 scores high (> 90%) within the ever-changing context of news. To meet this requirement, apart from simple JAPE grammars that match dictionary terms in the text, the key entity disambiguation and text analysis processes in the semantic annotation pipeline are based on (1) ontological proximity and (2) statistical models.
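
To make the two ideas concrete, here is a minimal sketch of how ontological proximity and a statistical score might be combined to choose between candidate entities for an ambiguous mention. It is an illustration under stated assumptions, not the service's actual implementation: the toy ontology, the `PRIOR` table (standing in for a trained statistical model), the equal weighting, and all function names are hypothetical.

```python
from collections import deque

# Hypothetical toy ontology: undirected links between concept identifiers.
ONTOLOGY = {
    "Paris_France": {"France"},
    "France": {"Paris_France", "Europe"},
    "Europe": {"France"},
    "Paris_Texas": {"Texas"},
    "Texas": {"Paris_Texas", "USA"},
    "USA": {"Texas"},
}

# Hypothetical prior probabilities, standing in for a learned statistical model.
PRIOR = {"Paris_France": 0.6, "Paris_Texas": 0.4}


def graph_distance(start, goal):
    """Return the shortest path length between two concepts, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for neighbour in ONTOLOGY.get(node, ()):
            if neighbour == goal:
                return depth + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return None


def proximity(candidate, context):
    """Mean of 1 / (1 + distance) over context entities; unreachable ones add 0."""
    if not context:
        return 0.0
    total = 0.0
    for entity in context:
        distance = graph_distance(candidate, entity)
        if distance is not None:
            total += 1.0 / (1 + distance)
    return total / len(context)


def disambiguate(candidates, context):
    """Pick the candidate with the best equally weighted prior + proximity score."""
    def score(candidate):
        return 0.5 * PRIOR.get(candidate, 0.0) + 0.5 * proximity(candidate, context)
    return max(candidates, key=score)


if __name__ == "__main__":
    mention_candidates = ["Paris_France", "Paris_Texas"]
    print(disambiguate(mention_candidates, ["Texas"]))
    print(disambiguate(mention_candidates, ["France"]))
```

In this sketch the prior alone favours the more common sense of a mention, while entities already identified elsewhere in the article can pull the decision toward a candidate that sits closer to them in the ontology. Neither component relies on hand-written disambiguation rules, which is the property the requirement above is after.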