Until recently, checking covered only transactions and reference data. With document scanning, it now extends to documents such as business reports, which SMARAGD TCM searches for conspicuous content, e.g. names from embargo lists. This increases the depth of checking. The file types that can be scanned are Word (doc and docx), unencrypted PDF, Excel and TXT. They are transferred into the system via an adapter.
Technically, document scanning is demanding and relies on intelligent algorithms. To condense the vast amount of text to the relevant passages, each document is divided logically and reduced to the content that matters for checking, e.g. names, companies, locations and banks. Fillers, verbs, adjectives and redundancies are not considered, because they do not contribute to recognizing conspicuous content. Automatic language detection determines at the outset whether the text is predominantly German, English or French. The subsequent check then applies the linguistic checking mechanisms defined in the algorithm.
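The idea of detecting the predominant language, discarding filler words and checking only the remaining terms can be illustrated with a minimal Python sketch. It is not the SMARAGD TCM implementation: the stop-word samples, the function names (`detect_language`, `reduce_to_candidates`, `find_hits`) and the single-word watchlist are all assumptions made for illustration, and the sketch filters only stop words rather than verbs and adjectives.

```python
import re

# Hypothetical stop-word samples per language; a real system would use
# far larger lists and proper linguistic analysis.
STOPWORDS = {
    "de": {"der", "die", "das", "und", "ist", "nicht", "mit", "für", "von", "ein"},
    "en": {"the", "and", "is", "of", "to", "with", "for", "not", "a", "in"},
    "fr": {"le", "la", "les", "et", "est", "de", "pour", "avec", "un", "une"},
}


def tokenize(text):
    """Split text into lower-case words made of letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())


def detect_language(text):
    """Guess the predominant language by counting stop-word occurrences."""
    tokens = tokenize(text)
    counts = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    return max(counts, key=counts.get)


def reduce_to_candidates(text, language):
    """Drop filler words so that only potentially relevant terms remain."""
    stop = STOPWORDS[language]
    return [token for token in tokenize(text) if token not in stop]


def find_hits(text, watchlist):
    """Return the watchlist entries that occur among the reduced terms."""
    language = detect_language(text)
    candidates = set(reduce_to_candidates(text, language))
    return sorted(name for name in watchlist if name.lower() in candidates)
```

For example, `find_hits("The payment is for Acme and the bank in Tehran", ["Acme", "Globex"])` detects English, drops the filler words and reports only `Acme`. Both company names are invented placeholders.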
Shorter checking runs and improved performance
Excluding irrelevant content drastically shortens the checking process and thereby improves overall system performance. This is where the checking server functionality comes into play. At present, the system can check 50 pages in 40 seconds. With an optional delta configuration, only new hits are displayed for versioned documents; the rest is carried over from the results that are already available. When the system reports a hit, users can add a comment for further processing at a later date. Users can also store an e-mail address to which the checking result is sent automatically. That way, several people can be kept informed of the current status.
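The quoted throughput and the delta behaviour for versioned documents can be made concrete with a short Python sketch. It rests on assumptions that the text does not confirm: run time is assumed to scale linearly with page count, hits are modelled as plain strings, and the carried-over material is modelled as the comments stored for earlier hits. The names `estimate_seconds` and `delta_hits` are hypothetical.

```python
# Benchmark quoted in the text: 50 pages in 40 seconds.
BENCHMARK_PAGES = 50
BENCHMARK_SECONDS = 40


def estimate_seconds(pages):
    """Rough run-time estimate, assuming (hypothetically) linear scaling."""
    return pages * BENCHMARK_SECONDS / BENCHMARK_PAGES


def delta_hits(previous_hits, current_hits):
    """Split the hits of a new document version into new and carried-over ones.

    previous_hits: dict mapping a hit term to the comment stored for it
    current_hits:  iterable of hit terms found in the new version
    Returns (new, carried_over); only `new` would need to be displayed.
    """
    previous = set(previous_hits)
    current = set(current_hits)
    new = sorted(current - previous)
    carried_over = {term: previous_hits[term] for term in sorted(current & previous)}
    return new, carried_over
```

Under these assumptions, the benchmark corresponds to 0.8 seconds per page, so a 200-page document would take roughly 160 seconds. Calling `delta_hits({"Acme": "cleared on review"}, ["Acme", "Globex"])` returns `Globex` as the only new hit and carries the existing comment for `Acme` over unchanged.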