“Vitok-OSINT” – information retrieval system.
“Vitok-OSINT” searches open sources (files, media, forums, blogs, social networks such as Facebook and Twitter, and the Tor network), builds relationships between key concepts, and displays the constructed relationships between objects.
Key concepts: events, organizations, dates, persons and other parameters.
Under the international classification, systems of this kind belong to the OSINT (Open Source Intelligence) category.
The system “Vitok-OSINT” is intended for:
- Aggregating data from different sources (including the Internet) and processing data sets supplied on electronic media for loading into the system;
- Building the database and creating a history of events (facts);
- Monitoring data sources in real time for key concepts or surges of information activity;
- Building relationships between key parameters over time;
- Notifying operators of the most important events or changes in key parameters.
The system “Vitok-OSINT” enables the user to:
- Build their own archive of documents, which does not depend on the information being preserved in the primary source;
- Search, assess, and systematize data in the archive;
- Carry out overview, comparative, and dynamic analysis of the data, taking the historical record into account;
- Automatically build complex chains of relationships between objects that the analyst might have overlooked, and monitor changes in chains of data and events as well as newly available data;
- Assess the significance of events in a given period against the historical record, and report the results of data processing to the operator in graphical form (charts, diagrams, maps).
The system “Vitok-OSINT” is used to:
- Follow real-world events on given subjects;
- Gather data and information from news feeds;
- Assist analysis department specialists with forecasts, reports, and hypotheses;
- Monitor general data indicators and the information provided by the system.
Users are provided with service functions for adding sources and analyzing accumulated information. The system analyzes received data automatically, with no operator involved (for example, text, photos, or pictures are extracted from an e-mail), which considerably reduces the time needed to review a document.
The system also implements text recognition in images, as well as their indexing, search, and processing. Repeated data and information are compressed (aggregated) to minimize the time the analyst needs to study the material.
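The aggregation of repeated data can be illustrated with a minimal sketch. The source does not say how “Vitok-OSINT” detects repeats, so the approach here (hashing the text after normalizing case, punctuation, and whitespace) is an assumption, as are the names `fingerprint` and `aggregate_duplicates` and the sample documents.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and dropping punctuation and extra whitespace."""
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def aggregate_duplicates(documents: dict) -> list:
    """Group document ids whose normalized text is identical."""
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        groups[fingerprint(text)].append(doc_id)
    return list(groups.values())


# Invented sample data: the first two items differ only in case and spacing.
docs = {
    "feed-1": "Flood hits the coastal region.",
    "blog-7": "FLOOD hits the coastal   region",
    "feed-2": "Airport reopens after storm.",
}
groups = aggregate_duplicates(docs)
print(groups)
```

Exact-match fingerprints only catch trivially reformatted copies; a production system would likely need a near-duplicate technique, which this sketch does not attempt.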
Filtering by type, by properties, and by types of relationships between the received data reduces the analyst’s working time. The information is saved in the database, and the analyst can access data for any period. Visualizing the appearance of new objects in the diagram helps identify directions for subsequent investigation.
The system supports morphological and semantic analysis of text in European languages, Arabic, and, in part, Eastern languages. If necessary, additional modules can be connected for other languages. Key concepts are grouped under headings (subjects). The analyst can use ready-made subject heading lists, create their own, and expand a list of terms and concepts.
Incoming documents that contain key concepts are automatically distributed by heading, which makes the information easier to work with. Key concepts can be created in multiple languages, which ensures processing of documents in European languages, Arabic, and, in part, Eastern languages.
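As a rough illustration of distributing documents by heading, the sketch below assigns a document to every heading that has at least one key concept in the text. The heading list, the English and German terms, and the function name `assign_headings` are all hypothetical. It matches exact word forms only, whereas the system described here relies on morphological analysis.

```python
import re

# Hypothetical subject heading list: heading -> key concepts in two languages.
HEADINGS = {
    "natural disasters": {"flood", "earthquake", "hochwasser", "erdbeben"},
    "aviation": {"airport", "flight", "flughafen", "flug"},
}


def assign_headings(text: str, headings: dict = HEADINGS) -> list:
    """Return every heading that has at least one key concept in the text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return sorted(h for h, concepts in headings.items() if tokens & concepts)


print(assign_headings("Hochwasser am Flughafen"))
print(assign_headings("Quarterly earnings report"))
```

A document with no matching key concept gets an empty list and would stay unclassified.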
The system displays the relationships revealed in documents:
- person – person
- person – organization
- organization – organization
- person – event
- organization – event
- person – organization – event
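One simple way to derive such relationships is to count how often two entities appear in the same document. The sketch below assumes an upstream extractor has already produced typed entities; the entity names are invented, and co-occurrence counting is an illustrative stand-in, not a description of how “Vitok-OSINT” builds its relationships.

```python
from collections import Counter
from itertools import combinations

# Hypothetical extractor output: one list of (entity type, name) pairs per document.
documents = [
    [("person", "A. Smith"), ("organization", "Acme Ltd"), ("event", "Summit 2020")],
    [("person", "A. Smith"), ("organization", "Acme Ltd")],
    [("person", "B. Jones"), ("person", "A. Smith")],
]


def relationship_counts(docs):
    """Count how often each pair of entities appears in the same document."""
    counts = Counter()
    for entities in docs:
        # Sorting gives every pair a canonical order, so (a, b) and (b, a) share one key.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


counts = relationship_counts(documents)
for (a, b), n in counts.most_common():
    print(f"{a[0]} - {b[0]}: {a[1]} / {b[1]} ({n} documents)")
```

The pair types that come out (person and organization, person and event, and so on) correspond to the relationship types listed above; the three-way person, organization, event relationship would need triples instead of pairs.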
The system enables multi-stage analysis of data: preliminary analysis, probabilistic and statistical analysis, extraction of formalized and structured data (addresses, phone numbers, air flights, authors of the materials, etc.), semantic analysis, and extraction of facts from unstructured data.
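The extraction of formalized data can be sketched with regular expressions. The patterns below are deliberately simplified assumptions (an international phone number starting with "+", a two-letter airline code followed by up to four digits, an ISO date) and would miss many real-world formats. The sample sentence is invented.

```python
import re

# Simplified, hypothetical patterns; real formats vary far more than this.
PATTERNS = {
    "phone": re.compile(r"\+\d[\d\s-]{7,14}\d"),
    "flight": re.compile(r"\b[A-Z]{2}\s?\d{1,4}\b"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_structured(text: str) -> dict:
    """Return all matches of each pattern, keyed by field name."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


sample = "Call +44 20 7946 0958 about flight BA 117 on 2020-03-01."
fields = extract_structured(sample)
print(fields)
```

Pattern matching of this kind covers only the formalized stage; extracting facts from unstructured text is a separate stage, as the paragraph above notes.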
Viewing information graphically in a single window helps reveal non-obvious patterns related to object interaction frequency, geographical location, information origin points, and other signs of “hidden” knowledge.
Monitoring social networks (Facebook, Twitter, etc.) gives access to information about events before it is published in official sources. Combined with additional information about the geographical location of the author of a post, their biography, and other facts, this data makes it possible to assess the reliability of the received data.
The monitoring system automatically issues alerts about events whose characteristics stand out from the general mass of incoming information over a period of time. The history of each key concept is maintained in an electronic object file, which supports automatic filling in and extraction of information about an object as well as manual data entry by operators.
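A minimal sketch of such an alert rule, assuming the monitored characteristic is a daily mention count for one key concept: the current value is flagged when it exceeds the historical mean by more than a chosen number of standard deviations. The z-score rule, the threshold of 3.0, and the function name `is_surge` are assumptions; the source does not state which criteria the system uses.

```python
from statistics import mean, stdev


def is_surge(history: list, current: int, threshold: float = 3.0) -> bool:
    """Flag `current` if it exceeds the historical mean by more than `threshold` standard deviations."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current > mu  # flat history: any increase stands out
    return (current - mu) / sigma > threshold


# Invented daily mention counts for one key concept.
daily_mentions = [12, 9, 11, 10, 13, 10, 11]
print(is_surge(daily_mentions, 40))  # far above the usual level
print(is_surge(daily_mentions, 12))  # within normal variation
```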
The system is equipped with advanced tools for visualizing processed data and with operator interfaces that minimize operator errors by graphically “highlighting” key events and make the system comfortable and fast to use:
- Display of information on maps, for example:
• Natural disasters;
• Air flights with routes, flight maps, airport characteristics, and departure/arrival times, for both past and future time periods;
• Visualization of the intensity of events (as well as sources of information about them) and information flows using heat maps;
• Time simulation with visualization of events on the map.
- Display of quantitative and statistical characteristics of the data in an extensive set of graphs and histograms, with data filtering and adjustable time frames when creating diagrams and histograms.
- Dynamic graphs for viewing the identified relationships and dependencies between the entities extracted from the data, with filtering of the data displayed in the graph.
- A search interface for operational work that gives access to the requested information in arbitrary cross-sections. Search results can be narrowed through a well-developed system of hierarchical facets to obtain the required data in a few mouse clicks, and details of the found materials can be viewed.
- Operator interfaces for verification of automatically retrieved facts, their correction and training of the system.
- Operator interfaces for information analysis and classification rule management, classifier management.
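The facet-based narrowing of search results mentioned in the list above can be sketched as follows. The document fields, the sample records, and the function names are hypothetical, and the facets here are flat, not hierarchical, to keep the example short.

```python
from collections import Counter

# Hypothetical indexed documents with facet fields.
DOCS = [
    {"id": 1, "source": "news", "language": "en", "heading": "aviation"},
    {"id": 2, "source": "blog", "language": "en", "heading": "aviation"},
    {"id": 3, "source": "news", "language": "de", "heading": "natural disasters"},
]


def filter_docs(docs, **selected):
    """Keep documents that match every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]


def facet_counts(docs, field):
    """Count the remaining documents per value of one facet field."""
    return Counter(d[field] for d in docs)


hits = filter_docs(DOCS, source="news")
print([d["id"] for d in hits])
print(facet_counts(hits, "language"))
```

Each selection shrinks the result set, and the counts show which further refinements are available, which is what allows narrowing "in a few mouse clicks".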
Based on this system, a situation center can easily be created with simultaneous dynamic display on several screens (there are examples of simultaneous display on a video wall of one hundred 55-inch screens).
All technical components of the system scale linearly, and the system can run in distributed mode across several data processing centers on thousands of nodes.
Technical components of the system make it possible to:
- Automatically distribute load and data between nodes;
- Simply and quickly add new nodes, increasing system throughput and speed;
- Ensure that there is no single point of failure of the system;
- Provide automatic data recovery and backup.
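The source does not say how data is distributed between nodes. One widely used technique with the properties listed above is consistent hashing, sketched below as an illustration only: adding a node moves just a fraction of the keys, and all of them go to the new node. The class name `HashRing`, the node names, and the replica count are assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), replicas: int = 100):
        self.replicas = replicas
        self._ring = []  # sorted list of (position, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `replicas` virtual points for the node on the ring."""
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring position at or after the key's hash."""
        if not self._ring:
            raise ValueError("ring is empty")
        idx = bisect.bisect(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"doc-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

Avoiding a single point of failure and providing automatic recovery would additionally require replicating each key to several nodes, which this sketch leaves out.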