SENTINEL

Predictions

Problem

Money laundering, terrorism financing, and fraud remain pressing problems. Despite ongoing efforts by financial institutions to prevent them, organized crime keeps finding more advanced methods to go unnoticed.

The market is constantly changing and technology keeps advancing. Staying one step ahead requires innovation in both the prevention strategies and the tools that support them. Advanced Machine Learning models offer highly sophisticated algorithms, such as Deep Learning, Neural Networks, and Decision Trees, among others, that allow institutions to go further and broaden their view to detect any unusual movement.

Benefits

Sentinel Predictions is a visual design environment for building predictive analysis models for fraud prevention, money laundering, risk analysis, and customer behavior, among others. It provides a complete library of learning algorithms, data set-up and exploration tools, model validation tools, and a model assessment server integrated with Sentinel.

Sentinel Predictions models generate a score that can be used by the Sentinel transactional engines, both in a "real time" approach and "near real time", combined with the different analytical tools and decision-making process already provided by the system.

Product: Sentinel Predictions

Characteristics

  • Easy-to-use visual environment for the design of analytical processes.
  • It includes a set of data exploration tools and intuitive visualizations.
  • It includes 1500 operators for all the data analysis and preparation tasks.
  • It has an innovative automatic model generation feature that allows Sentinel to create and train multiple models automatically, compare their results, and let the user decide which one to set up in a production environment.
  • It supports the use of the R statistical language.

Stages of the "Machine Learning" model generation process in Sentinel

  • Data Access and Management
  • Data Exploration
  • Data Set-Up
  • Modeling
  • Test Validation

Data Access and Management

  • It offers access to the Sentinel database and its transactional structures.
  • It can interact with Excel and CSV files and with relational databases such as Oracle, MS SQL Server, IBM DB2, and MySQL, among others.

Data Exploration

Exploring the data with a series of statistical techniques makes it possible to understand its composition, the different groups it forms, and the behavior of the subjects under analysis: cardholders, channels, customers, transactions, branches, ATMs, etc.

  • It enables descriptive statistical functions:
    • Univariate statistics and plots:
      • Numerical attributes: mean, median, minimum, maximum, standard deviation, and number of missing values.
      • Nominal / categorical attributes: number of categories, counts, mode, number of missing values.
      • Date attributes: minimum, maximum, number of missing values.
    • Bivariate statistics and plots: covariance matrix, correlation matrix
    • Transition matrix
    • Transition graph
    • Mutual information matrix
  • Graphics and information: different types of charts can be configured easily to analyze the data. The following are included: scatter (dispersion), line, bubble, parallel matrices, deviation, 3-D, density, histograms, area, bars and stacked bars, pie, Andrews curves, and Pareto.
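
As an illustration of the univariate statistics listed above, the following is a minimal Python sketch and not Sentinel code. The function names, the sample values, and the use of `None` to mark a missing value are assumptions made for the example.

```python
import statistics
from collections import Counter


def describe_numeric(values):
    """Summarize a numerical attribute; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    return {
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "missing": len(values) - len(present),
    }


def describe_nominal(values):
    """Summarize a nominal/categorical attribute; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "categories": len(counts),
        "counts": dict(counts),
        "mode": counts.most_common(1)[0][0],
        "missing": len(values) - len(present),
    }


# Hypothetical transaction attributes used only for illustration.
amounts = [120.0, 80.0, None, 100.0, 100.0]
channels = ["ATM", "POS", "ATM", None, "WEB"]

numeric_summary = describe_numeric(amounts)
nominal_summary = describe_nominal(channels)
print(numeric_summary)
print(nominal_summary)
```

The same summaries would normally be produced per attribute and per group (for example, per channel) before choosing which attributes to keep.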

Data Set-Up

In many cases, building predictive models requires setting up the data first, because the data is not necessarily of optimum quality: it may have incomplete values, need filtering and merging of different information groups, or call for new data to be derived from the existing data:

  • It allows aggregation with multiple functions such as: sum, average, median, standard deviation, variance, count, minimum, maximum, product, and product logarithm
  • It enables the use of operators like ‘join’, ‘merge’, ‘append’, ‘union’, or ‘intersection’
  • It allows the filtering of outliers (distant values) based on distances, densities, and correlations
  • It identifies and removes duplicates
  • It allows data sampling through various statistical techniques and functions such as: absolute, relative, probability-based, balanced, stratified, bootstrapping, and Kennard-Stone
  • It enables data transformation through various techniques such as: normalization and standardization, Z transformation, scaling by weights, logarithmic and exponential functions, and trigonometric functions, among others
  • It allows data partitioning, creating subsets for training, cross validation, and testing
  • It has functions and techniques for selecting the attributes of the model being designed, such as: chi-square and correlation, weighting schemes like the Gini index, principal component analysis (PCA), independent component analysis (ICA), generalized Hebbian algorithm (GHA), and dimensionality reduction with self-organizing maps (SOM), among others
  • It allows the automatic generation of new attributes through 15 different techniques, including genetic programming
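
Two of the set-up steps above, the Z transformation and the partitioning into training, validation, and test subsets, can be sketched as follows. This is a minimal illustration in plain Python, not the product's implementation; the function names, the 60/20/20 split, and the fixed seed are assumptions.

```python
import random
import statistics


def z_transform(values):
    """Rescale values to mean 0 and standard deviation 1 (population formula)."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # A constant attribute carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mean) / stdev for v in values]


def partition(rows, train=0.6, validation=0.2, seed=42):
    """Shuffle rows and split them into training, validation, and test subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train)
    n_validation = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_validation],
        shuffled[n_train + n_validation:],
    )


# Hypothetical transaction amounts and row identifiers.
z_scores = z_transform([10.0, 20.0, 30.0, 40.0, 50.0])
train_rows, validation_rows, test_rows = partition(list(range(10)))
print(z_scores)
print(len(train_rows), len(validation_rows), len(test_rows))
```

In practice the mean and standard deviation would be computed on the training subset only and then reused to transform the validation and test subsets, so that no information leaks from the data held back for evaluation.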

Modeling

Sentinel Predictions has a broad variety of supervised and unsupervised learning algorithms for generating models. The choice of algorithm usually depends on what is to be predicted, as well as on the quality and quantity of the data. Often, for the same objective (fraud prevention, for example), multiple models are generated with different algorithms so that they compete for the best results and a ‘champion’ ultimately emerges.
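
The idea of competing models and a ‘champion’ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two candidate scoring functions, the sample rows, and the choice of AUC as the comparison metric are hypothetical and are not taken from Sentinel.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by comparing every positive/negative pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def pick_champion(candidates, rows, labels):
    """Score every candidate on the same held-out rows and return the best one."""
    results = {
        name: auc(labels, [model(row) for row in rows])
        for name, model in candidates.items()
    }
    return max(results, key=results.get), results


# Hypothetical candidate models: each maps a transaction to a fraud score.
candidates = {
    "amount_only": lambda r: r["amount"] / 1000.0,
    "amount_and_night": lambda r: r["amount"] / 1000.0 + (0.5 if r["night"] else 0.0),
}

# Hypothetical held-out transactions; label 1 = fraud, 0 = legitimate.
rows = [
    {"amount": 900, "night": False},
    {"amount": 200, "night": True},
    {"amount": 300, "night": False},
    {"amount": 100, "night": False},
    {"amount": 400, "night": False},
]
labels = [1, 1, 0, 0, 0]

champion, results = pick_champion(candidates, rows, labels)
print(champion, results)
```

The important design point is that all candidates are compared on the same held-out data and with the same metric; otherwise the comparison says little about which model should go to production.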

  • Similarity calculation: computes the similarity between the points of two data sets, with numerical distance measurements: Euclidean, Canberra, Chebyshev, Correlation, Cosine, Dice, Dynamic Time Warping, Inner product, Jaccard, Kernel-Euclidean, Manhattan, Max-Product, Overlap; and categorical and nominal distance measurements: nominal, Dice, Jaccard, Kulczynski, Rogers-Tanimoto, Russell-Rao, Simple Matching
  • Clustering: defined by the user or automatic selection of the best, among: Support Vector Clustering, K-Means, K-Medoids, Kernel k-Means, X-Means, Cobweb, CLOPE, DBSCAN, Expectation Maximization Clustering, Self-organizing maps, Agglomerative Clustering, Top Down Clustering
  • Decision Trees: classification and regression trees (CART), CHAID, ID3, C4.5, Random Forest, Multi-way Trees, Gradient Boosted Trees (GBT), pre-pruning and pruning, and 9 more algorithms.
  • Rule induction: through 12 different algorithm types
  • Bayesian Networks
  • Regressions: linear, logistic, generalized linear model (H2O), kernel logistic regression, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), among many more
  • Neural Networks: flexible network architectures with different activation functions, multiple layers with different numbers of nodes, and different training techniques; perceptron, multilayer perceptron, deep learning (H2O)
  • Support Vector Machines: powerful modeling techniques for data with a large number of dimensions, with more than 10 different methods for support vector classification, regression, and clustering
  • Memory-based modeling, like: k-Nearest Neighbors for classification and regression
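
To make the distance measurements and the memory-based approach above more concrete, here is a minimal Python sketch of a few of the listed measures and a k-Nearest Neighbors classifier built on top of them. It is an illustration only: the function names, the sample points, and the labels are assumptions, and Jaccard is shown in its similarity form on sets.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def jaccard_similarity(a, b):
    """Share of common elements between two sets of nominal values."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def knn_classify(training, query, k=3, distance=euclidean):
    """Label the query with the majority label of its k nearest training points."""
    nearest = sorted(training, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-attribute training points with known labels.
training = [
    ((1, 1), "legit"), ((1, 2), "legit"), ((2, 1), "legit"),
    ((8, 8), "fraud"), ((9, 8), "fraud"), ((8, 9), "fraud"),
]

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)), chebyshev((0, 0), (3, 4)))
print(knn_classify(training, (7, 8)))
```

Because k-Nearest Neighbors relies directly on distances, attributes on very different scales should be normalized first, otherwise the attribute with the largest range dominates the result.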

Test Validation

Estimating the model's performance and accuracy is essential to determine whether it can be set up in production to assess the online information supplied by Sentinel, or whether fine tuning is required.

  • Performance criteria: the many performance criteria for numerical and nominal or categorical targets include: accuracy, classification error, Kappa, area under the curve (AUC), precision, false positives/false negatives, sensitivity, specificity, Youden index, correlation, Spearman's rho, Kendall's tau, squared correlation, among many more.
  • Validation techniques: it keeps multiple results in a history to help better assess model performance, using techniques such as: cross validation, split validation, bootstrapping, lift charts, ROC curves, and confusion matrices, among others.
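
Several of the criteria above derive from the confusion matrix. The following minimal Python sketch shows how they relate for a binary target; it is not Sentinel code, the function names and sample labels are assumptions, and it assumes both classes appear in the actual and predicted labels so that no denominator is zero.

```python
def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for labels 1 (fraud) and 0 (legitimate)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def performance(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives left alone
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "accuracy": accuracy,
        "classification_error": 1 - accuracy,
        "precision": tp / (tp + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_index": sensitivity + specificity - 1,
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Hypothetical labels for ten validated transactions.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_matrix(actual, predicted)
report = performance(*counts)
print(counts)
print(report)
```

For fraud data, where positives are rare, accuracy alone can look good even when few frauds are caught, which is why sensitivity, precision, and the Youden index are usually read together.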

© 2020 SmartSoft. All rights reserved.