Application of the Cluster-Based CBLOF Algorithm in Anti-Money Laundering


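This paper examines the application of the Cluster-Based Local Outlier Factor (CBLOF) algorithm to anti-money laundering (AML). With support from the National Social Science Foundation of China (No. 08BGJ013), the author, Gao Zengan, combines distance-based unsupervised clustering with local outlier detection to propose a new CBLOF method for identifying suspicious money laundering transactional behavioral patterns (SMLTBPs). The motivation is that financial institutions' ability to detect such anomalous activity is critical to carrying out AML policy effectively.

The paper first introduces the importance of AML, namely preventing illicit fund flows by identifying abnormal transaction behavior. The research focuses on exploiting the strengths of CBLOF: it can handle large volumes of data and automatically discover potential outliers in a dataset without relying on predefined labels. This helps financial institutions identify SMLTBPs more accurately and so strengthens their AML strategies.

In the theoretical part, the paper reviews the basic concepts of distance measures, unsupervised clustering, and the local outlier factor (LOF), on which CBLOF is built. LOF measures how anomalous a point is relative to its neighbors, deciding whether an object is an outlier by comparing the density around it. CBLOF extends this by bringing the notion of local outliers into a cluster-based framework, which makes outlier detection more flexible and accurate.

The author then describes the design of the CBLOF algorithm in detail, covering data preprocessing, the clustering stage, the computation of LOF values, and the identification of outliers. The experimental part uses authentic and synthetic data to verify the algorithm's effectiveness and performance in a realistic AML setting. The comparison shows that CBLOF outperforms traditional outlier detection methods at discovering SMLTBPs, improving the sensitivity and precision of financial institutions' abnormal-transaction detection.

Keywords: clustering; outlier detection; local outlier factor; suspicious money laundering transactional behavioral patterns (SMLTBPs); anti-money laundering (AML). The study offers the AML field an innovative tool that helps financial institutions respond more effectively to increasingly complex and covert money laundering threats, and it makes an important contribution to the fight against financial crime.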

Application of Cluster-Based Local Outlier Factor Algorithm in Anti-Money Laundering

Gao Zengan

The research is supported by the National Social Science Foundation of China (No. 08BGJ013).

Post Doctoral Station of Theoretical Economics

China Center for Anti-Money Laundering Studies

Fudan University

Shanghai, P. R. China

School of Economics and Management

Southwest Jiaotong University

Chengdu, P. R. China

E-mail address: gaozengan133@163.com

Abstract—

Financial institutions’ capability in recognizing suspicious money laundering transactional behavioral patterns (SMLTBPs) is critical to anti-money laundering. Combining distance-based unsupervised clustering and local outlier detection, this paper designs a new cluster-based local outlier factor (CBLOF) algorithm to identify SMLTBPs and uses authentic and synthetic data experimentally to test its applicability and effectiveness.

Keywords-clustering; outlier detection; local outlier factor (LOF); suspicious money laundering transactional behavioral patterns (SMLTBPs); anti-money laundering (AML)

I. INTRODUCTION

Anti-money laundering (AML) in the financial industry is based on the analysis and processing of Suspicious Activity Reports (SARs) filed by financial institutions (FIs). However, the very large number of SARs often makes analysis by financial intelligence units (FIUs) a waste of time and resources, because only a few of a given number of transactions are really suspicious [1]. Financial AML is therefore far from a real-time, dynamic, and self-adaptive recognition of suspicious money laundering transactional behavioral patterns (SMLTBPs). A review of the literature finds that artificial intelligence [2], support vector machines (SVMs) [3], outlier detection [4], and break-point analysis (BPA) [5] have been used to improve FIs’ ability to process suspicious data; various approaches to novelty detection on time series data are examined in [6]; outlier detection methodologies are surveyed in [7]; and a data mining-based framework for AML research is proposed in [8] after a comprehensive review of related studies. Nevertheless, the effectiveness and efficiency of SMLTBP identification remain an active research topic, since the passage of the USA Patriot Act and the creation of the U.S. Department of Homeland Security signaled a new era in applying information technology and data mining to the detection of money laundering and terrorist financing [9].

Because SMLTBP recognition lacks training data, the number of clusters is usually unknown, and the clustering result changes dynamically [10, 11], this paper designs a cluster-based local outlier factor (CBLOF) algorithm to help FIUs concentrate on a desirable number of SMLTBPs with a suitable degree of suspiciousness, as determined by their actual needs and resource endowments. Following this introduction, Section II describes the design of the algorithm, Section III presents the experimental process, and Section IV concludes the paper with a suggestion for future research.

II. ALGORITHM DESIGN

The CBLOF algorithm combines distance-based unsupervised clustering with local outlier [12] detection; the clustering step pre-processes the data for the subsequent anomaly identification.
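As background for the local outlier component, the density-based local outlier factor as it is commonly defined in the LOF literature can be written as below. This is given as a reference sketch only: the symbols $k$ (neighborhood size), $d$ (the distance measure), and $N_k(p)$ (the $k$ nearest neighbors of $p$) are assumptions that do not appear in the text above, and the factor this paper constructs for CBLOF may differ in detail.

```latex
\begin{align*}
\text{reach-dist}_k(p, o) &= \max\{\, k\text{-distance}(o),\ d(p, o) \,\} \\
\mathrm{lrd}_k(p) &= \left( \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \text{reach-dist}_k(p, o) \right)^{-1} \\
\mathrm{LOF}_k(p) &= \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)}
\end{align*}
```

Under this definition, a value near 1 means the local density of $p$ matches that of its neighbors, while a value well above 1 means $p$ lies in a sparser region than its neighbors and is a candidate local outlier.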

A. Clustering

Given the nature of money laundering (ML), the chosen clustering algorithm should generate the number of clusters automatically, with no need to fix it in advance, and all clusters are to be ranked by the number of objects each contains. We therefore propose the following procedure:

1) Start with any object (say p) in the dataset and create a cluster from it. Denote this initial cluster by $C_1$.

2) Choose any other object q, calculate its distance to each of the existing clusters $C_1, C_2, \ldots, C_k$, denoted $\mathrm{distance}(q, C_i)$, and then find the minimal distance value $\mathrm{distance}(q, C_{\min})$. Let the threshold be ε. If $\mathrm{distance}(q, C_{\min}) \le \varepsilon$ holds and q has not yet been clustered, add q to the cluster which is assumed to be nearest to q when compared with all
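The steps visible above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the object-to-cluster distance is taken to be the Euclidean distance to the cluster centroid, and an object farther than ε from every existing cluster is assumed to start a new cluster. Neither choice is confirmed by the text shown here, and the names `threshold_cluster`, `distance_to_cluster`, and `eps` are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def centroid(cluster: List[Point]) -> List[float]:
    """Component-wise mean of the objects in a cluster."""
    n = len(cluster)
    return [sum(p[i] for p in cluster) / n for i in range(len(cluster[0]))]


def distance_to_cluster(q: Point, cluster: List[Point]) -> float:
    # Assumption: distance(q, C) is the Euclidean distance from q to the
    # centroid of C. The source does not define this measure.
    return math.dist(q, centroid(cluster))


def threshold_cluster(points: Sequence[Point], eps: float) -> List[List[Point]]:
    """Single-pass, distance-threshold clustering.

    The number of clusters is not fixed in advance; it emerges from eps.
    """
    clusters: List[List[Point]] = []
    for q in points:  # each object is visited once, so it is never re-clustered
        if not clusters:
            clusters.append([q])  # step 1: the first object forms the initial cluster
            continue
        # Step 2: find the existing cluster nearest to q.
        nearest = min(clusters, key=lambda c: distance_to_cluster(q, c))
        if distance_to_cluster(q, nearest) <= eps:
            nearest.append(q)
        else:
            # Assumption: an object beyond eps of every cluster opens a new one.
            clusters.append([q])
    # Rank clusters by the number of objects they contain, largest first.
    clusters.sort(key=len, reverse=True)
    return clusters


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 0.1), (5.1, 5.0), (20.0, 20.0)]
    for rank, members in enumerate(threshold_cluster(data, eps=1.0), start=1):
        print(rank, len(members), members)
```

Because the pass is order-dependent, different orderings of the same data can yield different clusters; small clusters at the bottom of the ranking are the natural candidates for closer inspection.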
