Title: Federated Learning and Knowledge Distillation in Rule-Based Models: An Environment of Granular Computing

Witold Pedrycz

Professor, University of Alberta, Canada

URL: http://www.ece.ualberta.ca/~pedrycz/

Talk Abstract: The visible challenges inherently associated with real-world data give rise to new and impactful directions in system modeling, such as federated learning and knowledge distillation. We advocate that, to address these quests conveniently and enhance the existing learning paradigms, it is beneficial to engage the fundamental framework of Granular Computing. It is demonstrated that various ways of conceptualizing information granules, such as fuzzy sets, sets, and rough sets, among others, may lead to efficient solutions. To establish a sound conceptual modeling framework, we include a brief discussion of the information granule-oriented design of rule-based architectures. A way of forming such rules through unsupervised federated learning is discussed along with algorithmic developments. A granular characterization of the model formed by the server vis-a-vis data located at individual clients in federated learning is presented. It is demonstrated that the quality of the rules at the client’s end is described in terms of granular parameters, and subsequently the global model is represented as a granular model with parameters in the form of information granules of type-2. The roles of granular augmentations of models in the setting of logic-oriented knowledge distillation are discussed.

Title: Tensor Multi-Elastic Kernel Self-Paced Learning for Time Series Clustering

Wensheng Zhang

Professor, University of Chinese Academy of Sciences, China

URL: http://people.ucas.ac.cn/~wenshengzhang

Talk Abstract: Time series clustering has attracted growing attention due to the abundance of accessible data and its extensive value in various applications. The unique characteristics of time series, including high dimensionality, warping, and the need to integrate multiple elastic measures, pose challenges for present clustering algorithms, most of which take into account only part of these difficulties. We make an effort to address all of the aforementioned issues simultaneously under a unified multiple kernel clustering (MKC) framework. Specifically, we first implicitly map the raw time series space into multiple kernel spaces via elastic distance measure functions. In such high-dimensional spaces, we resort to a tensor-constraint-based self-representation subspace clustering approach, embedded in the self-paced learning paradigm, to explore the essential low-dimensional structure of the data as well as the high-order complementary information from different elastic kernels. The proposed approach can be extended to the more challenging multivariate time series clustering scenario in a direct but elegant way. Extensive experiments on 85 univariate and 10 multivariate time series datasets demonstrate the significant superiority of the proposed approach over the baseline and several state-of-the-art MKC methods.
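As a rough illustration of the first step only (building several kernels from elastic distances), the sketch below uses dynamic time warping (DTW) with different band widths as the elastic measures and a Gaussian-style transform to turn distances into similarities. The abstract does not name the measures or the kernel form, so `dtw_distance`, `elastic_kernels`, the `windows` list, and `sigma` are all assumptions made for this example, not the speaker's method. Note also that a DTW-based Gaussian kernel is not guaranteed to be positive semidefinite.

```python
import math


def dtw_distance(a, b, window=None):
    """DTW distance between two univariate series (assumed elastic measure).

    window is a Sakoe-Chiba band half-width; None means unconstrained.
    """
    n, m = len(a), len(b)
    # A band narrower than |n - m| could never reach the final cell.
    w = max(n, m) if window is None else max(window, abs(n - m))
    inf = float("inf")
    # cost[i][j] = minimal cumulative squared cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def elastic_kernels(series, windows, sigma=1.0):
    """Return one similarity matrix per band width (hypothetical helper)."""
    n = len(series)
    kernels = []
    for w in windows:
        K = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i, n):
                d = dtw_distance(series[i], series[j], window=w)
                K[i][j] = K[j][i] = math.exp(-d * d / (2.0 * sigma ** 2))
        kernels.append(K)
    return kernels


series = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 1, 1, 1]]
kernels = elastic_kernels(series, windows=[0, None], sigma=1.0)
```

With `window=0` and equal-length series the measure reduces to the Euclidean distance, so the two matrices in `kernels` disagree on the shifted pair (the first two series): the unconstrained DTW kernel treats them as identical, while the lock-step kernel does not. This kind of disagreement is the complementary information a multiple kernel method would try to combine.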

Title: Research on Intelligent Optimization Modeling Methods for Data Security

Zhihua Cui

Professor, Taiyuan University of Science and Technology, China

URL: https://yjjg.tyust.edu.cn/info/1133/1521.htm

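Talk Abstract: In recent years, the information technology revolution has profoundly changed economic and social development in economies around the world. Data has become an important driving force for economies to achieve innovative development, reshape people's lives, and support national economic and social development. However, as the level of artificial intelligence rises, data security problems pose severe challenges to every industry. On the one hand, to address the supply chain challenges faced by the healthcare industry, together with data fragmentation and the privacy of data sources, we propose a detection model that combines federated learning with a deep generative model, evaluated by global model accuracy, communication cost, global model loss, and area under the ROC curve. In addition, inspired by fitness value estimation strategies, we propose an efficient federated learning algorithm that dynamically and adaptively changes the aggregation weights, greatly improving detection performance. On the other hand, in response to the explosive growth of malicious code caused by the rapid development of big data and the Internet of Things, we introduce an SPP pyramid structure to improve the neural network architecture and apply it to malicious code detection. Meanwhile, to reduce the impact of sparsity in malicious code data, we combine generative adversarial networks with convolutional neural networks and propose an efficient data augmentation method that effectively overcomes the limitation of requiring uniform input. Furthermore, to further reduce the negative impact of data imbalance, we use intelligent optimization algorithms to process the imbalanced malicious data and design a novel parameter-optimized multi-objective RBM model. Finally, considering the limitations of the RBM model when handling large image datasets and the advantages of CNNs in training speed and classification performance, we introduce the SPP strategy into the pooling layer and use the parameter-optimized RBM model in the fully connected layer as a generative model, proposing a multi-objective CRBM model based on the SPP strategy. Extensive experiments verify the effectiveness of the above methods, providing a theoretical basis for addressing data security problems.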
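To make the idea of adaptively weighted aggregation in federated learning concrete, here is a minimal sketch. The abstract does not state how the weights are computed, so this example simply assumes each client reports a non-negative score (for instance, a validation accuracy standing in for an estimated fitness value) and normalizes those scores into aggregation weights. The function name `aggregate` and the score definition are hypothetical.

```python
def aggregate(client_params, client_scores):
    """Weighted average of client parameter vectors.

    client_params: list of equal-length lists of floats, one per client.
    client_scores: assumed per-client quality scores (non-negative, not all zero);
                   the real weighting rule is not given in the abstract.
    """
    total = sum(client_scores)
    if total <= 0:
        raise ValueError("client_scores must contain a positive value")
    weights = [s / total for s in client_scores]
    dim = len(client_params[0])
    # Each global coordinate is the score-weighted mean of the client coordinates.
    return [
        sum(w * params[k] for w, params in zip(weights, client_params))
        for k in range(dim)
    ]


# Two clients; the second has three times the score of the first.
global_params = aggregate([[0.0, 2.0], [4.0, 6.0]], [1.0, 3.0])
```

Recomputing the scores in every communication round would let the weights change over time, which is one simple way to read "dynamically and adaptively" in the abstract.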

Title: How to perform clustering in the data imbalanced environment?

Yiu-ming Cheung

Professor, Hong Kong Baptist University, China

URL: https://www.scholat.com/xmzhang8.cn

Talk Abstract: In many practical problems, the number of samples in different classes can be highly imbalanced, which can degrade the performance of most machine learning methods to a certain degree. In general, learning from imbalanced data is a nontrivial and challenging problem in data engineering and machine learning, and it has attracted growing attention in recent years. In this talk, we will introduce imbalanced data learning and its related techniques, as well as its applications.

Title: TBA

Yong Liu

Professor, System Intelligence Laboratory, University of Aizu

URL: https://www.u-aizu.ac.jp/research/faculty/detail?cd=90020&lng=en

Talk Abstract: TBA

Title: Some Theoretical Problems of Random Heuristic Search Algorithms

Jun He

Professor, Nottingham Trent University, Clifton Campus, Nottingham NG11 8NS, UK

Talk Abstract: Inspired by nature, many intelligent optimization algorithms, such as simulated annealing, genetic algorithms, and particle swarm optimization, have been designed to solve various optimization problems. These algorithms share common features: they are randomized, heuristic, and search-based, so in theoretical research they are called random heuristic search algorithms. Compared with traditional optimization algorithms, random heuristic search algorithms are intuitive and easy to implement. However, because of their randomness and heuristic nature, it is not easy to analyze and evaluate their performance theoretically.
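For readers unfamiliar with this class of algorithms, the sketch below shows one of the simplest random heuristic search algorithms, the (1+1) evolutionary algorithm, on the OneMax benchmark (maximize the number of ones in a bit string). It is chosen here only as an illustration and is not taken from the talk; the function names and parameters are assumptions. The number of iterations until the optimum is first found is a random variable, and quantities like its expectation are what a theoretical analysis typically studies.

```python
import random


def one_max(bits):
    """Fitness: number of ones in the bit string."""
    return sum(bits)


def one_plus_one_ea(n, seed=0, max_iters=100_000):
    """(1+1) EA with standard bit-flip mutation (each bit flips with prob. 1/n).

    Returns the final bit string and the number of iterations used.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = one_max(x)
    iters = 0
    while fx < n and iters < max_iters:
        # Mutate: flip each bit independently with probability 1/n.
        y = [1 - b if rng.random() < 1.0 / n else b for b in x]
        fy = one_max(y)
        # Elitist selection: keep the offspring if it is at least as good.
        if fy >= fx:
            x, fx = y, fy
        iters += 1
    return x, iters


best, iters = one_plus_one_ea(20, seed=0)
```

Running the function with different seeds gives different iteration counts, which is the difficulty the abstract points to: performance statements have to be made about a distribution rather than a single run.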

Title: Evolutionary Computing and Complex Networks for Metabolomics and Precision Medicine

Ting Hu

Professor, Department of Computer Science, Memorial University, Canada

Talk Abstract: Metabolomics studies use quantitative analyses of metabolites from body fluids or tissues to investigate a sequence of cellular processes and biological systems in response to genetic and environmental influences. This holds immense potential for a better understanding of the pathogenesis of complex diseases. Most conventional metabolomics analysis methods examine one metabolite at a time and may overlook the synergistic effect of combining multiple metabolites. In this work, we propose a new bioinformatics framework that infers the non-linear synergy among multiple metabolites using a symbolic model and subsequently identifies key metabolites using network analysis. Such a symbolic model is able to represent a complex non-linear relationship among a set of metabolites associated with osteoarthritis (OA) and is automatically learned using an evolutionary algorithm. Applied to the Newfoundland Osteoarthritis Study (NFOAS) dataset, our methodology was able to identify nine key metabolites, including some known osteoarthritis-associated metabolites and some novel metabolic markers that have not been reported before. The results demonstrate the effectiveness of our methodology and, more importantly, with further investigation, suggest new hypotheses that can help us better understand OA.