AI-Guided Intuition (DeepMind)

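The practice of mathematics involves discovering patterns and using them to formulate and prove conjectures, which then become theorems. Since the 1960s, mathematicians have used computers to help discover patterns and formulate conjectures. This notebook walks through an example of a new fundamental result in pure mathematics that was discovered with the help of machine learning, showing one way machine learning can help mathematicians find new conjectures and theorems.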
Uploaded by Dave 3 years ago

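AI-guided intuition, published as a cover article in Nature, aims to use machine learning to help discover conjectures and theorems in pure mathematics.

In plain terms, Dave's summary:

  • Given two mathematical objects X(z) and Y(z), if machine learning can learn a function f such that f(X(z)) ≈ Y(z), this suggests that X and Y are related in some way.
  • Attribution techniques are then used to find out which features matter most. (Attribution technique: compute the gradient of the output with respect to each input. A large gradient means that input feature is important; a small gradient means it is not.)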
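The two bullets above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data is synthetic, f is a plain linear model fitted by gradient descent, and the gradient is estimated with central finite differences. The paper's models are neural networks, where automatic differentiation (for example `jax.grad`, from the JAX package imported in the code section) would play that role.

```python
import random

random.seed(0)

# Synthetic stand-in for (X(z), Y(z)): Y depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
xs = [[random.gauss(0, 1) for _ in range(3)] for _ in range(500)]
ys = [3.0 * x[0] + 0.5 * x[1] + random.gauss(0, 0.01) for x in xs]


def predict(w, x):
    """Linear model f(x) = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fit(xs, ys, lr=0.1, epochs=200):
    """Fit the weights by full-batch gradient descent on mean squared error."""
    w = [0.0] * len(xs[0])
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2 * err * xi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def saliency(f, xs, eps=1e-4):
    """Mean absolute central-difference gradient of f w.r.t. each input."""
    d = len(xs[0])
    scores = [0.0] * d
    for x in xs:
        for i in range(d):
            up = x[:i] + [x[i] + eps] + x[i + 1:]
            down = x[:i] + [x[i] - eps] + x[i + 1:]
            scores[i] += abs(f(up) - f(down)) / (2 * eps) / len(xs)
    return scores


w = fit(xs, ys)
scores = saliency(lambda x: predict(w, x), xs)
print([round(s, 2) for s in scores])
```

The feature the target depends on most gets the largest score, and the irrelevant feature scores close to zero, which is the signal a mathematician would use to decide where to look.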

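Workflow #

  1. Propose a hypothesis (mathematician)
  2. Generate data by sampling (AI)
  3. Train a supervised learning model (AI)
  4. Find patterns; use attribution techniques to narrow the search space (AI)
  5. Formulate a candidate conjecture (mathematician)
  6. Prove the theorem (mathematician)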

image.png

Experiment from the paper #

Predict the signature of a knot from its geometric invariants (https://knotinfo.math.indiana.edu/descriptions/signature.html)

This relationship had not been found in earlier research.

image.png

Code #

# Install the required packages
from IPython.display import clear_output

!pip install dm-haiku
!pip install optax
clear_output()
# Import the packages
import tempfile

import haiku as hk
import jax
import jax.numpy as jnp
import matplotlib.pyplot as plt
import numpy as np
import optax
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
# Download the dataset
!featurize -t [token] dataset download ea22c102-a9c5-4ce5-aa31-576bd52ff7c1
100%|█████████████████████████████████████| 18.3M/18.3M [00:00<00:00, 38.0MiB/s]
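🍬  Download complete, extracting...
🏁  Dataset added successfully
# Load and preprocess the data
# For a knot k, X(k) is a vector of quantities; here they are the geometric invariants of the knot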

full_df = pd.read_csv('/home/featurize/data/knot_theory_invariants.csv')
display_name_from_short_name = {
    'chern_simons': 'Chern-Simons',
    'cusp_volume': 'Cusp volume',
    'hyperbolic_adjoint_torsion_degree': 'Adjoint Torsion Degree',
    'hyperbolic_torsion_degree': 'Torsion Degree',
    'injectivity_radius': 'Injectivity radius',
    'longitudinal_translation': 'Longitudinal translation',
    'meridinal_translation_imag': 'Im(Meridional translation)',
    'meridinal_translation_real': 'Re(Meridional translation)',
    'short_geodesic_imag_part': 'Im(Short geodesic)',
    'short_geodesic_real_part': 'Re(Short geodesic)',
    'Symmetry_0': 'Symmetry: $0$',
    'Symmetry_D3': 'Symmetry: $D_3$',
    'Symmetry_D4': 'Symmetry: $D_4$',
    'Symmetry_D6': 'Symmetry: $D_6$',
    'Symmetry_D8': 'Symmetry: $D_8$',
    'Symmetry_Z/2 + Z/2': 'Symmetry: $\\frac{Z}{2} + \\frac{Z}{2}$',
    'volume': 'Volume',
}
column_names = list(display_name_from_short_name)
target = 'signature'
# Split into training, validation, and test sets
random_seed = 2 # @param {type: "integer"}
random_state = np.random.RandomState(random_seed)
train_df, validation_and_test_df = train_test_split(
    full_df, random_state=random_state)
validation_df, test_df = train_test_split(
    validation_and_test_df, test_size=.5, random_state=random_state)
train_df.head(2)
(transposed)                        70746       240827
Unnamed: 0                          73193       249190
hyperbolic_adjoint_torsion_degree   0           0
hyperbolic_torsion_degree           10          14
short_geodesic_real_part            1.015512    0.827289
short_geodesic_imag_part            -2.760601   -3.013258
injectivity_radius                  0.507756    0.413645
chern_simons                        0.090530    0.232453
cusp_volume                         12.226322   13.800773
longitudinal_translation            10.685555   10.453156
meridinal_translation_imag          1.144192    1.320249
meridinal_translation_real          -0.519157   -0.158522
volume                              11.393225   12.742782
Symmetry_0                          0.0         0.0
Symmetry_D3                         0.0         0.0
Symmetry_D4                         0.0         0.0
Symmetry_D6                         0.0         0.0
Symmetry_D8                         0.0         0.0
Symmetry_Z/2 + Z/2                  1.0         1.0
signature                           -2          0
print('Training set size:', len(train_df))
print('Test set size:', len(test_df))
Training set size: 182809
Test set size: 30469
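The two `train_test_split` calls amount to a 75% / 12.5% / 12.5% split. As a quick arithmetic check, the sketch below reproduces the sizes, assuming scikit-learn's default `test_size` of 0.25 and its rule of rounding the test count up. `split_sizes` is a hypothetical helper, and the total is inferred from the two counts printed above plus the 30468 tuning rows reported in the AutoGluon log, not read from the CSV.

```python
import math


def split_sizes(n_samples, test_size=0.25):
    """Return (n_train, n_test), rounding the test share up (assumed sklearn rule)."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


# Total row count inferred from the printed train/test sizes and the
# validation size reported in the AutoGluon log.
total = 182809 + 30468 + 30469

n_train, n_rest = split_sizes(total)       # first call: default test_size
n_val, n_test = split_sizes(n_rest, 0.5)   # second call: test_size=.5

print(n_train, n_val, n_test)  # 182809 30468 30469
```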

Quick validation with the AutoML tool AutoGluon #

See Mu Li's video for reference.

!pip install autogluon
clear_output()
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor(label=target).fit(
    train_df[column_names + [target]],
    tuning_data=validation_df[column_names + [target]],
    time_limit=60
)
No path specified. Models will be saved in: "AutogluonModels/ag-20220209_052125/"
Beginning AutoGluon training ... Time limit = 60s
AutoGluon will save models to "AutogluonModels/ag-20220209_052125/"
AutoGluon Version:  0.3.1
Train Data Rows:    182809
Train Data Columns: 17
Tuning Data Rows:    30468
Tuning Data Columns: 17
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	First 10 (of 14) unique label values:  [-2, 0, 2, -8, 4, -4, -6, 8, 6, 10]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Warning: Some classes in the training set have fewer than 10 examples. AutoGluon will only keep 12 out of 14 classes for training and will not try to predict the rare classes. To keep more classes, increase the number of datapoints from these rare classes in the training data or reduce label_count_threshold.
Fraction of data from classes with at least 10 examples that will be kept for training models: 0.9999452980980149
Train Data Class Count: 12
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    25132.68 MB
	Train Data (Original)  Memory Usage: 29.0 MB (0.1% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 6 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('float', []) : 15 | ['chern_simons', 'cusp_volume', 'injectivity_radius', 'longitudinal_translation', 'meridinal_translation_imag', ...]
		('int', [])   :  2 | ['hyperbolic_adjoint_torsion_degree', 'hyperbolic_torsion_degree']
	Types of features in processed data (raw dtype, special dtypes):
		('float', [])     : 9 | ['chern_simons', 'cusp_volume', 'injectivity_radius', 'longitudinal_translation', 'meridinal_translation_imag', ...]
		('int', [])       : 2 | ['hyperbolic_adjoint_torsion_degree', 'hyperbolic_torsion_degree']
		('int', ['bool']) : 6 | ['Symmetry_0', 'Symmetry_D3', 'Symmetry_D4', 'Symmetry_D6', 'Symmetry_D8', ...]
	0.6s = Fit runtime
	17 features in original data used to generate 17 features in processed data.
	Train Data (Processed) Memory Usage: 20.05 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 0.79s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric argument of fit()
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif ... Training model for up to 59.21s of the 59.2s of remaining time.
	0.9354	 = Validation score   (accuracy)
	16.84s	 = Training   runtime
	1.03s	 = Validation runtime
Fitting model: KNeighborsDist ... Training model for up to 41.24s of the 41.23s of remaining time.
	0.9483	 = Validation score   (accuracy)
	16.64s	 = Training   runtime
	0.83s	 = Validation runtime
Fitting model: NeuralNetFastAI ... Training model for up to 23.7s of the 23.7s of remaining time.
	Ran out of time, stopping training early. (Stopping on epoch 0)
	0.9095	 = Validation score   (accuracy)
	31.75s	 = Training   runtime
	0.5s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 59.21s of the -11.39s of remaining time.
	0.9624	 = Validation score   (accuracy)
	4.5s	 = Training   runtime
	0.01s	 = Validation runtime
AutoGluon training complete, total runtime = 76.0s ...
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20220209_052125/")
leaderboard = predictor.leaderboard(test_df, silent=True)
feature_importance = predictor.feature_importance(test_df, silent=True)
!sudo apt-get install -y graphviz graphviz-dev
!pip install pygraphviz
clear_output()
from PIL import Image
Image.open(predictor.plot_ensemble_model())

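Models selected for the ensemble #

  • KNeighborsDist (Accuracy: 0.9483)
  • NeuralNetFastAI (Accuracy: 0.9485)
  • LightGBMXT (Accuracy: 0.8522, probably under-trained because of the time limit)

Final accuracy: 0.9689

These figures do not match the training log above or the leaderboard below, which list KNeighborsUnif rather than LightGBMXT and an ensemble validation accuracy of 0.9624; they presumably come from a separate run.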

# Details of the trained models
leaderboard
   model                score_test  score_val  pred_time_test  pred_time_val  fit_time   pred_time_test_marginal  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0  WeightedEnsemble_L2  0.963996    0.962418   1.488950        1.331209       52.897020  0.007039                 0.005616                4.500332           2            True       4
1  KNeighborsDist       0.947488    0.948272   1.049718        0.826673       16.642866  1.049718                 0.826673                16.642866          1            True       2
2  KNeighborsUnif       0.934327    0.935373   1.055187        1.031045       16.843561  1.055187                 1.031045                16.843561          1            True       1
3  NeuralNetFastAI      0.911287    0.909509   0.432193        0.498919       31.753822  0.432193                 0.498919                31.753822          1            True       3
# Feature importance
plt.figure(figsize=(16,8))
sns.barplot(x=feature_importance.importance.values, y=feature_importance.importance.index);

Discovered features #

The plot above shows which three features are the most important.
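AutoGluon documents `feature_importance` as a permutation-based measure: shuffle one feature column at a time and record how much the score drops. The sketch below shows that idea on toy data. It is not AutoGluon's implementation; the `model` lambda is a hypothetical stand-in for a trained predictor, and the data is synthetic.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the right label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, seed=0):
    """Accuracy drop after shuffling each feature column in turn."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    drops = []
    for i in range(len(rows[0])):
        col = [r[i] for r in rows]
        rng.shuffle(col)  # break the link between feature i and the label
        shuffled = [r[:i] + [v] + r[i + 1:] for r, v in zip(rows, col)]
        drops.append(base - accuracy(model, shuffled, labels))
    return drops


# Toy data: the label depends only on feature 0.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(1000)]
labels = [int(r[0] > 0) for r in rows]

# Hypothetical "trained" model that has learned the true rule.
model = lambda r: int(r[0] > 0)

drops = permutation_importance(model, rows, labels)
print(drops)
```

Shuffling the informative feature costs roughly half the accuracy, while shuffling the ignored one costs nothing, so the drop works as an importance score.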

Using this AI-assisted approach, the authors found that these geometric invariants are related to the signature, and proposed a conjecture:

image.png
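The conjecture itself survives here only as an image. As a rough numerical illustration, the sketch below assumes (from the linked paper, not from this notebook) that the quantity involved is the natural slope Re(λ/μ), where λ is the longitudinal translation and μ the complex meridional translation, and that the signature roughly tracks half of it. It also assumes that the `meridinal_translation_real` and `meridinal_translation_imag` columns hold the real and imaginary parts of μ. The two rows are the ones printed by `train_df.head(2)`; `natural_slope` is a hypothetical helper.

```python
def natural_slope(longitudinal, meridional_real, meridional_imag):
    """Re(lambda / mu) for a real lambda and a complex mu (assumed definition)."""
    mu = complex(meridional_real, meridional_imag)
    return (longitudinal / mu).real


# The two sample rows shown by train_df.head(2).
rows = [
    {"longitudinal_translation": 10.685555,
     "meridinal_translation_real": -0.519157,
     "meridinal_translation_imag": 1.144192,
     "signature": -2},
    {"longitudinal_translation": 10.453156,
     "meridinal_translation_real": -0.158522,
     "meridinal_translation_imag": 1.320249,
     "signature": 0},
]

slopes = [
    natural_slope(
        row["longitudinal_translation"],
        row["meridinal_translation_real"],
        row["meridinal_translation_imag"],
    )
    for row in rows
]

for row, slope in zip(rows, slopes):
    print(f"signature={row['signature']:>2}  slope/2={slope / 2:.3f}")
```

On these two knots, half the slope lands within about 0.5 of the signature. Two rows prove nothing, but they show the kind of relationship the model picked up on.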

They then proved it. (Dave is not a mathematician, so the details are omitted here; readers interested in the proof can follow the link below.)

THE SIGNATURE AND CUSP GEOMETRY OF HYPERBOLIC KNOTS

drawing

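I am personally very interested in machine learning methods applied to different fields, so I brought DeepMind's research over and translated it, hoping it helps others who share that interest.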
