CVPR 2022 NAS Competition Track 2: 1st-Place Technical Solution

This article presents our 2022 CVPR Track 2 solution, which focuses on predicting architecture performance from small training samples. Pre-processing includes converting the depth encoding, normalization, and a sigmoid transform. In model selection, gradient-boosting algorithms work best, reaching 0.78 after parameter tuning. An attempt at multi-task learning did not pay off; we then ensembled GBRT and other models, with GPNAS as the final estimator, and after further tuning reached a score of 0.7991.


Our CVPR 2022 Track 2 Solution

Accurately predicting the performance of an architecture from a small training sample is an important but difficult task. The core problem is how to extract as much information as possible from the data while avoiding overfitting. When the problem is multi-task, we should also consider whether we can exploit the correlation between tasks.

In this track, the supernet builds a search space based on ViT-Base. The search space contains depth, num_heads, mlp_ratio and embed_dim.

Pre-processing the data

The simplest approach is to convert the structure encoding to floats, train on all the networks directly and predict. The problems with this approach are singularity and that it does not take advantage of background information. Therefore, we pre-process the data with the following steps.

  • Transform the depth encoding (j, k, l) to the integers (1, 2, 3). Here we use ordinal encoding rather than one-hot encoding, since we assume depth has some correlation with the performance we predict; the experiments also confirm this assumption.
  • If the actual depth of a sub-network is less than 12, the trailing end of its encoding is padded with 0. We replace this 0 with 2: since the real inputs are 1, 2 and 3, the value 2 carries neutral information. After this change, the correlation between the depth-encoding feature and the other features drops considerably, which helps the later fitting. We then normalize the data by mapping (1, 2, 3) to (-1, 0, 1).
  • Use the sigmoid function as the link between ranks and regression targets. As we know, ranks follow a uniform distribution; applying the inverse sigmoid maps this uniform distribution to a roughly Gaussian one, which suits most models best. After obtaining a prediction, we apply the sigmoid again to map the output back to a uniform rank scale and round to the nearest integer (a small sketch of this transform follows this list).
  • We also tried many other ideas, such as different activation functions and adding Gaussian noise, but none of them clearly improved results across the different tasks.
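To make the rank transform concrete, here is a minimal sketch, assuming ranks run from 0 to n-1 for the n = 500 training architectures (the helper names are illustrative; the same expressions appear inline in the notebook cells below):

import numpy as np

def rank_to_target(rank, n):
    # inverse sigmoid (logit): roughly uniform ranks -> bell-shaped regression targets
    return np.log((rank + 1) / (n - rank))

def target_to_rank(pred, n):
    # sigmoid back to the rank scale [0, n-1], then round to the nearest integer
    return np.round((n - 1) / (1 + np.exp(-pred)))

ranks = np.arange(500)               # 500 training architectures
y = rank_to_target(ranks, 500)       # targets fed to the regressors
back = target_to_rank(y, 500)        # approximately recovers the original ranks

At prediction time the same sigmoid mapping is applied, with n set to the number of test architectures.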

Model selection

What we did and how we picked our model.

  • Using only the GPNAS model (the baseline, or a model modified from the baseline), the final score is about 0.67.
  • We then tried other machine-learning models such as linear regression, random forests, kernel ridge regression, tree models, XGBoost, gradient boosting and so on. We found that gradient boosting and other boosting-based algorithms give the best average result. Our first attempt scored about 0.73; after tuning parameters we reached 0.78.
  • Since our problem is multi-task, could we train the different tasks at the same time? We considered multivariate gradient boosting based on a multi-output loss (MultiRMSE). However, the result is worse than the univariate gradient-boosting method. We think the reason is that different tasks need different hyperparameters, and we cannot reduce the noise in multivariate gradient boosting effectively.
  • Since our training sample is very small, we ensemble our models to avoid overfitting. We chose GBRT, HISTGB, CATGB and XGB as sub-models. The obvious approach is to average all of their outputs, which improves the score to about 0.79. We then tried to go beyond naive averaging.
  • We tried bagging the gradient-boosting models, which does not improve the result. We then stack the group of gradient-boosting models under a final estimator, and the final estimator we use is GPNAS (a minimal sketch of the stacking idea follows this list). Stacking uses the strength of each individual estimator by feeding their outputs into a final estimator. Since sklearn already provides a stacking regressor, we wrapped GPNAS as a sklearn-compatible API. This brings the result to about 0.792.
  • Up to this point, all sub-models of the stack still used the same hyperparameters, such as loss function, learning rate and depth. We then selected different loss functions, learning rates and depths for different models, assuming that different sub-models and tasks have quite different characteristics, and we added two more sub-models, a GBRT and a CATGB with different loss functions. This round improves the result to about 0.798.
  • Finally, we also modified and tuned the final estimator GPNAS, for example its ridge parameter, and instead of the identity matrix we chose inv(X.T*X) as the prior covariance matrix. The final score is 0.7991.
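As a rough illustration of the stacking step described above, the sketch below uses only two boosting sub-models and a plain Ridge final estimator standing in for GPNAS; the actual pipeline, with seven sub-models per task and the GPNAS_API final estimator, appears in the cells that follow.

import numpy as np
from sklearn.ensemble import StackingRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge

# toy data, just to show the shape of the pipeline
X_toy = np.random.randn(100, 25)
y_toy = np.random.randn(100)

stack = StackingRegressor(
    estimators=[
        ('GBRT_huber', GradientBoostingRegressor(loss='huber', max_depth=1, random_state=1)),
        ('GBRT_sq', GradientBoostingRegressor(max_depth=1, random_state=1)),
    ],
    final_estimator=Ridge(),  # replaced by GPNAS_API(icov=1) in the actual solution
    passthrough=False,
)
stack.fit(X_toy, y_toy)
print(stack.predict(X_toy[:3]))

Each sub-model's cross-validated predictions become the features seen by the final estimator, which is why GPNAS only needs to expose fit/predict (plus get_params/set_params for cloning) in order to be plugged in.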
In [12]
#----------------------------------------------------------------------------------------------------------
#   Install Packages
#----------------------------------------------------------------------------------------------------------
# If a persistent installation is required,
# you need to use the persistence path as follows:
#!mkdir /home/aistudio/external-libraries
#!pip install --upgrade sklearn -i https://mirrors.aliyun.com/pypi/simple/ -t /home/aistudio/external-libraries
#!conda install lightgbm
#!conda install xgboost -i https://mirrors.aliyun.com/pypi/simple/
!pip install catboost -i https://mirrors.aliyun.com/pypi/simple/

#----------------------------------------------------------------------------------------------------------
#   Import Packages
#----------------------------------------------------------------------------------------------------------
import matplotlib.pyplot as plt
%matplotlib inline
import catboost
import lightgbm
import xgboost
import sklearn
from sklearn import ensemble
from sklearn.experimental import enable_hist_gradient_boosting
from sklearn.ensemble import *      # HistGradientBoostingRegressor, StackingRegressor, BaggingRegressor, ExtraTreesRegressor
from sklearn.kernel_ridge import *
from sklearn.linear_model import *  # LinearRegression
from sklearn.semi_supervised import *
from sklearn.svm import *
from sklearn.metrics import mean_squared_error
import numpy as np
import scipy
import copy
import json
from sklearn.model_selection import cross_val_score
from scipy.linalg import hankel
print(sklearn.__version__)
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: catboost in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.0.6)
Requirement already satisfied: graphviz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (0.13)
Requirement already satisfied: plotly in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (5.8.0)
Requirement already satisfied: pandas>=0.24.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.1.5)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (2.2.3)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.16.0)
Requirement already satisfied: scipy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.6.3)
Requirement already satisfied: numpy>=1.16.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.20.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pandas>=0.24.0->catboost) (2.8.2)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pandas>=0.24.0->catboost) (2019.3)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->catboost) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->catboost) (3.0.8)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->catboost) (1.1.0)
Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from plotly->catboost) (8.0.1)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->catboost) (56.2.0)
WARNING: You are using pip version 22.0.4; however, version 22.1.1 is available.
You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.
0.24.2
In [2]
#----------------------------------------------------------------------------------------------------------
#   Load training and testing data
#----------------------------------------------------------------------------------------------------------
def convert_X(arch_str):
    temp_arch = []
    for i, elm in enumerate(arch_str):
        if i in [3,6,9,12,15,18,21,24,27,30,33,36]:
            pass  # drop the non-informative columns, all of them are 768
        elif elm == 'j':
            temp_arch.append(1-2)  # map depth codes j/k/l to 1/2/3, then center by subtracting 2
        elif elm == 'k':
            temp_arch.append(2-2)
        elif elm == 'l':
            temp_arch.append(3-2)
        elif int(elm) == 0:
            temp_arch.append(2-2)  # treat padding 0 as 2 (neutral information, reduces correlation), then center
        else:
            temp_arch.append(int(elm)-2)  # center the data
    return temp_arch

with open('./data/data134077/CVPR_2022_NAS_Track2_train.json', 'r') as f:
    train_data = json.load(f)
with open('./data/data134077/CVPR_2022_NAS_Track2_test.json', 'r') as f:
    test_data = json.load(f)

test_arch_list = []
for key in test_data.keys():
    test_arch = convert_X(test_data[key]['arch'])
    test_arch_list.append(test_arch)
bb = np.array(test_arch_list).T

train_list = [[],[],[],[],[],[],[],[]]
arch_list_train = []
name_list = ['cplfw_rank', 'market1501_rank', 'dukemtmc_rank', 'msmt17_rank', 'veri_rank', 'vehicleid_rank', 'veriwild_rank', 'sop_rank']
for key in train_data.keys():
    for idx, name in enumerate(name_list):
        train_list[idx].append(train_data[key][name])
    arch_list_train.append(convert_X(train_data[key]['arch']))

Y_all0 = np.array(train_list)
Y_all = np.log((Y_all0+1)/(500-Y_all0))  # map rank targets to the real line with the inverse sigmoid (logit)
In [3]
#----------------------------------------------------------------------------------------------------------
#   Correlation research
#----------------------------------------------------------------------------------------------------------
regcoef = np.round(np.array([LinearRegression().fit(np.array(arch_list_train), Y_all[i]).coef_ for i in range(8)]), 2)
print('Task Corr: ')
print(np.round(np.corrcoef(regcoef), 2))  # correlation between the 8 tasks, measured via their linear-regression coefficients
print('Parameters Corr: ')
print(np.round(np.corrcoef(np.array(arch_list_train).T.astype('float')), 2))  # correlation between the architecture parameters; it is very low
Task Corr: 
[[ 1.    0.18  0.12 -0.17 -0.29 -0.3  -0.3  -0.13]
 [ 0.18  1.    0.78  0.81  0.52  0.3   0.53  0.4 ]
 [ 0.12  0.78  1.    0.85  0.62  0.2   0.57  0.52]
 [-0.17  0.81  0.85  1.    0.8   0.42  0.82  0.66]
 [-0.29  0.52  0.62  0.8   1.    0.46  0.77  0.48]
 [-0.3   0.3   0.2   0.42  0.46  1.    0.52  0.12]
 [-0.3   0.53  0.57  0.82  0.77  0.52  1.    0.69]
 [-0.13  0.4   0.52  0.66  0.48  0.12  0.69  1.  ]]
Parameters Corr: 
[[ 1.    0.   -0.02  0.02 -0.09 -0.01  0.03 -0.02  0.09  0.01  0.   -0.05
   0.1  -0.04 -0.05 -0.02  0.04  0.01  0.   -0.08 -0.03  0.05  0.07  0.06
  -0.11]
 [ 0.    1.    0.01 -0.05  0.05  0.12  0.    0.02 -0.01 -0.02  0.01  0.1
   0.05 -0.03 -0.01  0.    0.03 -0.09 -0.01 -0.07  0.02  0.02  0.03  0.05
  -0.07]
 [-0.02  0.01  1.    0.01  0.03  0.01  0.1  -0.06  0.05  0.04  0.01  0.
  -0.01 -0.03 -0.01 -0.08 -0.01  0.03  0.05  0.04  0.01  0.03 -0.01  0.05
   0.03]
 [ 0.02 -0.05  0.01  1.    0.05 -0.03  0.01  0.   -0.02 -0.03 -0.08  0.08
  -0.04 -0.01  0.04 -0.01  0.04  0.03  0.01 -0.03 -0.02 -0.01  0.04  0.05
   0.09]
 [-0.09  0.05  0.03  0.05  1.   -0.02  0.04 -0.06  0.05  0.01 -0.03  0.
  -0.04 -0.03 -0.04  0.01  0.03 -0.03  0.07  0.08  0.04  0.03  0.03  0.
  -0.03]
 [-0.01  0.12  0.01 -0.03 -0.02  1.    0.02 -0.04 -0.04 -0.02 -0.06 -0.
   0.01  0.05  0.01 -0.09  0.   -0.05 -0.08 -0.04  0.01  0.08  0.07 -0.04
  -0.03]
 [ 0.03  0.    0.1   0.01  0.04  0.02  1.   -0.03  0.01 -0.02 -0.02  0.01
  -0.05 -0.04 -0.03  0.02 -0.04 -0.06 -0.03  0.04 -0.03  0.01 -0.02  0.
   0.01]
 [-0.02  0.02 -0.06  0.   -0.06 -0.04 -0.03  1.    0.01  0.04 -0.01 -0.09
   0.07  0.07 -0.01  0.05 -0.    0.05  0.05 -0.11 -0.02 -0.04 -0.   -0.04
   0.02]
 [ 0.09 -0.01  0.05 -0.02  0.05 -0.04  0.01  0.01  1.   -0.06 -0.03 -0.04
   0.06  0.05 -0.09  0.   -0.01  0.01  0.03  0.08 -0.01 -0.03  0.1   0.02
   0.08]
 [ 0.01 -0.02  0.04 -0.03  0.01 -0.02 -0.02  0.04 -0.06  1.   -0.05 -0.06
   0.02 -0.06 -0.06  0.   -0.04 -0.13 -0.03 -0.06 -0.03 -0.01 -0.02  0.03
  -0.06]
 [ 0.    0.01  0.01 -0.08 -0.03 -0.06 -0.02 -0.01 -0.03 -0.05  1.   -0.04
  -0.06  0.   -0.02 -0.06  0.03 -0.04  0.07 -0.09  0.02 -0.01 -0.04  0.01
  -0.02]
 [-0.05  0.1   0.    0.08  0.   -0.    0.01 -0.09 -0.04 -0.06 -0.04  1.
   0.02  0.02  0.09 -0.02 -0.04  0.02  0.    0.02  0.03 -0.03  0.03 -0.04
  -0.03]
 [ 0.1   0.05 -0.01 -0.04 -0.04  0.01 -0.05  0.07  0.06  0.02 -0.06  0.02
   1.   -0.07 -0.01  0.09 -0.06  0.06  0.03  0.01 -0.   -0.03 -0.06 -0.06
   0.02]
 [-0.04 -0.03 -0.03 -0.01 -0.03  0.05 -0.04  0.07  0.05 -0.06  0.    0.02
  -0.07  1.    0.02  0.04  0.    0.05 -0.   -0.04 -0.06 -0.05  0.06 -0.05
  -0.07]
 [-0.05 -0.01 -0.01  0.04 -0.04  0.01 -0.03 -0.01 -0.09 -0.06 -0.02  0.09
  -0.01  0.02  1.    0.02  0.03  0.03  0.02 -0.04  0.03  0.01 -0.05 -0.05
  -0.01]
 [-0.02  0.   -0.08 -0.01  0.01 -0.09  0.02  0.05  0.    0.   -0.06 -0.02
   0.09  0.04  0.02  1.   -0.05  0.05  0.04 -0.04 -0.02 -0.04  0.01  0.03
  -0.03]
 [ 0.04  0.03 -0.01  0.04  0.03  0.   -0.04 -0.   -0.01 -0.04  0.03 -0.04
  -0.06  0.    0.03 -0.05  1.    0.06 -0.   -0.04  0.04 -0.02  0.03  0.02
   0.08]
 [ 0.01 -0.09  0.03  0.03 -0.03 -0.05 -0.06  0.05  0.01 -0.13 -0.04  0.02
   0.06  0.05  0.03  0.05  0.06  1.    0.    0.04 -0.06 -0.02  0.05 -0.
  -0.06]
 [ 0.   -0.01  0.05  0.01  0.07 -0.08 -0.03  0.05  0.03 -0.03  0.07  0.
   0.03 -0.    0.02  0.04 -0.    0.    1.   -0.03 -0.02  0.03 -0.03  0.03
  -0.08]
 [-0.08 -0.07  0.04 -0.03  0.08 -0.04  0.04 -0.11  0.08 -0.06 -0.09  0.02
   0.01 -0.04 -0.04 -0.04 -0.04  0.04 -0.03  1.   -0.    0.07 -0.05 -0.1
  -0.  ]
 [-0.03  0.02  0.01 -0.02  0.04  0.01 -0.03 -0.02 -0.01 -0.03  0.02  0.03
  -0.   -0.06  0.03 -0.02  0.04 -0.06 -0.02 -0.    1.    0.02  0.03 -0.03
   0.05]
 [ 0.05  0.02  0.03 -0.01  0.03  0.08  0.01 -0.04 -0.03 -0.01 -0.01 -0.03
  -0.03 -0.05  0.01 -0.04 -0.02 -0.02  0.03  0.07  0.02  1.   -0.1   0.05
  -0.11]
 [ 0.07  0.03 -0.01  0.04  0.03  0.07 -0.02 -0.    0.1  -0.02 -0.04  0.03
  -0.06  0.06 -0.05  0.01  0.03  0.05 -0.03 -0.05  0.03 -0.1   1.   -0.04
  -0.05]
 [ 0.06  0.05  0.05  0.05  0.   -0.04  0.   -0.04  0.02  0.03  0.01 -0.04
  -0.06 -0.05 -0.05  0.03  0.02 -0.    0.03 -0.1  -0.03  0.05 -0.04  1.
   0.05]
 [-0.11 -0.07  0.03  0.09 -0.03 -0.03  0.01  0.02  0.08 -0.06 -0.02 -0.03
   0.02 -0.07 -0.01 -0.03  0.08 -0.06 -0.08 -0.    0.05 -0.11 -0.05  0.05
   1.  ]]
In [13]
#----------------------------------------------------------------------------------------------------------
#   Modify the GPNAS code into a sklearn-style API and add some more functions
#----------------------------------------------------------------------------------------------------------
__all__ = ["GPNAS_API"]

class GPNAS_API(object):
    _estimator_type = "regressor"

    def __init__(self, cov_w=None, w=None, c_flag=2, m_flag=2, hp_mat=0.0000001, hp_cov=0.01, icov=1):
        self.hp_mat = hp_mat
        self.hp_cov = hp_cov
        self.cov_w = cov_w
        self.w = w
        self.c_flag = c_flag
        self.m_flag = m_flag
        self.icov = icov  # whether we use inv(X.T*X) as the prior covariance

    def get_params(self, deep=True):
        return {
            "hp_mat": self.hp_mat,
            "hp_cov": self.hp_cov,
            "cov_w": self.cov_w,
            "w": self.w,
            "c_flag": self.c_flag,
            "m_flag": self.m_flag,
            "icov": self.icov,
        }

    def set_params(self, **parameters):
        for parameter, value in parameters.items():
            setattr(self, parameter, value)
        return self

    def _get_corelation(self, mat1, mat2):
        """
        two typical kernel functions
        automatic kernel hyperparameter estimation to be added
        """
        mat_diff = abs(mat1 - mat2)
        if self.c_flag == 1:
            return 0.5 * np.exp(-np.dot(mat_diff, mat_diff) / 16)
        elif self.c_flag == 2:
            return 1 * np.exp(-np.sqrt(np.dot(mat_diff, mat_diff)) / 12)

    def _preprocess_X(self, X):
        """
        preprocess the input features / tokens of the architecture
        more complicated preprocessing such as nonlinear transformations can be added
        """
        X = X.tolist()
        p_X = copy.deepcopy(X)
        for feature in p_X:
            feature.append(1)
        return p_X

    def _get_cor_mat(self, X):
        """get kernel matrix"""
        X = np.array(X)
        l = X.shape[0]
        cor_mat = []
        for c_idx in range(l):
            col = []
            c_mat = X[c_idx].copy()
            for r_idx in range(l):
                r_mat = X[r_idx].copy()
                temp_cor = self._get_corelation(c_mat, r_mat)
                col.append(temp_cor)
            cor_mat.append(col)
        return np.mat(cor_mat)

    def _get_cor_mat_joint(self, X, X_train):
        """get kernel matrix between X and X_train"""
        X = np.array(X)
        X_train = np.array(X_train)
        l_c = X.shape[0]
        l_r = X_train.shape[0]
        cor_mat = []
        for c_idx in range(l_c):
            col = []
            c_mat = X[c_idx].copy()
            for r_idx in range(l_r):
                r_mat = X_train[r_idx].copy()
                temp_cor = self._get_corelation(c_mat, r_mat)
                col.append(temp_cor)
            cor_mat.append(col)
        return np.mat(cor_mat)

    def fit(self, X, y):
        self.get_initial_mean(X[0::2], y[0::2])
        self.get_initial_cov(X)
        # update (train) the hyperparameters of the GPNAS predictor
        self.get_posterior_mean(X[1::2], y[1::2])

    def predict(self, X):
        X = self._preprocess_X(X)
        X = np.mat(X)
        # print('beta', self.w.flatten())
        return X * self.w

    def get_predict(self, X):
        """
        get the prediction for network architecture X
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        return X * self.w

    def get_predict_jiont(self, X, X_train, Y_train):
        """
        get the prediction for network architecture X conditioned on X_train and Y_train
        """
        X = np.mat(X)
        X_train = np.mat(X_train)
        Y_train = np.mat(Y_train)
        m_X = self.get_predict(X)
        m_X_train = self.get_predict(X_train)
        mat_train = self._get_cor_mat(X_train)
        mat_joint = self._get_cor_mat_joint(X, X_train)
        return m_X + mat_joint * np.linalg.inv(
            mat_train + self.hp_mat * np.eye(X_train.shape[0])) * (Y_train.T - m_X_train)

    def get_initial_mean(self, X, Y):
        """
        get initial mean of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        self.w = np.linalg.inv(X.T * X + self.hp_mat * np.eye(X.shape[1])) * X.T * Y.T  # inv(X.T*X)*X.T*Y as the initial mean
        print('Variance', np.var(Y - X * self.w))  # show the variance of the residual, which we use to tune self.hp_cov
        return self.w

    def get_initial_cov(self, X):
        """
        get initial covariance matrix of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        if self.icov == 1:    # use inv(X.T*X) as the initial covariance
            self.cov_w = self.hp_cov * np.linalg.inv(X.T * X)
        elif self.icov == 0:  # use the identity matrix as the initial covariance
            self.cov_w = self.hp_cov * np.eye(X.shape[1])
        else:
            assert 0, 'not available yet'
        return self.cov_w

    def get_posterior_mean(self, X, Y):
        """
        get posterior mean of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        cov_mat = self._get_cor_mat(X)
        if self.m_flag == 1:
            self.w = self.w + self.cov_w * X.T * np.linalg.inv(
                np.linalg.inv(cov_mat + self.hp_mat * np.eye(X.shape[0]))
                + X * self.cov_w * X.T + self.hp_mat * np.eye(X.shape[0])) * (Y.T - X * self.w)
        else:
            self.w = np.linalg.inv(
                X.T * np.linalg.inv(cov_mat + self.hp_mat * np.eye(X.shape[0])) * X
                + np.linalg.inv(self.cov_w + self.hp_mat * np.eye(X.shape[1]))
                + self.hp_mat * np.eye(X.shape[1])) * (
                    X.T * np.linalg.inv(cov_mat + self.hp_mat * np.eye(X.shape[0])) * Y.T
                    + np.linalg.inv(self.cov_w + self.hp_mat * np.eye(X.shape[1])) * self.w)
        return self.w

    def get_posterior_cov(self, X, Y):
        """
        get posterior covariance matrix of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        cov_mat = self._get_cor_mat(X)
        self.cov_mat = np.linalg.inv(
            np.linalg.inv(X.T * cov_mat * X + self.hp_mat * np.eye(X.shape[1]))
            + np.linalg.inv(self.cov_w + self.hp_mat * np.eye(X.shape[1]))
            + self.hp_mat * np.eye(X.shape[1]))
        return self.cov_mat
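Before plugging GPNAS_API into the stacking pipeline, it can be exercised on its own. A minimal, hypothetical usage sketch (X_demo and y_demo are random stand-in data, not competition data):

import numpy as np

X_demo = np.random.randint(-1, 2, size=(100, 25)).astype(float)   # stand-in architecture encodings in {-1, 0, 1}
y_demo = np.random.randn(100)                                      # stand-in (logit-transformed) rank targets

gp = GPNAS_API(c_flag=2, m_flag=2, hp_mat=0.5, hp_cov=3, icov=1)   # icov=1: inv(X.T*X) prior; icov=0: identity prior
gp.fit(X_demo, y_demo)           # even rows set the initial mean, odd rows drive the posterior update
y_hat = gp.predict(X_demo[:5])   # returns an np.matrix of shape (5, 1)
print(np.asarray(y_hat).ravel())

In the cells below, icov=1 (the inv(X.T*X) prior) is used for Result 1 and icov=0 (the identity prior) for Result 2.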
In [14]
#----------------------------------------------------------------------------------------------------------
#   Set hyperparameters and build the boosting sub-models for each of the 8 tasks
#----------------------------------------------------------------------------------------------------------
max_iter = [10000,10000,10000,10000,10000,10000,10000,10000]

#learning_rate = [0.008,0.038,0.032,0.02,0.025,0.012,0.025,0.006]
#learning_rate = [0.01,0.04,0.04,0.04,0.02,0.02,0.04,0.001]
learning_rate = [0.005,0.038,0.035,0.03,0.025,0.01,0.03,0.01]  # final learning rates
max_depth = [1,3,2,2,2,3,1,3]    # depth for GBRT(huber), CATGB(MSE), GBRT2(MSE), CATGB2(huber)
max_depth2 = [1,1,1,1,1,1,1,1]   # depth for HISTGB, LIGHTGB, XGB
list_est = []

model_GBRT, model_HISTGB, model_CATGB, model_LIGHTGB, model_XGB, model_GBRT2, model_CATGB2 = [],[],[],[],[],[],[]
for i in range(8):
    params_GBRT = {
        "n_estimators": max_iter[i],
        "max_depth": max_depth[i],
        "subsample": .8,
        "learning_rate": learning_rate[i],
        "loss": 'huber',
        "max_features": 'sqrt',
        "random_state": 1,
    }
    model_GBRT.append(ensemble.GradientBoostingRegressor(**params_GBRT))

    params_HISTGB = {
        "max_depth": max_depth2[i],
        "max_iter": max_iter[i],
        "learning_rate": learning_rate[i],
        "loss": 'least_squares',
        "max_leaf_nodes": 31,
        "min_samples_leaf": 5,
        "l2_regularization": 5,
        "random_state": 1,
    }
    model_HISTGB.append(HistGradientBoostingRegressor(**params_HISTGB))

    model_CATGB.append(catboost.CatBoostRegressor(iterations=max_iter[i],
                             learning_rate=learning_rate[i],
                             depth=max_depth[i],
                             silent=True,
                             task_type="CPU",
                             loss_function='RMSE',
                             eval_metric='RMSE',
                             random_seed=1,
                             od_type='Iter',
                             metric_period=75,
                             od_wait=100,
                             ))

    model_LIGHTGB.append(lightgbm.LGBMRegressor(boosting_type='gbdt', learning_rate=learning_rate[i], num_leaves=31,
        max_depth=max_depth2[i], alpha=0.1, n_estimators=max_iter[i], random_state=1))

    model_XGB.append(xgboost.XGBRegressor(learning_rate=learning_rate[i], tree_method='auto',
        max_depth=max_depth2[i], alpha=0.8, n_estimators=max_iter[i], random_state=1))

    params_GBRT2 = {
        "n_estimators": max_iter[i],
        "max_depth": max_depth[i],
        "subsample": .8,
        "learning_rate": learning_rate[i],
        "loss": 'ls',
        "max_features": 'log2',
        "random_state": 1,
    }
    model_GBRT2.append(ensemble.GradientBoostingRegressor(**params_GBRT2))

    model_CATGB2.append(catboost.CatBoostRegressor(iterations=max_iter[i],
                             learning_rate=learning_rate[i],
                             depth=max_depth[i],
                             silent=True,
                             task_type="CPU",
                             loss_function='Huber:delta=2',
                             eval_metric='Huber:delta=2',
                             random_seed=1,
                             od_type='Iter',
                             metric_period=75,
                             od_wait=100,
                             l2_leaf_reg=1,
                             subsample=0.8,
                             ))

for i in range(8):
    list_est.append([
        ('GBRT', model_GBRT[i]),
        ('HISTGB', model_HISTGB[i]),
        ('CATGB', model_CATGB[i]),
        ('LIGHTGB', model_LIGHTGB[i]),
        ('XGB', model_XGB[i]),
        ('GBRT2', model_GBRT2[i]),
        ('CATGB2', model_CATGB2[i]),
    ])
In [6]
#----------------------------------------------------------------------------------------------------------
#   In-sample training and testing, just for research and parameter selection
#----------------------------------------------------------------------------------------------------------
#X_all_k = np.array(arch_list_train)
#X_val = np.array(test_arch_list)
#print(np.array(test_arch_list).shape, X_val.shape)
#train_num = 400
#gb_list = []
#for i in range(8):
#    model_final = StackingRegressor(estimators=list_est[i], final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat=hp_list[i], hp_cov=3, icov=1), passthrough=False, n_jobs=4)
#    Y_all_k = Y_all[i]
#    X_train_k, Y_train_k, X_test_k, Y_test_k = X_all_k[0:train_num:1], Y_all_k[0:train_num:1], X_all_k[train_num::1], Y_all_k[train_num::1]
#    model_final.fit(X_train_k, Y_train_k)
#    gb_list.append(copy.copy(model_final))
#    print('Kendalltau:', i, scipy.stats.stats.kendalltau(model_final.predict(X_test_k), Y_test_k))
In [7]
#----------------------------------------------------------------------------------------------------------
#   Plot training and testing params result to help make prediction.
#----------------------------------------------------------------------------------------------------------
#i = 0
#params = gb_list[i].get_params()
#itr = "n_estimators"
#test_score0 = np.zeros((params[itr],), dtype=np.float64)
#test_score1 = np.zeros((params[itr],), dtype=np.float64)
#for i, y_pred0 in enumerate(model_gb.staged_predict(X_train_k)):
#    test_score0[i] = scipy.stats.stats.kendalltau(Y_train_k, y_pred0)[0]
#for i, y_pred1 in enumerate(model_gb.staged_predict(X_test_k)):
#    test_score1[i] = scipy.stats.stats.kendalltau(Y_test_k, y_pred1)[0]
#
#fig = plt.figure(figsize=(6, 6))
#plt.subplot(1, 1, 1)
#plt.title("Deviance")
#plt.plot(
#    np.arange(params[itr])[:] + 1,
#    test_score0[:],
#    "b-",
#    label="Training Set Deviance",
#)
#plt.plot(np.arange(params[itr])[:] + 1, test_score1[:], "r-", label="Test Set Deviance")
#plt.legend(loc="upper right")
#plt.xlabel("Boosting Iterations")
#plt.ylabel("Deviance")
#fig.tight_layout()
#plt.show()
In [8]
#----------------------------------------------------------------------------------------------------------
#   Feature Importance
#----------------------------------------------------------------------------------------------------------
#from sklearn.inspection import permutation_importance
#for model_gb in gb_list:
#    feature_importance = model_gb.feature_importances_
#    sorted_idx = np.argsort(feature_importance)
#    pos = np.arange(sorted_idx.shape[0]) + 0.5
#    fig = plt.figure(figsize=(6, 4))
#    plt.barh(pos, feature_importance[sorted_idx], align="center")
#    plt.yticks(pos, np.array(range(len(feature_importance)))[sorted_idx])
#    plt.title("Feature Importance ")
In [9]
#----------------------------------------------------------------------------------------------------------
#   Result 1: use GPNAS with inv(X.T*X) as the initial covariance prior to stack
#   'GBRT','HISTGB','CATGB','LIGHTGB','XGB','GBRT2','CATGB2' with cross validation
#----------------------------------------------------------------------------------------------------------
X_train_k = np.array(arch_list_train)
X_val = np.array(test_arch_list)

rank_all1 = []
for i in range(len(list_est)):
    print('No: ', i)
    # stack the different regressors by GPNAS with cross validation
    model_final = StackingRegressor(estimators=list_est[i], final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat=0.5, hp_cov=3, icov=1), passthrough=False, n_jobs=4)
    Y_train_k = Y_all[i]
    model_final.fit(X_train_k, Y_train_k)
    zz = np.round((X_val.shape[0]-1)/(1 + np.exp(-1*model_final.predict(X_val))))  # transfer back to ranks by the sigmoid function
    print(zz[0])
    rank_all1.append(zz)
No:  0
Variance 3.3439391282630595
KeyboardInterrupt: execution of this cell was interrupted during prediction in this run (full traceback omitted); the submitted Result 1 comes from a local run, as noted in the save cell below.
In [ ]
#----------------------------------------------------------------------------------------------------------
#   Save Result 1 (my result was run locally, so it is a little different from the output here)
#----------------------------------------------------------------------------------------------------------
for idx, key in enumerate(test_data.keys()):
    #print(key)
    test_data[key]['cplfw_rank'] = int(rank_all1[0][idx])
    test_data[key]['market1501_rank'] = int(rank_all1[1][idx])
    test_data[key]['dukemtmc_rank'] = int(rank_all1[2][idx])
    test_data[key]['msmt17_rank'] = int(rank_all1[3][idx])
    test_data[key]['veri_rank'] = int(rank_all1[4][idx])
    test_data[key]['vehicleid_rank'] = int(rank_all1[5][idx])
    test_data[key]['veriwild_rank'] = int(rank_all1[6][idx])
    test_data[key]['sop_rank'] = int(rank_all1[7][idx])
print('Ready to save results!')
with open('./CVPR_2022_NAS_Track2_submit_ACCNAS_1.json', 'w') as f:
    json.dump(test_data, f)
In [10]
#----------------------------------------------------------------------------------------------------------
#   Result 2: use GPNAS with the identity matrix as the initial covariance prior to stack
#   'GBRT','HISTGB','CATGB','LIGHTGB','XGB','GBRT2','CATGB2' with cross validation
#----------------------------------------------------------------------------------------------------------
X_train_k = np.array(arch_list_train)
X_val = np.array(test_arch_list)

rank_all2 = []
for i in range(len(list_est)):
    print('No: ', i)
    # stack the different regressors by GPNAS with cross validation
    model_final = StackingRegressor(estimators=list_est[i], final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat=0.5, hp_cov=0.01, icov=0), passthrough=False, n_jobs=4)
    Y_train_k = Y_all[i]
    model_final.fit(X_train_k, Y_train_k)
    zz = np.round((X_val.shape[0]-1)/(1 + np.exp(-1*model_final.predict(X_val))))  # transfer back to ranks by the sigmoid function
    print(zz[0])
    rank_all2.append(zz)
No:  0
Variance 3.3439391282630595
[[42459.]]
No:  1
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py:706: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
  "timeout or by a memory leak.", UserWarning
Variance 6.0485427006730506
[[28879.]]
No:  2
Variance 6.006150913480946
[[47402.]]
No:  3
Variance 6.103473245508546
[[72536.]]
No:  4
Variance 6.025486965953786
[[92206.]]
No:  5
Variance 5.8364155067895815
[[70782.]]
No:  6
Variance 6.52441906729945
[[92304.]]
No:  7
Variance 5.725986746928656
[[90033.]]
In [11]
#----------------------------------------------------------------------------------------------------------
#   Save Result 2 (my result was run locally, so it is a little different from the output here)
#----------------------------------------------------------------------------------------------------------
for idx, key in enumerate(test_data.keys()):
    #print(key)
    test_data[key]['cplfw_rank'] = int(rank_all2[0][idx])
    test_data[key]['market1501_rank'] = int(rank_all2[1][idx])
    test_data[key]['dukemtmc_rank'] = int(rank_all2[2][idx])
    test_data[key]['msmt17_rank'] = int(rank_all2[3][idx])
    test_data[key]['veri_rank'] = int(rank_all2[4][idx])
    test_data[key]['vehicleid_rank'] = int(rank_all2[5][idx])
    test_data[key]['veriwild_rank'] = int(rank_all2[6][idx])
    test_data[key]['sop_rank'] = int(rank_all2[7][idx])
print('Ready to save results!')
with open('./CVPR_2022_NAS_Track2_submit_ACCNAS_2.json', 'w') as f:
    json.dump(test_data, f)
Ready to save results!
In [ ]
#----------------------------------------------------------------------------------------------------------
#   Multivariate Gradient Boosting (the result is not that good, final score 0.78759)
#----------------------------------------------------------------------------------------------------------
#from catboost import Pool, CatBoostRegressor
#import numpy as np
#train_num = 400; gb_list = []; rank_list = []
#index_list = [[0],[1],[2],[3],[4],[5],[6],[7]]
#index_list = [[0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7]]    #[50, 17, 43, 18, 3, 10, 28, 31]
#index_list = [[1,0],[1,1],[1,2],[1,3],[1,4],[1,5],[1,6],[1,7]]  #1,1
#index_list = [[1],[1,2],[1,3],[1,4],[1,2],[1,2,3],[1,2,3,4],[1,2,3,4,5]]  #[48, 19, 36, 30, 0, 22, 23, 22]
#index_list = [[2,0],[2,1],[2],[2,3],[2,4],[2,5],[2,6],[2,7]]    #[1, 30, 68, 35, 25, 15, 14, 12]
#index_list = [[3,0],[3,1],[3,2],[3,3],[3,4],[3,5],[3,6],[3,7]]  #3,3
#index_list = [[4,0],[4,1],[4,2],[4,3],[4,4],[4,5],[4,6],[4,7]]  #[0, 25, 27, 31, 70, 8, 23, 16]
#index_list = [[5,0],[5,1],[5,2],[5,3],[5,4],[5,5],[5,6],[5,7]]  #[7, 18, 27, 22, 21, 56, 33, 16]
#index_list = [[6,0],[6,1],[6,2],[6,3],[6,4],[6,5],[6],[6,7]]    #[0, 25, 19, 29, 19, 24, 79, 5]
#index_list = [[7,0],[7,1],[7,2],[7,3],[7,4],[7,5],[7,6],[7]]    #[3, 14, 23, 20, 28, 19, 11, 82]
#xx = np.zeros([len(index_list), 50])
#for j in range(xx.shape[1]):
#    for i in range(xx.shape[0]):
#        md_catboost = CatBoostRegressor(iterations=10000,
#                                 learning_rate=.005,
#                                 depth=2,
#                                 verbose=0,
#                                 #silent=True,
#                                 task_type="CPU",
#                                 l2_leaf_reg=1,
#                                 loss_function='MultiRMSE',
#                                 eval_metric='MultiRMSE',
#                                 random_seed=1,
#                                 bagging_temperature=0.1,
#                                 od_type='Iter',
#                                 metric_period=75,
#                                 od_wait=100)
#        X_all_k, Y_all_k = np.array(arch_list_train).astype('float'), Y_all.astype('float').T[:, index_list[i]]
#        X_train_k, X_test_k, Y_train_k, Y_test_k = sklearn.model_selection.train_test_split(
#            X_all_k, Y_all_k, test_size=0.2, shuffle=True, random_state=j)
#        md_catboost.fit(X_train_k, Y_train_k)
#        y_predict = md_catboost.predict(X_test_k)
#        if len(index_list[i]) == 1:
#            mse = sklearn.metrics.mean_squared_error(y_predict, Y_test_k)
#        else:
#            mse = sklearn.metrics.mean_squared_error(y_predict[:,0], Y_test_k[:,0])
#        print('MSE:', 'i', i, 'j', j, index_list[i], mse)
#        print('Kendalltau:', scipy.stats.stats.kendalltau(y_predict[:,0], Y_test_k[:,0]))
#        xx[i,j] = np.round(mse, 5)
#
#from catboost import Pool, CatBoostRegressor
#X_train_k, Y_train_k = X_all_k, Y_all_k
#print(X_train_k.shape, Y_train_k.shape, X_test_k.shape, Y_test_k.shape)
#dtrain = Pool(X_train_k, label=Y_train_k)
#dvalid = Pool(X_test_k, label=Y_test_k)
#md_catboost.fit(dtrain, eval_set=dvalid, use_best_model=True, early_stopping_rounds=None)
#y_predict = md_catboost.predict(dvalid)
#print('Kendalltau:', scipy.stats.stats.kendalltau(y_predict, Y_test_k))
In [ ]
#----------------------------------------------------------------------------------------------------------
#   Other experiments we tried (Kendall tau scores)
#----------------------------------------------------------------------------------------------------------
# Only CAT regressor                                                                    Kendalltau: 0.79394
# Only HIST regressor                                                                   Kendalltau: 0.78626
# Only GB regressor                                                                     Kendalltau: 0.79383
# Only XGB regressor                                                                    Kendalltau: 0.78741
# Only LIGHTGB regressor                                                                Kendalltau: 0.78647
# 5 regressors combined, same learning rate and same depth parameters (depth = 1)       Kendalltau: 0.79321
# 5 regressors combined, n_iters = 5000, different learning rate per regressor          Kendalltau: 0.79457
# 5 regressors combined, n_iters = 10000                                                Kendalltau: 0.79696
# 7 regressors combined, hp_mat = 1, learn_rate * 0.8                                   Kendalltau: 0.79608
# 7 regressors combined, hp_mat = 1                                                     Kendalltau: 0.79697
# 7 regressors combined, hp_mat = 1, learn_rate * 2                                     Kendalltau: 0.79732
# 7 regressors combined, hp_mat = 1, tuned learn_rate                                   Kendalltau: 0.79769
# 7 regressors combined, hp_mat = 0.4                                                   Kendalltau: 0.79785
# 7 regressors combined, hp_mat = 0.5, all depth parameters equal to 1                  Kendalltau: 0.79389
# 7 regressors combined, hp_mat = 0.5, tuned depth parameters                           Kendalltau: 0.79788
# 7 regressors combined, hp_mat = 0.5, round the final rank to an integer               Kendalltau: 0.79796
# 7 regressors combined, hp_mat = 0.5, tuned learn_rate after the new depth parameters  Kendalltau: 0.79859
