Traffic Speed Dataset Processing and Adjacency Matrix Generation: A Complete Code Guide Based on METR-LA, PEMS-BAY, PEMSD7L, and PEMSD7M

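This guide covers four traffic speed datasets: METR-LA, PEMS-BAY, PEMSD7L, and PEMSD7M. It includes the raw data, code that splits the data into train, validation, and test sets with a sliding window, and code that generates an adjacency matrix from sensor distances and connectivity. An npz_to_csv script is also included so that the data can be inspected.

Dataset Structure and File Description

File directory structure:

```
traffic_speed_dataset/
├── METR-LA/
│   ├── generate_training_data.py
│   └── metr-la.h5
├── PEMS-BAY/
│   ├── generate_training_data.py
│   └── pems-bay.h5
└── utils/
    ├── npz_to_csv.py
    ├── sliding_window_split.py
    └── adjacency_matrix.py
```

I. Raw Data Loading and Preprocessing

1. generate_training_data.py (METR-LA example)

```python
import h5py


def load_traffic_data(file_path):
    """Read the speed readings and their timestamps from an HDF5 file."""
    with h5py.File(file_path, 'r') as f:
        # The dataset names 'data' and 'date' are taken from this example;
        # if your file uses other names, list them with list(f.keys()).
        data = f['data'][:]
        timestamps = f['date'][:]
    return data, timestamps


if __name__ == '__main__':
    file_path = 'metr-la.h5'
    data, timestamps = load_traffic_data(file_path)
    print(f'Data shape: {data.shape}')
    print(f'Timestamps shape: {timestamps.shape}')
```

II. Splitting Train, Validation, and Test Sets with a Sliding Window

2. sliding_window_split.py

Each sample uses `window_size` consecutive time steps as input and the time step immediately after the window as the target. The samples are then split in chronological order into train, validation, and test sets.

```python
import numpy as np


def sliding_window_split(data, window_size, stride=1, train_ratio=0.6, val_ratio=0.2):
    # Each sample needs window_size input steps plus one target step,
    # so the last valid start index is len(data) - window_size - 1.
    n_samples = (len(data) - window_size - 1) // stride + 1
    X = []
    y = []
    for i in range(0, n_samples * stride, stride):
        X.append(data[i:i + window_size])
        y.append(data[i + window_size])
    X = np.array(X)
    y = np.array(y)

    # Split into train, validation, and test sets in chronological order.
    total_size = len(X)
    train_size = int(total_size * train_ratio)
    val_size = int(total_size * val_ratio)
    X_train, y_train = X[:train_size], y[:train_size]
    X_val, y_val = X[train_size:train_size + val_size], y[train_size:train_size + val_size]
    X_test, y_test = X[train_size + val_size:], y[train_size + val_size:]
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


if __name__ == '__main__':
    data = np.random.rand(1000, 10)  # Example data; replace with real data.
    window_size = 12
    (X_train, y_train), (X_val, y_val), (X_test, y_test) = sliding_window_split(data, window_size)
    print(f'Train set shape: {X_train.shape}, {y_train.shape}')
    print(f'Validation set shape: {X_val.shape}, {y_val.shape}')
    print(f'Test set shape: {X_test.shape}, {y_test.shape}')
```

III. Generating the Adjacency Matrix from Distances and Connectivity

3. adjacency_matrix.py

Distances between sensor pairs are converted to edge weights with a Gaussian kernel, exp(-(d / std)^2), where std is the standard deviation of all known distances. Weights below the threshold `normalized_k` are set to zero to keep the matrix sparse. Sensor pairs that do not appear in the distance file keep an infinite distance and therefore a weight of zero.

```python
import numpy as np
import scipy.sparse as sp  # Imported in the original script; not used below.


def generate_adjacency_matrix(distance_file, sensor_ids, normalized_k=0.1):
    """
    Generate an adjacency matrix from a list of distances.

    :param distance_file: str, path to the txt file containing sensor distances.
    :param sensor_ids: list, list of sensor IDs.
    :param normalized_k: float, entries below this threshold are set to zero.
    :return: np.ndarray, adjacency matrix.
    """
    with open(distance_file, 'r') as f:
        lines = f.readlines()

    num_sensors = len(sensor_ids)
    dist_mx = np.zeros((num_sensors, num_sensors), dtype=np.float32)
    dist_mx[:] = np.inf

    # Map each sensor ID to its row/column index once, instead of
    # calling list.index() for every line.
    sensor_id_to_ind = {sensor_id: i for i, sensor_id in enumerate(sensor_ids)}

    for line in lines:
        # Each line holds four whitespace-separated fields:
        # an index column, the two sensor IDs, and the distance.
        _, sensor_id, neighbor_sensor_id, distance = line.strip().split()
        if sensor_id not in sensor_id_to_ind or neighbor_sensor_id not in sensor_id_to_ind:
            continue
        i = sensor_id_to_ind[sensor_id]
        j = sensor_id_to_ind[neighbor_sensor_id]
        dist_mx[i][j] = float(distance)
        dist_mx[j][i] = float(distance)

    # Gaussian kernel on the finite distances.
    distances = dist_mx[~np.isinf(dist_mx)].flatten()
    std = distances.std()
    adj_mx = np.exp(-np.square(dist_mx / std))

    # Make the adjacency matrix symmetric by taking the max.
    adj_mx = np.maximum.reduce([adj_mx, adj_mx.T])

    # Set entries lower than the threshold, i.e., k, to zero for sparsity.
    adj_mx[adj_mx < normalized_k] = 0
    return adj_mx


if __name__ == '__main__':
    distance_file = 'data/sensor_graph/adj_mx.txt'  # Replace with the path to your distance file.
    sensor_ids = ['sensor_1', 'sensor_2', 'sensor_3']  # Replace with your list of sensor IDs.
    adj_mx = generate_adjacency_matrix(distance_file, sensor_ids)
    print('Adjacency Matrix:\n', adj_mx)
```

IV. Converting .npz Files to .csv Files for Inspection

4. npz_to_csv.py

```python
import numpy as np
import pandas as pd


def npz_to_csv(npz_file, csv_file):
    data = np.load(npz_file)
    # 'arr_0' is the name NumPy assigns to the first array saved
    # without an explicit key; the array must be 1-D or 2-D.
    df = pd.DataFrame(data=data['arr_0'])
    df.to_csv(csv_file, index=False)


if __name__ == '__main__':
    npz_file = 'data.npz'  # Replace with the path to your .npz file.
    csv_file = 'data.csv'  # Path of the output .csv file.
    npz_to_csv(npz_file, csv_file)
    print(f'Converted {npz_file} to {csv_file}')
```

Notes

- Make sure all dependencies are installed: numpy, pandas, scipy, h5py.
- Adjust parameters such as window_size, stride, and normalized_k to match the actual dataset.
- The datasets (METR-LA, PEMS-BAY, PEMSD7L, PEMSD7M) differ in format and characteristics, so parts of the code may need to be adapted for each one.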
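
As one illustration of adapting the scripts, note that npz_to_csv.py reads a single array stored under `arr_0`, while the windowed inputs produced by the sliding-window split are 3-D (samples, window, sensors) and cannot be passed to `pd.DataFrame` directly. The following is a minimal sketch, not part of the original scripts, of saving the splits under explicit keys and flattening any array to 2-D before export. The function names `save_splits` and `npz_array_to_frame` and the key names `x_train`, `y_train`, and so on are hypothetical choices made for this example.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def save_splits(path, train, val, test):
    """Save the (X, y) pair of each split under explicit key names (hypothetical naming)."""
    np.savez_compressed(
        path,
        x_train=train[0], y_train=train[1],
        x_val=val[0], y_val=val[1],
        x_test=test[0], y_test=test[1],
    )


def npz_array_to_frame(npz_file, key):
    """Load one array from an .npz archive as a 2-D DataFrame.

    The first axis (samples) becomes the rows; all remaining axes are
    flattened into columns so that 3-D windows can be written to CSV.
    """
    with np.load(npz_file) as archive:
        arr = archive[key]
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)
    return pd.DataFrame(arr.reshape(arr.shape[0], -1))


if __name__ == "__main__":
    # Synthetic stand-in for real splits: 20 samples, window of 12, 10 sensors.
    rng = np.random.default_rng(0)
    X = rng.random((20, 12, 10))
    y = rng.random((20, 10))

    with tempfile.TemporaryDirectory() as tmp_dir:
        npz_path = os.path.join(tmp_dir, "splits.npz")
        save_splits(npz_path, (X[:12], y[:12]), (X[12:16], y[12:16]), (X[16:], y[16:]))

        frame = npz_array_to_frame(npz_path, "x_train")
        frame.to_csv(os.path.join(tmp_dir, "x_train.csv"), index=False)
        print(f"x_train flattened to {frame.shape}")
```

Flattening keeps one row per sample, which makes the CSV easy to scan, at the cost of losing the window/sensor layout in the column order. If that layout matters, exporting one time step or one sensor at a time may be a better fit.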