A Step-by-Step Guide to Converting DOTA Remote Sensing Annotations to COCO Format (with Complete Python Code)

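Vehicle detection in remote sensing imagery is a core task in scenarios such as smart transportation and port monitoring. DOTA is one of the most influential benchmark datasets in remote sensing, but its annotation format differs significantly from the COCO format commonly used by general object detection frameworks such as MMDetection and Detectron2. This article walks through the conversion logic between the two formats and provides a field-tested Python solution.

1. Why convert the annotation format

DOTA uses oriented bounding box (OBB) annotations: each object is described by the coordinates of its four corner points, which captures the orientation and shape of objects in remote sensing images more precisely. COCO uses horizontal bounding boxes (HBB), stored as the top-left corner plus width and height. The two formats differ in three main ways:

Geometric representation

- DOTA: (x1, y1, x2, y2, x3, y3, x4, y4), four coordinate pairs
- COCO: [x_min, y_min, width, height], in absolute pixel coordinates (not normalized)

Data structure

```
# DOTA annotation example: one object per line
x1 y1 x2 y2 x3 y3 x4 y4 class_name difficulty

# COCO annotation structure
{
  "images": [{"file_name": "1.jpg", "id": 1, ...}],
  "annotations": [{"bbox": [x, y, w, h], "category_id": 1, ...}],
  "categories": [{"id": 1, "name": "car"}, ...]
}
```

Use cases

- DOTA: optimized for aerial imagery, suited to oriented object detection
- COCO: general-purpose detection benchmark, natively supported by mainstream frameworks

Tip: when training detectors such as Faster R-CNN, COCO format is directly compatible with most open-source code bases and data augmentation pipelines. YOLOv5 reads its own normalized txt labels, so it needs one more conversion step from COCO.

2. Core conversion logic and implementation

2.1 Key steps

The conversion has to solve three core problems:

- Coordinate conversion: turn each oriented box into its enclosing horizontal rectangle
- Class mapping: map DOTA class names to COCO category IDs
- File restructuring: move from one annotation file per image to a single JSON file

2.2 Complete conversion code

The following code implements the end-to-end conversion. It skips malformed lines, classes outside the mapping, and images without an annotation file; visual verification follows in section 3.3.

```python
import os
import json
from PIL import Image


class DOTA2COCOConverter:
    def __init__(self, class_mapping=None):
        """
        :param class_mapping: custom dict mapping DOTA class names to COCO category IDs
        """
        self.class_mapping = class_mapping or {
            "small-vehicle": 1,
            "large-vehicle": 2,
            "ship": 3,
        }

    def _get_enclosing_bbox(self, points):
        """Convert an oriented box to its enclosing horizontal box [x, y, w, h]."""
        x_coords = points[::2]
        y_coords = points[1::2]
        x_min, x_max = min(x_coords), max(x_coords)
        y_min, y_max = min(y_coords), max(y_coords)
        return [x_min, y_min, x_max - x_min, y_max - y_min]

    def parse_dota_annotation(self, txt_path):
        """Parse a single DOTA annotation file."""
        annotations = []
        with open(txt_path, "r") as f:
            for line in f:
                if line.strip() == "":
                    continue
                parts = line.strip().split()
                # Skip rows without 8 coordinates and a class name
                # (this also drops any metadata header lines).
                if len(parts) < 9:
                    continue
                points = list(map(float, parts[:8]))
                class_name = parts[8]
                # Parsed for completeness; not written to the COCO output.
                difficulty = int(parts[9]) if len(parts) > 9 else 0
                if class_name not in self.class_mapping:
                    continue
                bbox = self._get_enclosing_bbox(points)
                annotation = {
                    "bbox": bbox,
                    "category_id": self.class_mapping[class_name],
                    "iscrowd": 0,
                    "area": bbox[2] * bbox[3],
                }
                annotations.append(annotation)
        return annotations

    def convert(self, img_dir, ann_dir, output_json):
        """Run the batch conversion."""
        coco_data = {
            "images": [],
            "annotations": [],
            "categories": [
                {"id": cat_id, "name": name}
                for name, cat_id in self.class_mapping.items()
            ],
        }
        annotation_id = 1
        # Sort so that image IDs are assigned in a reproducible order.
        for img_name in sorted(os.listdir(img_dir)):
            if not img_name.lower().endswith((".png", ".jpg", ".jpeg")):
                continue
            img_path = os.path.join(img_dir, img_name)
            base_name = os.path.splitext(img_name)[0]
            txt_path = os.path.join(ann_dir, base_name + ".txt")
            if not os.path.exists(txt_path):
                continue
            with Image.open(img_path) as img:
                width, height = img.size
            image_id = len(coco_data["images"]) + 1
            coco_data["images"].append({
                "id": image_id,
                "file_name": img_name,
                "width": width,
                "height": height,
            })
            annotations = self.parse_dota_annotation(txt_path)
            for ann in annotations:
                ann.update({
                    "image_id": image_id,
                    "id": annotation_id,
                })
                coco_data["annotations"].append(ann)
                annotation_id += 1
        with open(output_json, "w") as f:
            json.dump(coco_data, f, indent=2)
```

3. Practical application and verification

3.1 Typical directory structure

The following directory layout is recommended:

```
DOTA_dataset/
├── images/
│   ├── 0001.png
│   └── 0002.png
└── annotations/
    ├── 0001.txt
    └── 0002.txt
```

3.2 Run the conversion

```python
converter = DOTA2COCOConverter()
converter.convert(
    img_dir="DOTA_dataset/images",
    ann_dir="DOTA_dataset/annotations",
    output_json="coco_annotations.json",
)
```

3.3 Verify the result

Use the COCO API to check the quality of the conversion:

```python
import os
from PIL import Image
from pycocotools.coco import COCO
import matplotlib.pyplot as plt

coco = COCO("coco_annotations.json")
img_ids = coco.getImgIds()
img_info = coco.loadImgs(img_ids[0])[0]
plt.imshow(Image.open(os.path.join("DOTA_dataset/images", img_info["file_name"])))

# Draw every converted box on top of the first image.
ann_ids = coco.getAnnIds(imgIds=img_info["id"])
annotations = coco.loadAnns(ann_ids)
for ann in annotations:
    bbox = ann["bbox"]
    plt.gca().add_patch(plt.Rectangle(
        (bbox[0], bbox[1]), bbox[2], bbox[3],
        fill=False, edgecolor="red", linewidth=2
    ))
plt.show()
```

4. Common problems and optimization tips

4.1 Troubleshooting typical errors

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| JSON file is empty | Wrong path or class mismatch | Check that the paths exist and confirm that class_mapping covers all classes |
| Boxes are offset | Coordinate normalization | Use the original pixel coordinates and do not normalize |
| Category IDs are mixed up | Duplicate entries in the class mapping | Make sure every class in class_mapping has a unique ID |

4.2 Performance tips

Parallel processing: use multiple processes to speed up large datasets.

```python
from multiprocessing import Pool


def process_image(args):
    img_path, ann_path = args
    ...  # processing logic


if __name__ == "__main__":
    # file_pairs: a list of (img_path, ann_path) tuples prepared beforehand
    with Pool(4) as p:
        p.map(process_image, file_pairs)
```

Incremental writing: avoid running out of memory on very large datasets.

```python
with open(output_json, "w") as f:
    f.write('{"images": [], "annotations": [], "categories": [...]}\n')
    # append data in batches
```

Visual checks: during development, manually review about 10% of the samples.

In real vehicle detection projects this conversion usually only needs to run once. Archive the converted COCO file together with the original data and record the conversion parameters in the README so the result can be reproduced later.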
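Part of the manual review can be automated before archiving. The sketch below covers the three failure modes from the troubleshooting table in section 4.1: duplicate category IDs, annotations that point at unknown images or categories, and boxes that are empty or fall outside the image. The function name `check_coco` and the sample data are hypothetical and exist only for illustration; they are not part of pycocotools or of the converter above.

```python
from collections import Counter


def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco.get("images", [])}
    category_ids = [cat["id"] for cat in coco.get("categories", [])]

    # Each category ID should be defined exactly once.
    for cat_id, count in Counter(category_ids).items():
        if count > 1:
            problems.append(f"category id {cat_id} is defined {count} times")

    for ann in coco.get("annotations", []):
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: empty box")
        # Boxes are in pixels, so they are compared against the image size.
        if x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann['id']}: box extends outside the image")
    return problems


# Hypothetical in-memory sample: annotation 2 is deliberately broken.
sample = {
    "images": [{"id": 1, "file_name": "0001.png", "width": 100, "height": 100}],
    "categories": [{"id": 1, "name": "small-vehicle"}, {"id": 2, "name": "large-vehicle"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20]},
        {"id": 2, "image_id": 1, "category_id": 3, "bbox": [90, 90, 20, 5]},
    ],
}
problems = check_coco(sample)
for problem in problems:
    print(problem)
```

To run it on a converted file, load the file with `json.load` and pass the resulting dict to `check_coco`. A box flagged as extending outside the image is not necessarily wrong: the enclosing rectangle of an object cut off by the image border can reach past it, so treat those entries as candidates for visual review, not as errors to fix automatically.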
