
Question: TensorFlow Object Detection API with cropped images as the training dataset

Stack Overflow user

Asked on 2017-07-14 03:11:11

1 answer · 2K views · 0 followers · 0 votes

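I want to train an ssd-inception v2 model with the TensorFlow Object Detection API. The training dataset I want to use is a set of cropped images of various sizes without bounding boxes, since each crop is itself the bounding box.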

I followed the create_pascal_tf_record.py example, replacing the bounding-box and classification parts accordingly, to generate the TFRecords as follows:

import hashlib
import os

import tensorflow as tf
from PIL import Image

from object_detection.utils import dataset_util
from object_detection.utils import label_map_util


def dict_to_tf_example(imagepath, label):
    """Builds a tf.train.Example for one JPEG crop; returns None for other formats."""
    image = Image.open(imagepath)
    if image.format != 'JPEG':
        print("Skipping file: " + imagepath)
        return None
    # PIL reports the size as (width, height); the pixels need not be decoded.
    width, height = image.size
    with tf.gfile.GFile(imagepath, 'rb') as fid:
        encoded_jpg = fid.read()
    key = hashlib.sha256(encoded_jpg).hexdigest()

    # Hard-coded box in normalized coordinates: 5% to 95% of the image
    # on both axes, i.e. almost the whole crop.
    xmin = [5.0/100.0]
    ymin = [5.0/100.0]
    xmax = [95.0/100.0]
    ymax = [95.0/100.0]
    class_text = [label['name'].encode('utf8')]
    classes = [label['id']]
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(imagepath.encode('utf8')),
        'image/source_id': dataset_util.bytes_feature(imagepath.encode('utf8')),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')),
        'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')),
        'image/object/class/text': dataset_util.bytes_list_feature(class_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymax)
    }))

    return example


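# The flag definitions were not shown in the question; these are assumed.
flags = tf.app.flags
flags.DEFINE_string('data_dir', '', 'Root directory with one sub-directory per class.')
flags.DEFINE_string('output_path', '', 'Output file name without extension, relative to data_dir.')
flags.DEFINE_string('label_map_path', '', 'Path to the label map file.')
FLAGS = flags.FLAGS


def main(_):
    data_dir = FLAGS.data_dir
    output_path = os.path.join(data_dir, FLAGS.output_path + '.record')
    writer = tf.python_io.TFRecordWriter(output_path)
    label_map = label_map_util.load_labelmap(FLAGS.label_map_path)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=80, use_display_name=True)
    # Only categories that have a matching sub-directory in data_dir are used.
    category_list = os.listdir(data_dir)
    gen = (category for category in categories if category['name'] in category_list)
    for category in gen:
        examples_path = os.path.join(data_dir, category['name'])
        for example in os.listdir(examples_path):
            imagepath = os.path.join(examples_path, example)
            tf_example = dict_to_tf_example(imagepath, category)
            # dict_to_tf_example returns None for non-JPEG files; skip those.
            if tf_example is None:
                continue
            writer.write(tf_example.SerializeToString())

    writer.close()


if __name__ == '__main__':
    tf.app.run()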

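The bounding box is hard-coded to cover almost the entire image (5% to 95% on each axis). Each label is taken from the name of the directory that contains the image. I used mscoco_label_map.pbtxt for the labels and ssd_inception_v2_pets.config as the basis for my pipeline.

I trained and froze the model so that I could use it with the Jupyter notebook example. However, the final result is a single box around the whole image. Any idea what went wrong?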

image-processing

tensorflow

computer-vision

deep-learning

conv-neural-network

1 Answer

Stack Overflow user

Accepted answer

Answered on 2017-07-14 09:08:55

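Object detection algorithms and networks typically work by predicting the location of a bounding box together with its class, so the training data needs to contain bounding-box annotations. If you feed the model training data in which the bounding box is always the same size as the image, you will most likely get garbage predictions, including a box that always outlines the whole image.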
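One way to see the problem in the data itself is to measure how much of each image its ground-truth box covers. The following is only a minimal sketch and not part of the Object Detection API: `box_area_fraction` and `looks_like_whole_image` are hypothetical helper names, and the 0.75 threshold is an arbitrary assumption.

```python
def box_area_fraction(xmin, ymin, xmax, ymax):
    """Return the fraction of the image covered by a normalized box."""
    if not (0.0 <= xmin < xmax <= 1.0 and 0.0 <= ymin < ymax <= 1.0):
        raise ValueError("expected normalized coordinates with min < max")
    return (xmax - xmin) * (ymax - ymin)


def looks_like_whole_image(boxes, threshold=0.75):
    """True if there is at least one box and every box covers >= threshold of its image.

    The threshold is an assumed cut-off chosen for illustration only.
    """
    return bool(boxes) and all(box_area_fraction(*b) >= threshold for b in boxes)


# The hard-coded box from the question: 5% to 95% on both axes.
question_box = (0.05, 0.05, 0.95, 0.95)
fraction = box_area_fraction(*question_box)
print("covered fraction: %.2f" % fraction)
print("whole-image boxes only:", looks_like_whole_image([question_box]))
```

If every example in a dataset passes this check, the boxes carry almost no localization information, which matches the behaviour described in the question.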

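This sounds like a problem with your training data. Instead of cropped images, you should provide full images or scenes in which your objects are annotated. As it stands, you are essentially training a classifier.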
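Annotations for full scenes are usually recorded in pixel coordinates, whereas the `image/object/bbox/*` features in the question's code hold values normalized to the range [0, 1]. A minimal sketch of that conversion, assuming each box is given as `(xmin, ymin, xmax, ymax)` in pixels; `normalize_boxes` is a hypothetical helper and the 640x480 scene with two objects is made-up example data.

```python
def normalize_boxes(boxes_px, width, height):
    """Convert pixel boxes to four lists of normalized coordinates.

    boxes_px: iterable of (xmin, ymin, xmax, ymax) tuples in pixels.
    Returns (xmins, ymins, xmaxs, ymaxs), each with one entry per box.
    """
    xmins, ymins, xmaxs, ymaxs = [], [], [], []
    for xmin, ymin, xmax, ymax in boxes_px:
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            raise ValueError("box %r lies outside the image" % ((xmin, ymin, xmax, ymax),))
        xmins.append(xmin / float(width))
        ymins.append(ymin / float(height))
        xmaxs.append(xmax / float(width))
        ymaxs.append(ymax / float(height))
    return xmins, ymins, xmaxs, ymaxs


# Made-up annotations: two objects in a 640x480 scene.
boxes_px = [(64, 48, 320, 240), (400, 120, 560, 360)]
xmins, ymins, xmaxs, ymaxs = normalize_boxes(boxes_px, width=640, height=480)
print(xmins, ymins, xmaxs, ymaxs)
```

The four lists could then be passed to `dataset_util.float_list_feature` in place of the hard-coded single-element lists, with one class label per box.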

Try training on uncropped images annotated in this way and see how you get on.

Votes: 5

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/45093955
