The open-source Uber LaneGCN code is trained on the Argoverse Motion Forecasting dataset.
Argoverse Motion Forecasting is a curated collection of 324,557 scenarios, each 5 seconds long, for training and validation. Each scenario contains the 2D, birds-eye-view centroid of each tracked object sampled at 10 Hz.
Figure: trajectories for the agent of interest (red), the self-driving vehicle (green), and all other objects of interest in the scene (light blue).
This article documents how Uber LaneGCN processes trajectory data and HD map data.
In Argoverse Motion Forecasting, each scenario is a CSV file, and each row contains six fields: "TIMESTAMP, TRACK_ID, OBJECT_TYPE, X, Y, CITY_NAME".
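To make the row layout concrete, here is a minimal sketch that parses a few rows and groups the samples per tracked object. The CSV excerpt, the track IDs, and the helper name `group_tracks` are hypothetical; LaneGCN itself reads scenarios through the Argoverse loader and pandas, not through this helper.

```python
import csv
import io
from collections import defaultdict

# Hypothetical three-row excerpt in the six-column layout described above.
SAMPLE_CSV = """TIMESTAMP,TRACK_ID,OBJECT_TYPE,X,Y,CITY_NAME
315968245.1,track-001,AGENT,419.3,1125.9,MIA
315968245.1,track-002,OTHERS,421.0,1130.2,MIA
315968245.2,track-001,AGENT,419.8,1126.4,MIA
"""

def group_tracks(csv_text):
    """Group (timestamp, x, y) samples by (TRACK_ID, OBJECT_TYPE)."""
    tracks = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["TRACK_ID"], row["OBJECT_TYPE"])
        sample = (float(row["TIMESTAMP"]), float(row["X"]), float(row["Y"]))
        tracks[key].append(sample)
    return dict(tracks)

tracks = group_tracks(SAMPLE_CSV)
```

Grouping by both `TRACK_ID` and `OBJECT_TYPE` mirrors the `groupby` key used in the LaneGCN excerpt, which makes it easy to pick out the single track labelled `AGENT`.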
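The past trajectories of the target vehicle and the surrounding (social) vehicles are extracted from this data as follows:
import copy
import numpy as np

# Excerpt from the method that reads one scenario
city = copy.deepcopy(self.avl[idx].city)
"""TIMESTAMP,TRACK_ID,OBJECT_TYPE,X,Y,CITY_NAME"""
df = copy.deepcopy(self.avl[idx].seq_df)
# (N, 2) array holding every x/y sample in the scenario
trajs = np.concatenate((
    df.X.to_numpy().reshape(-1, 1),
    df.Y.to_numpy().reshape(-1, 1)), 1)
# One group of row indices per tracked object
objs = df.groupby(['TRACK_ID', 'OBJECT_TYPE']).groups
keys = list(objs.keys())
obj_type = [x[1] for x in keys]
# The target vehicle is the track labelled 'AGENT'
agt_idx = obj_type.index('AGENT')
idcs = objs[keys[agt_idx]]
agt_traj = trajs[idcs]
# Map each unique timestamp to a step index 0, 1, 2, ...
agt_ts = np.sort(np.unique(df['TIMESTAMP'].values))
mapping = dict()
for i, ts in enumerate(agt_ts):
    mapping[ts] = i
...
steps = [mapping[x] for x in df['TIMESTAMP'].values]
steps = np.asarray(steps, np.int64)
...
agt_step = steps[idcs]
# Remove the agent so it is not repeated among the context tracks
del keys[agt_idx]
ctx_trajs, ctx_steps = [], []
for key in keys:
    idcs = objs[key]
    ctx_trajs.append(trajs[idcs])
    ctx_steps.append(steps[idcs])
data = dict()
data['city'] = city
# The agent always comes first, followed by the surrounding vehicles
data['trajs'] = [agt_traj] + ctx_trajs
data['steps'] = [agt_step] + ctx_steps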
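The timestamp-to-step mapping is the piece that lets tracks with missing samples line up on a common 0..49 time axis. A plain-Python sketch of the same idea, with a hypothetical helper name and made-up timestamps:

```python
def timestamps_to_steps(timestamps):
    """Map each raw timestamp to its rank among the unique, sorted timestamps."""
    unique_sorted = sorted(set(timestamps))
    mapping = {ts: i for i, ts in enumerate(unique_sorted)}
    return [mapping[ts] for ts in timestamps]

# Hypothetical timestamps for five CSV rows from two interleaved tracks.
row_timestamps = [100.0, 100.0, 100.1, 100.2, 100.2]
steps = timestamps_to_steps(row_timestamps)
```

Rows that share a timestamp receive the same step index, so a track that was not observed at some timestamp simply has no entry for that step.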
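# Origin: the agent's position at index 19, i.e. the 20th (last observed) point
orig = data['trajs'][0][19].copy().astype(np.float32)
# Vector pointing from the 20th point back to the 19th point
pre = data['trajs'][0][18] - orig
# Angle that rotates the agent's heading onto the positive x-axis
theta = np.pi - np.arctan2(pre[1], pre[0])
rot = np.asarray([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta), np.cos(theta)]], np.float32)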
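It can be helpful to check numerically that this choice of `theta` really aligns the agent's heading with the positive x-axis. The sketch below repeats the same arithmetic with the standard `math` module; the helper names and the sample points are hypothetical.

```python
import math

def agent_frame(p18, p19):
    """Return (orig, rot) built with the same formulas as the excerpt above."""
    orig = p19
    pre = (p18[0] - orig[0], p18[1] - orig[1])  # points backwards in time
    theta = math.pi - math.atan2(pre[1], pre[0])
    rot = [[math.cos(theta), -math.sin(theta)],
           [math.sin(theta), math.cos(theta)]]
    return orig, rot

def to_local(point, orig, rot):
    """Apply rot @ (point - orig) to a single 2D point."""
    dx, dy = point[0] - orig[0], point[1] - orig[1]
    return (rot[0][0] * dx + rot[0][1] * dy,
            rot[1][0] * dx + rot[1][1] * dy)

# Hypothetical agent moving north-east: index 18 at (0, 0), index 19 at (1, 1).
orig, rot = agent_frame((0.0, 0.0), (1.0, 1.0))
ahead = to_local((2.0, 2.0), orig, rot)  # one more step along the heading
```

In this example `ahead` comes out at roughly (1.414, 0): a point straight ahead of the agent lands on the positive x-axis of the local frame.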
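Points 21 through 50 are used as the ground truth. The ground truth keeps the original (global) coordinates; it is not normalized into the agent-centric frame.
# step / traj: step indices and global x/y samples of a single track
gt_pred = np.zeros((30, 2), np.float32)
has_pred = np.zeros(30, bool)  # builtin bool; the np.bool alias is missing in some NumPy releases
future_mask = np.logical_and(step >= 20, step < 50)
post_step = step[future_mask] - 20
post_traj = traj[future_mask]
gt_pred[post_step] = post_traj
has_pred[post_step] = 1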
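The `has_pred` mask matters because a track may be missing some future steps. A small plain-Python sketch of the same scatter operation, using a hypothetical helper name and a made-up track with one gap:

```python
def build_ground_truth(steps, traj, obs_len=20, pred_len=30):
    """Scatter future samples into fixed-size lists plus a validity mask."""
    gt_pred = [(0.0, 0.0)] * pred_len
    has_pred = [False] * pred_len
    for step, point in zip(steps, traj):
        if obs_len <= step < obs_len + pred_len:
            gt_pred[step - obs_len] = point
            has_pred[step - obs_len] = True
    return gt_pred, has_pred

# Hypothetical track observed at steps 19, 20 and 22 (step 21 is missing).
steps = [19, 20, 22]
traj = [(5.0, 1.0), (5.5, 1.0), (6.5, 1.1)]
gt_pred, has_pred = build_ground_truth(steps, traj)
```

Slots without an observation stay at zero and are flagged `False`, so they can be excluded when a loss or metric is computed.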
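To stay consistent with the coordinate frame of the ground truth, the code applies the inverse transform to the trajectories output by the network, converting them from the local frame back to the global frame:
# prediction
out = self.pred_net(actors, actor_idcs, actor_ctrs)
rot, orig = gpu(data["rot"]), gpu(data["orig"])
# transform prediction to world coordinates
# (rot is orthogonal, so right-multiplying row vectors by rot undoes rot @ (p - orig))
for i in range(len(out["reg"])):
    out["reg"][i] = torch.matmul(out["reg"][i], rot[i]) + orig[i].view(1, 1, 1, -1)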
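Why multiplying by `rot` (rather than by its transpose) inverts the normalization is easy to miss: the inputs are transformed as column vectors, `rot @ (p - orig)`, while the outputs are treated as row vectors, `q @ rot + orig`. The round trip below illustrates this in plain Python; the angle, the origin, and the helper names are hypothetical.

```python
import math

def to_local(p, orig, rot):
    """Forward transform used for the inputs: rot @ (p - orig)."""
    dx, dy = p[0] - orig[0], p[1] - orig[1]
    return (rot[0][0] * dx + rot[0][1] * dy,
            rot[1][0] * dx + rot[1][1] * dy)

def to_world(q, orig, rot):
    """Inverse transform used for the outputs: q @ rot + orig (q is a row vector)."""
    return (q[0] * rot[0][0] + q[1] * rot[1][0] + orig[0],
            q[0] * rot[0][1] + q[1] * rot[1][1] + orig[1])

theta = 0.7  # hypothetical rotation angle in radians
rot = [[math.cos(theta), -math.sin(theta)],
       [math.sin(theta), math.cos(theta)]]
orig = (419.3, 1125.9)  # hypothetical agent position
p_world = (425.0, 1130.0)
p_back = to_world(to_local(p_world, orig, rot), orig, rot)
```

Up to floating-point error, `p_back` equals `p_world`, which is what allows the predictions to be compared directly with the unnormalized ground truth.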
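The input trajectories are rotated and translated so that the 20th point is the origin and the direction from the 19th point to the 20th point is the positive x-axis.
# Keep only the observed part of the track (steps 0..19), sorted by step
obs_mask = step < 20
step = step[obs_mask]
traj = traj[obs_mask]
idcs = step.argsort()
step = step[idcs]
traj = traj[idcs]
# Find where the gap-free run of steps ending at step 19 begins
for i in range(len(step)):
    if step[i] == 19 - (len(step) - 1) + i:
        break
step = step[i:]
traj = traj[i:]
# Columns 0-1: agent-centric x/y; column 2: 1.0 where a sample exists
feat = np.zeros((20, 3), np.float32)
feat[step, :2] = np.matmul(rot, (traj - orig.reshape(-1, 2)).T).T
feat[step, 2] = 1.0
# Actor center: position at the last observed step
ctrs.append(feat[-1, :2].copy())
# Replace positions with per-step displacements
feat[1:, :2] -= feat[:-1, :2]
feat[step[0], :2] = 0
feats.append(feat)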
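Two steps in this excerpt are compact enough to be cryptic: the loop that trims a track to its gap-free tail, and the conversion from positions to displacements. The sketch below isolates both in plain Python. The helper names and sample values are hypothetical, and the displacement helper works on the trimmed points only rather than on the zero-padded 20-row array.

```python
def trim_to_contiguous(steps, last_step=19):
    """Return the index where the gap-free run ending at `last_step` begins.

    `steps` must be sorted in increasing order and end with `last_step`.
    """
    n = len(steps)
    for i in range(n):
        if steps[i] == last_step - (n - 1) + i:
            return i
    return n

def displacements(points):
    """First entry is zero; the rest are differences between neighbours."""
    deltas = [(0.0, 0.0)]
    for prev, cur in zip(points, points[1:]):
        deltas.append((cur[0] - prev[0], cur[1] - prev[1]))
    return deltas

# Hypothetical track with gaps: only steps 17, 18, 19 form a gap-free tail.
steps = [3, 5, 17, 18, 19]
start = trim_to_contiguous(steps)
kept_steps = steps[start:]
deltas = displacements([(0.0, 0.0), (1.0, 0.5), (2.5, 0.5)])
```

Here `start` is 2, so the isolated early samples at steps 3 and 5 are discarded and only the tail leading up to step 19 is kept.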
# Collect the lanes around the agent and express them in the agent-centric frame
x_min, x_max, y_min, y_max = self.config['pred_range']
radius = max(abs(x_min), abs(x_max)) + max(abs(y_min), abs(y_max))
lane_ids = self.am.get_lane_ids_in_xy_bbox(data['orig'][0], data['orig'][1], data['city'], radius)
lane_ids = copy.deepcopy(lane_ids)
lanes = dict()
for lane_id in lane_ids:
    lane = self.am.city_lane_centerlines_dict[data['city']][lane_id]
    lane = copy.deepcopy(lane)
    # Same rotation and translation as for the trajectories
    centerline = np.matmul(data['rot'], (lane.centerline - data['orig'].reshape(-1, 2)).T).T
    x, y = centerline[:, 0], centerline[:, 1]
    # Skip lanes whose bounding box lies entirely outside the prediction range
    if x.max() < x_min or x.min() > x_max or y.max() < y_min or y.min() > y_max:
        continue
    else:
        """Getting polygons requires original centerline"""
        polygon = self.am.get_lane_segment_polygon(lane_id, data['city'])
        polygon = copy.deepcopy(polygon)
        lane.centerline = centerline
        lane.polygon = np.matmul(data['rot'], (polygon[:, :2] - data['orig'].reshape(-1, 2)).T).T
        lanes[lane_id] = lane
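The lane selection therefore works in two stages: a coarse map query with a generous radius, followed by an exact bounding-box test in the local frame. A plain-Python sketch of both computations follows; the helper names are hypothetical and the `pred_range` value is an assumed setting used only for illustration.

```python
def search_radius(pred_range):
    """Radius large enough to cover every corner of the prediction rectangle."""
    x_min, x_max, y_min, y_max = pred_range
    return max(abs(x_min), abs(x_max)) + max(abs(y_min), abs(y_max))

def overlaps_range(centerline, pred_range):
    """True if the centerline's bounding box intersects the prediction rectangle."""
    x_min, x_max, y_min, y_max = pred_range
    xs = [p[0] for p in centerline]
    ys = [p[1] for p in centerline]
    return not (max(xs) < x_min or min(xs) > x_max or
                max(ys) < y_min or min(ys) > y_max)

pred_range = (-100.0, 100.0, -100.0, 100.0)  # assumed value, for illustration
near = [(90.0, 0.0), (110.0, 0.0)]           # crosses the right edge
far = [(120.0, 0.0), (150.0, 0.0)]           # entirely outside
```

The coarse query is done in global coordinates, where the rectangle may be rotated arbitrarily, which is presumably why the radius sums the two half-extents instead of using only the larger one.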
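The geometric position of each Lane Graph node is the midpoint of two consecutive points in the lane centerline coordinate sequence.
lane_ids = list(lanes.keys())
ctrs, feats, turn, control, intersect = [], [], [], [], []
for lane_id in lane_ids:
    lane = lanes[lane_id]
    ctrln = lane.centerline
    # A centerline with N points yields N - 1 segments, i.e. N - 1 nodes
    num_segs = len(ctrln) - 1
    ctrs.append(np.asarray((ctrln[:-1] + ctrln[1:]) / 2.0, np.float32))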
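The node feature also captures the orientation of the lane, i.e. the vector from each centerline point to the next one:
    # (still inside the loop over lane_ids)
    feats.append(np.asarray(ctrln[1:] - ctrln[:-1], np.float32))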
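As a small worked example of these two quantities, the sketch below computes node positions and direction vectors for a made-up three-point centerline in plain Python. The helper name `lane_nodes` is hypothetical.

```python
def lane_nodes(centerline):
    """Return (ctrs, feats): segment midpoints and segment direction vectors."""
    ctrs, feats = [], []
    for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]):
        ctrs.append(((x0 + x1) / 2.0, (y0 + y1) / 2.0))
        feats.append((x1 - x0, y1 - y0))
    return ctrs, feats

# Hypothetical centerline with three points, hence two nodes.
centerline = [(0.0, 0.0), (2.0, 0.0), (4.0, 2.0)]
ctrs, feats = lane_nodes(centerline)
```

Each node thus carries a location (the midpoint) and a shape descriptor (the segment vector, which encodes both direction and length).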
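Lane attributes are collected as well: the turn direction, whether the lane has traffic control (e.g. a traffic light), and whether it lies in an intersection. In the code below only left and right turns are encoded explicitly; any other turn direction leaves both channels at zero.
    # (still inside the loop over lane_ids)
    x = np.zeros((num_segs, 2), np.float32)
    if lane.turn_direction == 'LEFT':
        x[:, 0] = 1
    elif lane.turn_direction == 'RIGHT':
        x[:, 1] = 1
    else:
        pass
    turn.append(x)
    # Lane-level flags are repeated once per node of the lane
    control.append(lane.has_traffic_control * np.ones(num_segs, np.float32))
    intersect.append(lane.is_intersection * np.ones(num_segs, np.float32))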
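The encoding amounts to a two-channel one-hot for the turn direction plus two per-node flags. A plain-Python sketch with a hypothetical helper name and made-up attribute values:

```python
def lane_attributes(turn_direction, has_traffic_control, is_intersection, num_segs):
    """Repeat lane-level attributes once per node (segment) of the lane."""
    codes = {"LEFT": (1.0, 0.0), "RIGHT": (0.0, 1.0)}
    turn_code = codes.get(turn_direction, (0.0, 0.0))  # anything else: all zeros
    turn = [turn_code] * num_segs
    control = [float(has_traffic_control)] * num_segs
    intersect = [float(is_intersection)] * num_segs
    return turn, control, intersect

# Hypothetical left-turn lane inside an intersection, without traffic control.
turn, control, intersect = lane_attributes("LEFT", False, True, num_segs=3)
```

Because the attributes belong to the lane rather than to individual segments, every node of a lane receives identical values.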
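As shown in the figure below, suppose the current prediction range contains three lanes (different colors denote different lanes). The Lane Graph formed by these lanes has six nodes, each assigned a unique index: 0, 1, 2, 3, 4, 5.
node_idcs = []
count = 0
for i, ctr in enumerate(ctrs):
    # Nodes of lane i receive the consecutive indices count .. count + len(ctr) - 1
    node_idcs.append(range(count, count + len(ctr)))
    count += len(ctr)
num_nodes = count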
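For illustration, assume the three lanes contribute 1, 4, and 1 nodes respectively (hypothetical sizes chosen to total six nodes). The numbering then works out as in this plain-Python sketch, where `assign_node_ids` is a hypothetical helper name:

```python
def assign_node_ids(nodes_per_lane):
    """Give every lane a consecutive block of global node indices."""
    node_idcs, count = [], 0
    for n in nodes_per_lane:
        node_idcs.append(list(range(count, count + n)))
        count += n
    return node_idcs, count

node_idcs, num_nodes = assign_node_ids([1, 4, 1])
```

The first lane owns node 0, the second lane owns nodes 1 to 4, and the third lane owns node 5.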
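Next, the predecessor and successor relations are built. pre['v'][i] is the predecessor of pre['u'][i], and suc['v'][i] is the successor of suc['u'][i], where 0 <= i < 5 indexes the five edges.
In the Lane Graph shown above, the single-scale predecessor (pre) relations are:
pre['u'] = [2,3,4,1,5]
pre['v'] = [1,2,3,0,4]
In the Lane Graph shown above, the single-scale successor (suc) relations are:
suc['u'] = [1,2,3,4,0]
suc['v'] = [2,3,4,5,1]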
pre, suc = dict(), dict()
for key in ['u', 'v']:
    pre[key], suc[key] = [], []
for i, lane_id in enumerate(lane_ids):
    lane = lanes[lane_id]
    idcs = node_idcs[i]
    # Within a lane: the predecessor of each node is the node before it
    pre['u'] += idcs[1:]
    pre['v'] += idcs[:-1]
    if lane.predecessors is not None:
        for nbr_id in lane.predecessors:
            if nbr_id in lane_ids:
                j = lane_ids.index(nbr_id)
                # First node of this lane <- last node of the predecessor lane
                pre['u'].append(idcs[0])
                pre['v'].append(node_idcs[j][-1])
    # Within a lane: the successor of each node is the node after it
    suc['u'] += idcs[:-1]
    suc['v'] += idcs[1:]
    if lane.successors is not None:
        for nbr_id in lane.successors:
            if nbr_id in lane_ids:
                j = lane_ids.index(nbr_id)
                # Last node of this lane -> first node of the successor lane
                suc['u'].append(idcs[-1])
                suc['v'].append(node_idcs[j][0])
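To see the loop produce the example lists quoted above, the sketch below runs the same logic on a toy graph. The lane IDs, the node counts per lane (1, 4, and 1), and the connectivity a -> b -> c are assumptions made for illustration; plain dictionaries stand in for the lane objects.

```python
def build_pre_suc(lane_ids, node_idcs, predecessors, successors):
    """Build single-scale pre/suc edge lists from per-lane node indices."""
    pre = {"u": [], "v": []}
    suc = {"u": [], "v": []}
    for i, lane_id in enumerate(lane_ids):
        idcs = node_idcs[i]
        # Edges between consecutive nodes of the same lane.
        pre["u"] += idcs[1:]
        pre["v"] += idcs[:-1]
        # First node of this lane links to the last node of each predecessor lane.
        for nbr_id in predecessors.get(lane_id, []):
            if nbr_id in lane_ids:
                j = lane_ids.index(nbr_id)
                pre["u"].append(idcs[0])
                pre["v"].append(node_idcs[j][-1])
        suc["u"] += idcs[:-1]
        suc["v"] += idcs[1:]
        # Last node of this lane links to the first node of each successor lane.
        for nbr_id in successors.get(lane_id, []):
            if nbr_id in lane_ids:
                j = lane_ids.index(nbr_id)
                suc["u"].append(idcs[-1])
                suc["v"].append(node_idcs[j][0])
    return pre, suc

# Hypothetical toy graph: lane a (node 0) -> lane b (nodes 1-4) -> lane c (node 5).
lane_ids = ["a", "b", "c"]
node_idcs = [[0], [1, 2, 3, 4], [5]]
predecessors = {"b": ["a"], "c": ["b"]}
successors = {"a": ["b"], "b": ["c"]}
pre, suc = build_pre_suc(lane_ids, node_idcs, predecessors, successors)
```

Under these assumptions `pre` matches the lists quoted earlier exactly, while `suc` contains the same five edges as the quoted lists but in a different order (the order depends only on how the lanes are iterated).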
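To capture complex topology and long-range lane dependencies, the preprocessing also computes multi-scale node relations along the predecessor (pre) and successor (suc) directions and stores them with the training data.
Here C denotes the number of scales in the multi-scale formulation.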
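For reference, the multi-scale LaneConv operator described in the LaneGCN paper can be written in the following form. The notation is assumed here rather than taken from this article: X is the node feature matrix, A_i are adjacency matrices, W are learned weight matrices, and k_c is the c-th dilation size.

```latex
Y = X W_0
  + \sum_{i \in \{\mathrm{left},\, \mathrm{right}\}} A_i X W_i
  + \sum_{c=1}^{C} \left( A_{\mathrm{pre}}^{k_c} X W_{\mathrm{pre},k_c}
  + A_{\mathrm{suc}}^{k_c} X W_{\mathrm{suc},k_c} \right)
```

The powers of the pre and suc adjacency matrices are what the multi-scale preprocessing precomputes.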
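The code that computes the multi-scale predecessor and successor relations is shown below. nbr holds the single-scale predecessor (pre) or successor (suc) relations, num_nodes is the total number of nodes in the graph, and num_scales is the total number of scales, which defaults to 6 in the code. Each iteration squares the matrix from the previous one, so the returned list has num_scales - 1 entries, corresponding to 2, 4, 8, 16, and 32 hops for the default setting.
import numpy as np
from scipy import sparse

def dilated_nbrs(nbr, num_nodes, num_scales):
    # Boolean adjacency matrix with mat[u, v] = True for every single-scale pair
    data = np.ones(len(nbr['u']), bool)  # builtin bool; np.bool is missing in some NumPy releases
    csr = sparse.csr_matrix((data, (nbr['u'], nbr['v'])), shape=(num_nodes, num_nodes))
    mat = csr
    nbrs = []
    for i in range(1, num_scales):
        # Squaring doubles the hop count: 2, 4, 8, ...
        mat = mat * mat
        nbr = dict()
        coo = mat.tocoo()
        nbr['u'] = coo.row.astype(np.int64)
        nbr['v'] = coo.col.astype(np.int64)
        nbrs.append(nbr)
    return nbrs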
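The multi-scale relations are computed by multiplying adjacency matrices. Taking the predecessor (pre) relation as an example, its single-scale adjacency matrix A has A[u, v] = 1 for every pair (pre['u'][i], pre['v'][i]) and 0 elsewhere; squaring A gives the 2-hop predecessor relations, squaring the result gives the 4-hop relations, and so on.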
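The effect of repeated squaring can be traced by hand on the six-node example. The sketch below uses dense nested lists and a boolean matrix product instead of SciPy sparse matrices, purely to stay dependency-free; the helper names are hypothetical.

```python
def adjacency(nbr, num_nodes):
    """Dense 0/1 adjacency matrix with mat[u][v] = 1 for each listed pair."""
    mat = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in zip(nbr["u"], nbr["v"]):
        mat[u][v] = 1
    return mat

def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k links i -> k -> j."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]

def edges(mat):
    """List the (u, v) pairs whose entry is non-zero, in row-major order."""
    return [(u, v) for u, row in enumerate(mat) for v, x in enumerate(row) if x]

pre = {"u": [2, 3, 4, 1, 5], "v": [1, 2, 3, 0, 4]}
a1 = adjacency(pre, 6)     # 1-hop predecessors
a2 = bool_matmul(a1, a1)   # 2-hop predecessors
a4 = bool_matmul(a2, a2)   # 4-hop predecessors
```

For this chain-shaped graph, `edges(a2)` is [(2, 0), (3, 1), (4, 2), (5, 3)] and `edges(a4)` is [(4, 0), (5, 1)]; squaring once more yields no edges because the chain has only six nodes.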
All code in this article comes from Uber's official open-source repository:
https://github.com/uber-research/LaneGCN