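I'm looking for a regular expression that can extract substrings from a long string without having to call split multiple times. I thought a regex could do this, but so far my attempts have failed.
This is my input text:
Platform->Not Machine Specific Studio->Symantec Title->Symantec Promotional Book Home Internet Security
These are my attempts with regexp: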
Attempt 1
regexp: /[A-Z]+->([A-Z]+|[0-9]+)(\s*|([A-Z]|[0-9])+)*$/gim
Result: Title->Symantec Promotional Book Home Internet Security
Attempt 2
regexp: /[A-Z]+->([A-Z]+|[0-9]+)(\s*|([A-Z]|[0-9])+)*/gim
Result:
Platform->Not Machine Specific Studio
Title->Symantec Promotional Book Home Internet Security
Expected result:
Platform->Not Machine Specific
Studio->Symantec
Title->Symantec Promotional Book Home Internet Security
Each line of the result should be a separate match.
Posted on 2020-10-22 03:10:51
You can try the following regex pattern:
\S+->.*?(?= \S+->|$)
Here is an explanation of this pattern:
\S+ match an initial key
-> match "->"
.*? match a value, consisting of one or possibly more words, until reaching
(?= \S+->|$) the next key followed by "->" OR the end of the string
Edit:
If you are doing this from a Python script, I would suggest using re.findall here:
import re
inp = "Platform->Not Machine Specific Studio->Symantec Title->Symantec Promotional Book Home Internet Security"
matches = re.findall(r'\S+->.*?(?= \S+->|$)', inp)
print(matches)
This prints:
['Platform->Not Machine Specific', 'Studio->Symantec',
'Title->Symantec Promotional Book Home Internet Security']
https://stackoverflow.com/questions/64474617
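If the keys and values are needed separately rather than as whole "key->value" strings, the same pattern can be extended with two capture groups. The following is only a sketch: the capture groups, the name PAIR_RE, and the helper parse_pairs are not part of the original answer and are introduced here for illustration. It also assumes that keys contain no whitespace and that no word inside a value contains "->".

```python
import re

# Same pattern as in the answer, plus two capture groups (an assumption
# made for this sketch): group 1 is the key, group 2 is the value.
PAIR_RE = re.compile(r'(\S+)->(.*?)(?= \S+->|$)')


def parse_pairs(text):
    """Hypothetical helper: map each key to its value.

    If a key appears more than once, the last occurrence wins.
    """
    return {m.group(1): m.group(2) for m in PAIR_RE.finditer(text)}


inp = ("Platform->Not Machine Specific Studio->Symantec "
       "Title->Symantec Promotional Book Home Internet Security")
pairs = parse_pairs(inp)

# Print one "key: value" line per pair found in the input.
for key, value in pairs.items():
    print(f"{key}: {value}")
```

Using re.finditer instead of re.findall keeps access to the match objects, which can be handy if the positions of the matches are needed later.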